fivethirtyeight / russian-troll-tweets


Non-ASCII Characters Issue #36

Open pabloneill opened 5 years ago

pabloneill commented 5 years ago

Hi all,

I have seen the posts about double-encoded characters; however, when I try:

for file in IRAhandle_tweets_1.csv; do
  echo -n "Converting $file... "
  iconv -f utf8 -t latin1 "$file" > "$file.corrected" &&
  mv -f "$file.corrected" "$file"
  echo "Done"
done

I get the following error: iconv: IRAhandle_tweets_1.csv:6:84: cannot convert

This also generates a file, IRAhandle_tweets_1.csv.corrected, which cannot be opened on my MacBook (the file is only 2 KB!).

Ultimately, I would like to export all the English-language tweets into a txt file. Any suggestions would be much appreciated.
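
For context on the error above: without further options, iconv stops at the first character it cannot represent in the target encoding (here, apparently line 6, column 84), so the .corrected file holds only the output written up to that point. That would explain the 2 KB size. Below is a minimal sketch of a more defensive version of the loop body, not a fix for the underlying encoding problem. The function name convert_file is hypothetical, and the sketch assumes the same UTF-8 to Latin-1 conversion as the loop above.

```shell
#!/usr/bin/env bash
# Sketch only: convert_file is a hypothetical helper, not part of the repo.
# It replaces the original file only when iconv succeeds, and removes the
# partial output when it fails, so no truncated file is left behind.

convert_file() {
  local file="$1"
  local tmp="$file.corrected"

  # iconv exits non-zero when it meets a character with no Latin-1
  # equivalent; its own error message (file:line:column) goes to stderr.
  if iconv -f UTF-8 -t ISO-8859-1 "$file" > "$tmp"; then
    mv -f "$tmp" "$file"
    echo "Converted $file"
  else
    rm -f "$tmp"
    echo "Skipped $file: conversion failed, original left untouched" >&2
    return 1
  fi
}
```

It could be called as convert_file IRAhandle_tweets_1.csv, or inside the for loop in place of the iconv and mv lines. If silently dropping the unconvertible characters is acceptable, the GNU and macOS versions of iconv also accept a -c option that discards them instead of stopping, at the cost of losing those characters.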