mexicovid19 / Mexico-datos

Data on the COVID-19 pandemic in Mexico
https://mexicovid19.github.io/
MIT License
23 stars 14 forks

Encoding check in bash is not enough #19

Closed. rodrigolece closed this issue 4 years ago

rodrigolece commented 4 years ago

The files published in recent days were converted successfully using the encoding check in the bash script, but the file for April 27 was detected as UTF-8 even though it still contained wrongly encoded text, which crashed pandas.read_csv.
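
For illustration only, here is a minimal Python sketch of a stricter check than a detection heuristic: decode the whole file strictly and report where it first fails. The function name `first_invalid_utf8_offset` is hypothetical and is not part of the repository's scripts.

```python
from typing import Optional


def first_invalid_utf8_offset(raw: bytes) -> Optional[int]:
    """Return the offset of the first byte that is not valid UTF-8, or None.

    Hypothetical helper: a strict decode examines every byte, so a file
    that is only mostly UTF-8 cannot pass the check.
    """
    try:
        raw.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError as err:
        return err.start
    return None
```

A file whose bytes return `None` here should not fail on encoding when read with `pandas.read_csv(path, encoding="utf-8")`; any integer result points at the first offending byte.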

rodrigolece commented 4 years ago

Fixed in 0bf8552. The problem is that the files mix encodings, so iconv, which assumes a single source encoding, cannot fix them. We now rely on a Perl script that correctly converts all the Latin-1 content to UTF-8.
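
The Perl script itself is not shown in this thread. As a rough Python sketch of the same idea, assuming the files contain valid UTF-8 interleaved with stray Latin-1 bytes, one can decode as UTF-8 and fall back to Latin-1 only for the bytes that fail. The names `latin1_fallback` and `fix_mixed_encoding` are hypothetical, and this is not the code from commit 0bf8552.

```python
import codecs


def _latin1_fallback(err):
    """Error handler: reinterpret the undecodable bytes as Latin-1."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # Return the replacement text and the position at which to resume.
    return bad.decode("latin-1"), err.end


# Register the handler under a hypothetical name.
codecs.register_error("latin1_fallback", _latin1_fallback)


def fix_mixed_encoding(raw: bytes) -> str:
    """Decode bytes that mix UTF-8 and Latin-1 into a single str."""
    return raw.decode("utf-8", errors="latin1_fallback")


if __name__ == "__main__":
    mixed = "México".encode("utf-8") + b" " + "Michoacán".encode("latin-1")
    print(fix_mixed_encoding(mixed))
```

Writing the result back with `encoding="utf-8"` yields a file that a strict UTF-8 reader accepts. The caveat is that a Latin-1 byte pair that happens to form a valid UTF-8 sequence would be read as UTF-8, so this is a heuristic rather than a guaranteed repair.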