Closed baldwinkl closed 7 years ago
I know for a fact that FAOSTAT data has an encoding issue with that acute-accented 'e'. I wrote an email to them a while ago which detailed the issue:
When using the bulk downloads, the 'é' is latin-1 encoded which is different to the UTF-8 encoding found when downloading an extract from the data.
This may seem like something really minor, but I've had a couple of scripts choke on the character encoding of 'Maté' specifically, because they were expecting UTF-8.
To see this issue, look at the Trade data and download [Argentina, Maté, Import Quantity, 2000] from the selection interface. Then bulk download Oceania (it has the smallest file size). Compare how 'Maté' appears in both. If you're using a program that can tell the difference, you'll note that it appears fine in the data downloaded from the selection, but the bulk download shows an invalid character instead of the 'é', so 'Mat�'.
I recommend changing the bulk download to use the UTF-8 encoding of 'é' (0xc3a9) rather than the latin-1 encoding (0xe9).
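For anyone who wants to see the byte-level difference, here is a minimal Python sketch. It does not touch FAOSTAT at all; it only encodes the string 'Maté' both ways and shows what a UTF-8 reader does with the latin-1 bytes. The variable names are made up for illustration.

```python
# Illustration only: the same text encoded two ways.
text = "Maté"

utf8_bytes = text.encode("utf-8")      # 'é' becomes the two bytes 0xC3 0xA9
latin1_bytes = text.encode("latin-1")  # 'é' becomes the single byte 0xE9

# A strict UTF-8 decoder rejects the latin-1 bytes outright.
try:
    latin1_bytes.decode("utf-8")
    strict_decode_ok = True
except UnicodeDecodeError:
    strict_decode_ok = False

# A lenient decoder substitutes U+FFFD, the "invalid character" glyph.
lenient = latin1_bytes.decode("utf-8", errors="replace")

print(utf8_bytes, latin1_bytes, strict_decode_ok, repr(lenient))
```

The strict failure is what makes scripts choke, and the lenient path is where 'Mat�' comes from.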
Long story short, different ways of encoding text lead to programs getting confused about what that text is.
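One possible workaround on the reading side, sketched in Python under the assumption that a file is either valid UTF-8 or latin-1 and nothing else: try UTF-8 first and fall back to latin-1. The function name `decode_faostat_bytes` is hypothetical and is not part of any FAOSTAT tooling or of this package.

```python
def decode_faostat_bytes(raw: bytes) -> str:
    """Decode bytes as UTF-8, falling back to latin-1 if that fails.

    Assumption: the input is one of these two encodings. latin-1 maps
    every byte to a character, so the fallback itself never raises.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


# Both encodings of the item name come out as the same string.
from_extract = decode_faostat_bytes(b"Mat\xc3\xa9")  # UTF-8 bytes
from_bulk = decode_faostat_bytes(b"Mat\xe9")         # latin-1 bytes
print(from_extract, from_bulk)
```

This only helps if the raw bytes are still available. Once a name has already been read leniently and turned into 'Mat�', the original byte is gone and the name has to be corrected by hand.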
I'm reopening this as I left out "Extracts, essences and concentrates of tea or mate, and preparations with a basis thereof or with a basis of tea or mat�".
The fix (manual, again) is more general now:
When I try to draw plots to see mate, mate extracts, etc., I get errors.