Tool will not draw plots for mate, mate extracts etc. due to bug in product name

baldwinkl commented 7 years ago

When I try to draw plots to see mate, mate extracts, etc., I get errors.

sebastian-c commented 7 years ago

I know for a fact that FAOSTAT data has an encoding issue with that acute-accented 'e'. I wrote an email to them a while ago which detailed the issue:

When using the bulk downloads, the 'é' is latin-1 encoded which is different to the UTF-8 encoding found when downloading an extract from the data.

This may seem like something really minor, but I've had a couple of scripts choke on the character encoding of Maté specifically because they were expecting UTF-8.

To see this issue, open the Trade data and download [Argentina, Maté, Import Quantity, 2000] from the selection interface. Then bulk download the Oceania data (it has the smallest file size). Compare how 'Maté' appears in both. If you're using a program that can tell the difference, you'll note that it appears fine in the data downloaded from the selection, but the bulk download shows an invalid character instead of the 'é', so 'Mat�'.

I recommend changing the bulk download to use the UTF-8 encoding of 'é' (0xC3 0xA9) rather than the latin-1 encoding (0xE9).
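The byte-level difference can be reproduced in a few lines. This is only an illustrative sketch in Python (the tool itself is written in R); the variable names are made up for the example, and it does not touch real FAOSTAT files.

```python
# The same product name, encoded two ways.
name = "Maté"

latin1_bytes = name.encode("latin-1")  # 'é' becomes the single byte 0xE9
utf8_bytes = name.encode("utf-8")      # 'é' becomes the two bytes 0xC3 0xA9

# A reader that expects UTF-8 cannot interpret a lone 0xE9 byte.
# With errors="replace" it substitutes U+FFFD, the '�' seen in 'Mat�'.
garbled = latin1_bytes.decode("utf-8", errors="replace")

print(latin1_bytes)
print(utf8_bytes)
print(garbled)
```

With strict decoding (the default) the same call raises `UnicodeDecodeError` instead, which is the kind of failure that makes a script "choke".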

Long story short, different ways of encoding text lead to programs getting confused about what that text is.

chrMongeau commented 7 years ago

Fixed (manually) in:

https://github.com/SWS-Methodology/faoswsTrade/blob/ea47cdf43b97deead7a72763138f258950fc8b91/modules/trade_validation_cpc/main.R#L490

chrMongeau commented 7 years ago

I'm reopening this as I left out "Extracts, essences and concentrates of tea or mate, and preparations with a basis thereof or with a basis of tea or mat�".

chrMongeau commented 7 years ago

The fix (manual, again) is more general now:

https://github.com/SWS-Methodology/faoswsTrade/blob/791086c95830b33e0a0d293e07e9b1c9b748a8b6/modules/trade_validation_cpc/main.R#L491
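For readers who hit the same problem elsewhere, one general way to avoid patching names one by one is to decode the raw bytes as UTF-8 and fall back to latin-1 when that fails. The sketch below is an assumption for illustration only: it is not the code behind the link above, and `decode_product_name` is a hypothetical helper name.

```python
def decode_product_name(raw: bytes) -> str:
    """Decode a product name that may be UTF-8 or latin-1 encoded.

    Hypothetical helper for illustration; not part of faoswsTrade.
    """
    try:
        # Strict UTF-8 first: valid UTF-8 input is returned unchanged.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this cannot fail.
        return raw.decode("latin-1")


# Both encodings of the name end up as the same text.
from_bulk = decode_product_name(b"Mat\xe9")           # latin-1 bytes
from_selection = decode_product_name(b"Mat\xc3\xa9")  # UTF-8 bytes
print(from_bulk == from_selection)
```

The fallback order matters: almost any byte sequence decodes "successfully" as latin-1, so trying latin-1 first would silently turn valid UTF-8 into 'MatÃ©'.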

SWS-Methodology / tradeValidationTool

Tool will not draw plots for mate, mate extracts etc. due to bug in product name #9