elbaza1 closed 5 years ago
The download is in UTF-8. The reason you see 'Ã©' instead of 'é' is that the program you use (likely Excel or WordPad) does not automatically recognize that the data is in UTF-8, so the bytes 0xC3, 0xA9 are interpreted as two characters rather than the single character 'é'. Some other programs, such as Notepad, do recognize the file as UTF-8 and show the data fine.
You may ask: why not automatically insert the BOM (the bytes 0xEF, 0xBB, 0xBF) at the top of the file, which would make Excel, WordPad and many other software products recognize the file as UTF-8?
The answer is that many CSV file processors balk at the BOM sequence at the beginning of the file and include it in the data, where it looks like a garbage character. That is not universally the case, which is why it may be useful to have another checkbox, "Include BOM", to let the user ask for a BOM to be added.
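For anyone who wants to reproduce this, here is a minimal sketch in Python (chosen only for illustration; it is not what the site runs). It assumes the program falls back to Windows-1252 when it sees no BOM, which is typical on Western-locale Windows but is an assumption here.

```python
# UTF-8 encodes 'é' (U+00E9) as the two bytes 0xC3 0xA9.
utf8_bytes = "é".encode("utf-8")

# Assumption: the viewing program decodes the file as Windows-1252.
# Each byte then becomes its own character.
misread = utf8_bytes.decode("cp1252")

# Decoding with the correct encoding recovers the single character.
correct = utf8_bytes.decode("utf-8")

print(utf8_bytes.hex(), repr(misread), repr(correct))
```

The file itself is fine in both cases; only the decoding step differs.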
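A rough sketch of what such an option could look like, in Python for illustration only. The function name `encode_csv` and the `include_bom` flag are hypothetical and not taken from the site's code.

```python
import codecs


def encode_csv(text: str, include_bom: bool = False) -> bytes:
    """Encode CSV text as UTF-8, optionally prefixed with the UTF-8 BOM.

    `include_bom` is a hypothetical flag standing in for the proposed
    "Include BOM" checkbox.
    """
    data = text.encode("utf-8")
    # codecs.BOM_UTF8 is the byte sequence 0xEF 0xBB 0xBF.
    return codecs.BOM_UTF8 + data if include_bom else data


with_bom = encode_csv("name\r\nAndré\r\n", include_bom=True)
without_bom = encode_csv("name\r\nAndré\r\n")
print(with_bom[:3].hex(), without_bom[:3].hex())
```

Keeping the BOM behind a flag lets Excel users opt in (or out) without changing the rest of the output.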
I've gone ahead and added the BOM character at the beginning as I believe most people use this to work in Excel afterwards. If this becomes an issue, I can add the option.
Thank you @DrorHarari @martindrapeau.
Hi, I am getting an error now. It was working more than a week ago.
I saved an iNav SQL data extract as CSV, then used CSVJSON to convert it to JSON.
I then ran k6 to do a parallel run and got this: level=error msg="SyntaxError: invalid character 'ï' looking for beginning of value at parse (native)"
FYI, I used http://www.convertcsv.com/csv-to-json.htm, and it worked.
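The 'ï' in that error is consistent with a leading BOM: its first byte, 0xEF, is 'ï' when read as a single-byte character. As a workaround on the consuming side, one can drop a leading BOM before parsing. The sketch below is in Python purely for illustration (k6 scripts are JavaScript, so this is not drop-in code), and `parse_json_text` is a hypothetical helper name.

```python
import json

BOM = "\ufeff"  # U+FEFF, which UTF-8 encodes as 0xEF 0xBB 0xBF


def parse_json_text(text: str):
    """Parse JSON text, ignoring a single leading BOM if present."""
    if text.startswith(BOM):
        text = text[len(BOM):]
    return json.loads(text)


# Made-up sample payload: a UTF-8 BOM followed by a small JSON array.
raw = b'\xef\xbb\xbf[{"name": "Andr\xc3\xa9"}]'

first_char = chr(raw[0])  # the first BOM byte seen as one character
records = parse_json_text(raw.decode("utf-8"))
print(first_char, records)
```

In Python specifically, decoding with `"utf-8-sig"` instead of `"utf-8"` strips the BOM during decoding and makes the manual check unnecessary.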
Could you try again @jmappala? Should be fixed now.
Adding the BOM automatically can break apps that do not expect it: the BOM would be seen as the first character of the first column name. If the app expects a specific column name, it would not find it (unless it also expects the BOM). Hence, while adding the BOM by default suits the expected Excel audience, it may be safer to let the user request a CSV without it (via that checkbox). @martindrapeau - I understand you are waiting for that hypothetical person to stand up 😉, ok.
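To make the column-name problem concrete, here is a small Python sketch (illustration only; the sample data is made up). It reads the same bytes twice: once decoded as plain UTF-8, where the BOM sticks to the first header, and once with Python's `utf-8-sig` codec, which strips it.

```python
import csv
import io

# Made-up CSV bytes starting with the UTF-8 BOM (0xEF 0xBB 0xBF).
data = b"\xef\xbb\xbfname,city\r\nAndr\xc3\xa9,Montr\xc3\xa9al\r\n"

# BOM-unaware read: U+FEFF becomes part of the first column name.
naive = csv.DictReader(io.StringIO(data.decode("utf-8"), newline=""))
naive_fields = naive.fieldnames

# BOM-aware read: "utf-8-sig" drops a leading BOM while decoding.
aware = csv.DictReader(io.StringIO(data.decode("utf-8-sig"), newline=""))
aware_fields = aware.fieldnames
rows = list(aware)

print(naive_fields, aware_fields, rows)
```

A lookup such as `row["name"]` fails on the naive read because the key is actually `"\ufeffname"`.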
@martindrapeau, and it worked. Thanks.
Link to the test case: https://www.csvjson.com/json2csv/df61580582fea1929d2c1ba50f5cfb8e
French characters like 'é' are converted to 'Ã©', for example. I suggest allowing 'utf-8' on download, or writing the CSV files as 'utf-8' in the API before downloading. Thanks