FlatFilers / csvjson-app

Online conversion and formatting tools for JSON, CSV and SQL.
https://www.csvjson.com
MIT License

unicode characters on downloaded CSV (utf-8 not supported) #78

Closed elbaza1 closed 5 years ago

elbaza1 commented 5 years ago

Link to the test case: https://www.csvjson.com/json2csv/df61580582fea1929d2c1ba50f5cfb8e

French characters like 'é' are converted to 'Ã©', for example. I suggest allowing 'utf-8' on download, or writing the CSV files as 'utf-8' in the API before downloading. Thanks.

DrorHarari commented 5 years ago

The download is in UTF-8. You see 'Ã©' because the program you use (likely Excel or WordPad) does not automatically recognize that the data is UTF-8, so the bytes 0xC3, 0xA9 are interpreted as two characters rather than the single character 'é'. Some other programs, such as Notepad, do recognize the file as UTF-8 and display the data correctly.
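To make the misreading concrete, here is a minimal Python sketch (Python is used only for illustration and is not part of csvjson). It assumes the viewing program falls back to the Windows-1252 code page, which is a common default but not a given.

```python
# Encode 'é' as UTF-8, then decode the same bytes two different ways.
text = "é"
utf8_bytes = text.encode("utf-8")       # two bytes: 0xC3 0xA9

# Assumption: the viewer treats the file as Windows-1252 (cp1252),
# so each byte is mapped to its own character.
misread = utf8_bytes.decode("cp1252")

# A UTF-8-aware viewer combines both bytes into one character.
correct = utf8_bytes.decode("utf-8")

print(utf8_bytes.hex(" "))
print(misread)
print(correct)
```

The file contents are identical in both cases; only the decoding chosen by the reader differs.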

You may ask: why not automatically insert the BOM (the byte sequence 0xEF, 0xBB, 0xBF) at the top of the file, which would make Excel, WordPad and many other programs recognize the file as UTF-8?

The answer is that many CSV processors balk at a BOM at the beginning of the file and include it in the data, where it looks like a garbage character. That is not universally the case, which is why it may be useful to add a checkbox, "Include BOM", that lets the user ask for a BOM to be added.
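As a rough sketch of what such an option could look like, the Python snippet below encodes CSV text as UTF-8 and prepends the BOM only on request. The function name `encode_csv` and the `include_bom` flag are hypothetical and stand in for the proposed checkbox; this is not the csvjson implementation.

```python
import codecs


def encode_csv(csv_text: str, include_bom: bool = False) -> bytes:
    """Encode CSV text as UTF-8, optionally prefixed with the UTF-8 BOM.

    Hypothetical helper: `include_bom` mirrors the proposed "Include BOM" checkbox.
    """
    data = csv_text.encode("utf-8")
    if include_bom:
        # codecs.BOM_UTF8 is the byte sequence 0xEF, 0xBB, 0xBF
        data = codecs.BOM_UTF8 + data
    return data


sample = "name,city\nRené,Montréal\n"
with_bom = encode_csv(sample, include_bom=True)   # friendlier to Excel
without_bom = encode_csv(sample)                  # safer for strict CSV parsers
```

Whether the flag should default to on or off depends on the audience: on favors spreadsheet users, off favors downstream parsers.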


martindrapeau commented 5 years ago

I've gone ahead and added the BOM character at the beginning as I believe most people use this to work in Excel afterwards. If this becomes an issue, I can add the option.

elbaza1 commented 5 years ago

Thank you @DrorHarari @martindrapeau.

jmappala commented 5 years ago

Hi, I am getting an error now. It was working a little over a week ago.

I saved an iNav SQL data extract as CSV, then used CSVJSON to convert it to JSON.

I ran k6 to do a parallel run and got this: level=error msg="SyntaxError: invalid character 'ï' looking for beginning of value at parse (native)"
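The 'ï' in that message is what the first BOM byte, 0xEF, looks like when shown as a Latin-1 character, so a plausible reading (an assumption, since the failing file is not attached) is that the JSON started with a BOM. The Python sketch below reproduces the same kind of failure with Python's own `json` module, not with k6, and shows one way to tolerate the BOM when reading.

```python
import codecs
import json

# Assumed shape of the failing input: a UTF-8 BOM followed by valid JSON.
payload = codecs.BOM_UTF8 + b'[{"id": 1}]'

# The first byte of the BOM, rendered as Latin-1, is the character in the error.
first_char = bytes([payload[0]]).decode("latin-1")

# Decoding as plain UTF-8 keeps the BOM as U+FEFF at the start of the text,
# which Python's json module refuses to parse.
try:
    json.loads(payload.decode("utf-8"))
    rejected = False
except json.JSONDecodeError:
    rejected = True

# Decoding with "utf-8-sig" drops a leading BOM if one is present.
parsed = json.loads(payload.decode("utf-8-sig"))
```

Consumers that cannot change how they decode the file would need the producer to omit the BOM instead.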

jmappala commented 5 years ago

FYI, I used http://www.convertcsv.com/csv-to-json.htm, and it worked.

martindrapeau commented 5 years ago

Could you try again @jmappala? Should be fixed now.

DrorHarari commented 5 years ago

Adding the BOM automatically can break apps that do not expect it: the BOM would be seen as the first character of the first column name. If the app expects a specific column name, it would not find it (unless it expects the BOM). Hence, while adding the BOM by default suits the expected Excel audience, it may be safer to let the user request a CSV without it (via that checkbox). @martindrapeau - I understand you are waiting for that hypothetical person to stand up 😉, ok.
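A small Python sketch of that failure mode, using the standard `csv` module and made-up sample data (the column names `name` and `city` are illustrative only):

```python
import csv
import io

# Made-up CSV bytes that start with the UTF-8 BOM (0xEF, 0xBB, 0xBF).
raw = b"\xef\xbb\xbfname,city\nRen\xc3\xa9,Montr\xc3\xa9al\n"

# A reader that decodes as plain UTF-8 keeps the BOM inside the first header.
naive = csv.DictReader(io.StringIO(raw.decode("utf-8"), newline=""))
naive_fields = naive.fieldnames

# A reader that decodes as "utf-8-sig" strips the BOM before parsing.
aware = csv.DictReader(io.StringIO(raw.decode("utf-8-sig"), newline=""))
aware_fields = aware.fieldnames

print(naive_fields)
print(aware_fields)
print("name" in naive_fields, "name" in aware_fields)
```

A lookup for the `name` column fails against the first reader even though the header looks identical when printed in most terminals.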

jmappala commented 5 years ago

@martindrapeau, I tried again and it worked. Thanks.