Open zioth opened 6 years ago
Correction: The file was exported by macOS Numbers, not Excel.
@zioth would you be able to share an example file here with a few problem lines and personal information removed?
Does JavaScript support characters that are more than 2 bytes? https://stackoverflow.com/questions/2219526/how-many-bytes-in-a-javascript-string
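For what it's worth, here is a quick sketch of how JavaScript strings treat a character above U+FFFF. Strings are sequences of UTF-16 code units, so such a character is stored as a surrogate pair. The variable names are just for illustration.

```javascript
// U+1F600 lies outside the Basic Multilingual Plane (above U+FFFF).
const grin = String.fromCodePoint(0x1F600);

// .length counts UTF-16 code units, so this single character reports 2.
const codeUnits = grin.length;

// codePointAt(0) reads the whole code point from the surrogate pair.
const codePoint = grin.codePointAt(0);

// Spreading a string iterates by code point, not by code unit.
const codePoints = [...grin].length;

console.log(codeUnits, codePoint.toString(16), codePoints); // prints: 2 1f600 1
```

So the language can represent these characters; the open question is only how the bytes of the file get turned into a string in the first place.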
In order for PapaParse to handle characters past U+FFFF, I think it would have to read all four bytes of each UTF-32 character itself and combine them into a single code point.
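A minimal sketch of what combining the four bytes could look like, assuming the input is available as a `Uint8Array`. `decodeUtf32` is a hypothetical helper, not part of PapaParse, and the little-endian default is an assumption that a byte order mark overrides when present.

```javascript
// Hypothetical helper (not a PapaParse API): decode UTF-32 bytes into a JS string.
function decodeUtf32(bytes, littleEndian = true) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let offset = 0;

  // A leading byte order mark (U+FEFF) tells us the byte order; skip it if found.
  if (bytes.byteLength >= 4) {
    if (view.getUint32(0, true) === 0xFEFF) {
      littleEndian = true;
      offset = 4;
    } else if (view.getUint32(0, false) === 0xFEFF) {
      littleEndian = false;
      offset = 4;
    }
  }

  let out = "";
  // Read one 32-bit code point per step; trailing bytes that do not fill
  // a whole 4-byte unit are ignored.
  for (; offset + 4 <= bytes.byteLength; offset += 4) {
    // String.fromCodePoint throws a RangeError for values above 0x10FFFF.
    out += String.fromCodePoint(view.getUint32(offset, littleEndian));
  }
  return out;
}
```

The decoded string could then be handed to the parser as ordinary text, which sidesteps the embedded nulls entirely.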
Someone gave me a UTF-32 CSV exported from Excel. The export included lots of embedded nulls, which are normal in UTF-32, but which PapaParse didn't strip out. It also had a URL with a leading newline. The value was quoted in the export, but PapaParse didn't strip the quotes.
The same string (with the quote-newline-url-quote pattern) works fine in ASCII. It only fails in UTF-32. This is true whether I set the "encoding" option or not.
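To illustrate where the nulls come from, here is a small sketch that encodes a quoted field as UTF-32LE and then reads the bytes back one character per byte. Both helpers and the `example.com` URL are made up for the demonstration. Whether this is exactly what happens inside PapaParse is an assumption on my part.

```javascript
// Hypothetical helper: encode a string as UTF-32LE bytes (no byte order mark).
function encodeUtf32LE(text) {
  const codePoints = Array.from(text, (ch) => ch.codePointAt(0));
  const bytes = new Uint8Array(codePoints.length * 4);
  const view = new DataView(bytes.buffer);
  codePoints.forEach((cp, i) => view.setUint32(i * 4, cp, true));
  return bytes;
}

// Hypothetical helper: treat every byte as its own character,
// as a single-byte decode of the file would.
function readBytesAsChars(bytes) {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

// Placeholder field with the quote-newline-url-quote pattern.
const field = '"\nhttp://example.com"';
const misread = readBytesAsChars(encodeUtf32LE(field));

// Each ASCII character is now followed by three NUL characters.
console.log(JSON.stringify(misread.slice(0, 8)));
```

In the misread string the opening quote is followed by nulls rather than the newline, and the closing quote is followed by nulls rather than a delimiter or line end, which might explain why the quotes are not recognized as field quoting.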