Closed jason-gill closed 8 years ago
Hi, thanks for the report. The default setting expects CSV in the format shown in https://raw.githubusercontent.com/keboola/php-csv/master/tests/Keboola/Csv/_data/escaping.csv, which follows RFC 4180 (https://tools.ietf.org/html/rfc4180). There is no special escape character, only one simple rule:
If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
So your example should look like:
"2016-05-12T08:49:56Z","5348465256756450422","Mozilla/5.0 (Linux; Android 5.0.1; Alba 7"" Tablet Build/LRX22C; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/46.0.2490.76 Safari/537.36","RO","Android","False"
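To illustrate the rule quoted above, here is a minimal sketch using Python's standard csv module (not php-csv). The helper name parse_rfc4180 and the shortened sample record are only assumptions for the illustration. The module's default dialect treats a doubled quote inside a quoted field as one literal quote, which is the RFC 4180 behavior described here.

```python
import csv
import io


def parse_rfc4180(line):
    """Parse one CSV record in which embedded quotes are doubled ("")."""
    # csv.reader uses doublequote=True by default, so "" inside a quoted
    # field is read back as a single literal double quote.
    return next(csv.reader(io.StringIO(line)))


# The example record from RFC 4180.
print(parse_rfc4180('"aaa","b""bb","ccc"'))
# ['aaa', 'b"bb', 'ccc']

# A shortened, hypothetical version of the user-agent record.
print(parse_rfc4180('"RO","Alba 7"" Tablet","False"'))
# ['RO', 'Alba 7" Tablet', 'False']
```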
Thanks for the quick reply. I find it interesting that the docs for fgetcsv show a backslash as the escape character, while there is an RFC that says a quote inside a field should be escaped with another double quote.
Also the example data I provided is coming from a third party, which I have no control over.
I guess the best option in my case is to pass my own escape character to the constructor.
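When the source data cannot be changed, another option is to normalize it before parsing. The sketch below is an illustration in Python's standard csv module, not part of php-csv; the function name backslash_to_rfc4180 and the shortened sample record are hypothetical. It reads records that escape quotes with a backslash and rewrites them with RFC 4180 doubled quotes.

```python
import csv
import io


def backslash_to_rfc4180(text):
    """Re-encode CSV that escapes quotes as \\" into RFC 4180 CSV ("")."""
    src = io.StringIO(text, newline="")
    dst = io.StringIO(newline="")
    # Read: a backslash escapes the next character; "" is not special.
    reader = csv.reader(src, escapechar="\\", doublequote=False)
    # Write: quote every field and double any embedded quote.
    writer = csv.writer(dst, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return dst.getvalue()


# Shortened, hypothetical record in the third-party (backslash) style.
raw = '"RO","Alba 7\\" Tablet","False"\n'
print(backslash_to_rfc4180(raw), end="")
# "RO","Alba 7"" Tablet","False"
```

One caveat with this approach: every backslash in the input is treated as an escape, so data containing literal backslashes (Windows paths, for example) would need separate handling.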
By default, fgetcsv uses "\" as the escape character. The current version of php-csv overrides this with "", which causes parsing errors and unexpected behavior.
Here is a sample CSV that fails to parse correctly:
"2016-05-12T08:49:56Z","5348465256756450422","Mozilla/5.0 (Linux; Android 5.0.1; Alba 7\" Tablet Build/LRX22C; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/46.0.2490.76 Safari/537.36","RO","Android","False"
Here is the code to prove it: