adaltas / node-csv-parse

CSV parsing implementing the Node.js `stream.Transform` API
https://csv.js.org/parse/

Tab delimiter cell with double quotes #68

Closed afreeland closed 5 years ago

afreeland commented 8 years ago

I'm working with the tab delimiter '\t' and running into an issue with cells that contain double quotes.

Example of data

product_sku product_name      brand
3355574001  Plush Dodgers "Puig"    Rally Men

In this example, the cell Plush Dodgers "Puig" does not begin with a double quote, nor does it escape the tab character. The parser throws Error: Invalid opening quote at line 9. Is this a bug, or is it the intended behavior since the cell as a whole is not quoted?

It does work fine if the data looks like this:

product_sku product_name      brand
3355574001  "Plush Dodgers ""Puig"""    Rally Men
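For anyone generating this kind of file, the working form above can be produced mechanically: wrap the cell in double quotes and double every quote inside it. A minimal sketch follows. The names quoteField and toTsvLine are hypothetical helpers written for illustration; they are not part of csv-parse.

```javascript
// Hypothetical helper (not part of csv-parse): wrap a value in double quotes
// and double every quote character found inside it.
function quoteField(value, quote = '"') {
  return quote + String(value).split(quote).join(quote + quote) + quote;
}

// Hypothetical helper: build one tab-delimited record, quoting only the
// cells that contain a quote, a tab, or a line break.
function toTsvLine(fields) {
  return fields
    .map(String)
    .map((cell) => (/["\t\r\n]/.test(cell) ? quoteField(cell) : cell))
    .join('\t');
}

const line = toTsvLine(['3355574001', 'Plush Dodgers "Puig"', 'Rally Men']);
// line === '3355574001\t"Plush Dodgers ""Puig"""\tRally Men'
```

This only helps when you control how the file is written; it does nothing for files that already arrive with unescaped quotes.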

I suspect this is not so much a bug as improper CSV data. Any thoughts?
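If the quotes in a tab-delimited file are only ever literal characters, one possible workaround is to skip quote handling entirely and split on tabs. The sketch below is plain JavaScript and does not use csv-parse; parseTsvLiteral is a hypothetical name, and it assumes that no cell contains a tab or a line break.

```javascript
// Hypothetical helper (not part of csv-parse): parse tab-delimited text while
// treating double quotes as ordinary characters.
// Assumption: cells never contain tabs or line breaks.
function parseTsvLiteral(text) {
  const lines = text.split(/\r?\n/).filter((l) => l.length > 0);
  if (lines.length === 0) return [];
  const header = lines[0].split('\t');
  return lines.slice(1).map((l) => {
    const cells = l.split('\t');
    // Pair each header name with the cell in the same column.
    return Object.fromEntries(header.map((name, i) => [name, cells[i]]));
  });
}

const sample =
  'product_sku\tproduct_name\tbrand\n' +
  '3355574001\tPlush Dodgers "Puig"\tRally Men\n';
const records = parseTsvLiteral(sample);
// records[0].product_name === 'Plush Dodgers "Puig"'
```

The trade-off is that properly quoted cells, such as "Plush Dodgers ""Puig""", would come back with their quotes intact rather than unescaped.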

hbakhtiyor commented 8 years ago

I have the same issue. @wdavidw, any ideas?

wdavidw commented 8 years ago

Definitely not standard CSV data, but there is no official CSV spec. I would suggest writing a test case illustrating the issue and then trying to handle such a case. This might not be an easy task.

hbakhtiyor commented 8 years ago

How about using the relax option?

wdavidw commented 8 years ago

It could work. I say "could" because there may be bugs associated with relax.

kathy-ems commented 6 years ago

I'm having the same issue, except my file is comma-delimited. The error message is Error: Invalid closing quote at line 50534; found "C" instead of delimiter "," and the line it crashes on is "SGP","1"CHROMATOGRAPHY 7701","#","SAP","06/01/2018". This happens because a user typed a double quote as an inch mark instead of writing out "inches".
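To locate rows like this before handing a large file to the parser, a small pre-check can help. The sketch below is plain JavaScript, not csv-parse's own logic; findInvalidClosingQuote is a hypothetical name. It assumes one record per line and the double quote as both quote and escape character, and it reports the first closing quote that is followed by something other than the delimiter or the end of the line.

```javascript
// Hypothetical helper (not part of csv-parse): return the position of the first
// closing quote that is not followed by the delimiter or the end of the line,
// or null when the line has no such quote.
// Assumptions: one record per line; quote and escape are both '"'.
function findInvalidClosingQuote(line, delimiter = ',') {
  let inQuotes = false;
  let atFieldStart = true;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch !== '"') continue;
      const next = line[i + 1];
      if (next === '"') { i++; continue; } // doubled quote is an escaped quote
      if (next === undefined || next === delimiter) { inQuotes = false; continue; }
      return { index: i, found: next };    // quote closes the field too early
    }
    if (ch === delimiter) { atFieldStart = true; continue; }
    if (ch === '"' && atFieldStart) inQuotes = true; // quote opens a field
    atFieldStart = false;
  }
  return null;
}

const bad = '"SGP","1"CHROMATOGRAPHY 7701","#","SAP","06/01/2018"';
const problem = findInvalidClosingQuote(bad);
// problem is { index: 8, found: 'C' }
```

Running this over each line and logging the line number when the result is not null gives a list of rows to fix at the source. It does not cover the "Invalid opening quote" case, where a quote appears inside an unquoted cell.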

I thought this error wouldn't occur because of the parser's escape option, which I understood to handle double quotes inside a string. From the docs: escape (char): Set the escape character. One character only. Defaults to double quote.

Adding {relax: true} removed the error. The docs describe it as: relax (boolean): Preserve quotes inside unquoted field (be warned, it doesn't make coffee). However, many threads I found while researching this error say relax is buggy, so I'm concerned about using it.

wdavidw commented 6 years ago

Honestly, I'm questioning how we could support a cell like "1"CHROMATOGRAPHY 7701", which is in complete contradiction with the CSV format. Feel free to report issues with relax. Many of the issues related to this option come from people expecting much more of it than its original purpose.

kathy-ems commented 6 years ago

@wdavidw Thank you for the information.

wdavidw commented 5 years ago

Closing due to lack of activity. Feel free to open a new issue to initiate a new discussion around this topic.