davidsantiago / clojure-csv

A library for reading and writing CSV files from Clojure

Parse errors in values with double quotes #3

Closed kumarshantanu closed 13 years ago

kumarshantanu commented 13 years ago

Hi,

I noticed this bug in versions 1.2.0 and 1.2.1. Here is the scenario:

Sample row (with semicolon delimiter): 120030;BLACK COD FILET MET VEL "MSC";KG;0;1;Nee;23-03-09;1070;BTWLAAG;STUK;1,06;33,1;VERS;Ja;Nee;1000

Expected: ["120030" "BLACK COD FILET MET VEL \"MSC\"" "KG" "0" "1" "Nee" "23-03-09" "1070" "BTWLAAG" "STUK" "1,06" "33,1" "VERS" "Ja" "Nee" "1000"]

Actual: ["120030" "BLACK COD FILET MET VEL \"SC\"KG" "0" "1" "Nee" "23-03-09" "1070" "BTWLAAG" "STUK" "1,06" "33,1" "VERS" "Ja" "Nee" "1000"]

Regards, Shantanu

davidsantiago commented 13 years ago

I don't think this is a bug in the parsing... The file is malformed. In CSV, a field that contains the delimiter (a comma by default), a newline, or a double quote must itself be quoted. That doesn't appear to be the case in your input.
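
To illustrate the quoting rule, here is a small sketch using Python's standard `csv` module (not clojure-csv). It writes a shortened version of the row from the report the way the usual CSV convention expects: the field containing double quotes is itself wrapped in quotes, and each embedded quote is doubled. The four-field row is an assumption made for brevity.

```python
import csv
import io

# Shortened version of the reported row; the second field contains literal quotes.
row = ["120030", 'BLACK COD FILET MET VEL "MSC"', "KG", "0"]

buf = io.StringIO()
writer = csv.writer(buf, delimiter=";", quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
writer.writerow(row)
well_formed = buf.getvalue()
# well_formed is: 120030;"BLACK COD FILET MET VEL ""MSC""";KG;0

# Reading the well-formed line back recovers the original fields.
parsed = next(csv.reader(io.StringIO(well_formed), delimiter=";"))
```

A file written this way should parse the same under strict and non-strict settings, since nothing in it is ambiguous.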

David

davidsantiago commented 13 years ago

Actually, I went to add an option to let this parse correctly when strict is off, and I saw that there's already code for that. It looks like the bug is in that code.

davidsantiago commented 13 years ago

OK, I checked in a fix for this and added a test based on your example for the future.

Also, note that although that'll parse now, those quotes do not turn quoting on; they are treated as literal quote characters. I understand that you probably don't control the file you're trying to read in, but if it doesn't follow the spec, it's not clear exactly how to handle things either way, so you're living dangerously whichever way you go.
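
The behaviour described here can be sketched roughly as follows. This is a minimal Python illustration, not the clojure-csv implementation: the function name `parse_row_lenient` is hypothetical, and it assumes that a double quote opens a quoted field only when it is the first character of that field, while a quote appearing later in an unquoted field is kept as a literal character.

```python
def parse_row_lenient(line, delimiter=";"):
    """Split one CSV line, treating mid-field quotes as literal characters."""
    fields, buf = [], []
    in_quotes = False
    at_field_start = True
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == '"' and line[i + 1:i + 2] == '"':
                buf.append('"')   # doubled quote inside a quoted field
                i += 1
            elif ch == '"':
                in_quotes = False  # closing quote
            else:
                buf.append(ch)     # delimiters are plain text while quoted
        elif ch == delimiter:
            fields.append("".join(buf))
            buf = []
            at_field_start = True
            i += 1
            continue
        elif ch == '"' and at_field_start:
            in_quotes = True       # a quote opens quoting only at field start
        else:
            buf.append(ch)         # includes literal quotes in mid-field
        at_field_start = False
        i += 1
    fields.append("".join(buf))
    return fields


sample = '120030;BLACK COD FILET MET VEL "MSC";KG;0'
result = parse_row_lenient(sample)
# result == ['120030', 'BLACK COD FILET MET VEL "MSC"', 'KG', '0']
```

Under this assumption the quotes around MSC stay in the value, and a delimiter inside them would still split the field, which is why input like this remains risky.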

Thanks for the report, let me know if you have any further problems.

kumarshantanu commented 13 years ago

It works fine now. Thanks!