parsecsv / parsecsv-for-php

CSV data parser for PHP.
MIT License

Enclosed cell followed by space then delimiter causes syntax error #210

Closed fivebillionmph closed 2 months ago

fivebillionmph commented 2 years ago

The following CSV (note the space before the comma after the first element in each row) results in an error when parsed with the following script.

"Name" ,"Address", "Phone"
John ,"123", "456"

$filename = "files-test/space-before-comma.csv";
$csv = new parseCSV();
$csv->delimiter = ",";
$csv->parseFile($filename);

The error I get when parsing is:

Syntax error found on row 1. Non-enclosed fields can not contain double-quotes.

This also results in the two cells getting concatenated together like this: Name"Address"

This is only an issue when the cell preceding the delimiter is enclosed in quotes. The second line of the attached CSV does not produce a syntax error, and its cells are not concatenated. Should the first line be parsed the same way as the second line?

Example CSV file: space-before-comma.csv

Thanks for your help

jimeh commented 2 years ago

Technically, your example is not valid CSV, but I believe it should be handled more gracefully with a warning, rather than an outright error.

A few years ago I put together a draft spec for csv-spec.org. Clause 9 is relevant:

  1. When a field enclosed in double quotes has spaces before and/or after the double quotes, the spaces MUST be ignored, as the field starts and ends with the double quotes. However this is considered invalid formatting and the CSV parser SHOULD report some form of warning message.
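
To make the clause concrete, here is a minimal sketch of how a lenient field splitter could behave. It is not the parsecsv-for-php implementation: `splitLenientCsvLine()` is a hypothetical helper, it handles a single line only, and it assumes a single-character delimiter and enclosure. Spaces outside the quotes of an enclosed field are ignored and reported as warnings, while non-enclosed fields are kept verbatim.

```php
<?php
// Hypothetical helper for illustration only; not part of parsecsv-for-php.
// Splits ONE line into fields and collects warnings instead of failing.
function splitLenientCsvLine(string $line, string $delimiter = ',', string $enclosure = '"'): array
{
    $fields = [];
    $warnings = [];
    $length = strlen($line);
    $pos = 0;

    while (true) {
        // Look past leading spaces to see whether the field is enclosed.
        $start = $pos;
        while ($pos < $length && $line[$pos] === ' ') {
            $pos++;
        }

        if ($pos < $length && $line[$pos] === $enclosure) {
            $padded = $pos > $start; // spaces before the opening quote
            $pos++;                  // skip the opening quote
            $value = '';
            while ($pos < $length) {
                if ($line[$pos] === $enclosure) {
                    if ($pos + 1 < $length && $line[$pos + 1] === $enclosure) {
                        $value .= $enclosure; // doubled quote is a literal quote
                        $pos += 2;
                        continue;
                    }
                    $pos++; // skip the closing quote
                    break;
                }
                $value .= $line[$pos++];
            }
            // Ignore spaces between the closing quote and the delimiter.
            while ($pos < $length && $line[$pos] === ' ') {
                $padded = true;
                $pos++;
            }
            if ($padded) {
                $warnings[] = sprintf('Field %d: spaces outside the quotes were ignored.', count($fields) + 1);
            }
            // Drop any other stray characters before the next delimiter.
            $next = strpos($line, $delimiter, $pos);
            $next = ($next === false) ? $length : $next;
            if ($next > $pos) {
                $warnings[] = sprintf('Field %d: characters after the closing quote were dropped.', count($fields) + 1);
            }
            $pos = $next;
        } else {
            // Non-enclosed field: keep everything up to the delimiter verbatim.
            $end = strpos($line, $delimiter, $start);
            $end = ($end === false) ? $length : $end;
            $value = (string) substr($line, $start, $end - $start);
            $pos = $end;
        }

        $fields[] = $value;

        if ($pos >= $length) {
            break;
        }
        $pos++; // step over the delimiter
    }

    return ['fields' => $fields, 'warnings' => $warnings];
}

// The two rows from the report above.
$header = splitLenientCsvLine('"Name" ,"Address", "Phone"');
$row    = splitLenientCsvLine('John ,"123", "456"');
print_r($header);
print_r($row);
```

With this approach the header row yields the three fields Name, Address and Phone plus two warnings (fields 1 and 3), and the second row keeps `John ` with its trailing space because that field is not enclosed.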

gogowitsch commented 2 months ago

I'm closing this issue now, as parsing broken files seems like a bit of a slippery slope.