Open IAmWhitBran opened 1 year ago
Puh, for a second you got me...
As in the original issue
When the delimiter is `, ` (a comma followed by a space) and a quoted cell contains `,` or `, `, the read function will break the cell at `,` or `, `.
If you change the delimiter to `,` it works as expected.
Huh, wow, didn't realise that, even after reading the original issue...
Is there a fix for that that can be implemented at all?
While that does stop this from happening as a workaround, it feels like there is still an issue here.
I'll add an option `readOption_delimitersToGuess` where you can manually specify the delimiters that should be guessed.
If you add `, ` (`,` + whitespace) before `,` in the list, it may work... at least it works for this example.
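To illustrate why the order of the candidates could matter, here is a minimal sketch. It is not the extension's or Papa Parse's actual guessing logic: `guess_delimiter` and `split_line` are hypothetical helpers, the guess simply takes the first candidate that occurs in the line, and quotes are only recognised at the start of a field (doubled quotes are not handled).

```python
def split_line(line: str, delimiter: str, quote: str = '"') -> list[str]:
    """Split one line on `delimiter`; a quote only opens a quoted field at field start."""
    fields: list[str] = []
    current: list[str] = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == quote:
                in_quotes = False  # closing quote, not part of the value
            else:
                current.append(ch)
            i += 1
        elif ch == quote and not current:
            in_quotes = True  # opening quote at the very start of a field
            i += 1
        elif line.startswith(delimiter, i):
            fields.append("".join(current))
            current = []
            i += len(delimiter)
        else:
            current.append(ch)  # includes quotes that appear mid-field
            i += 1
    fields.append("".join(current))
    return fields


def guess_delimiter(line: str, candidates: list[str]) -> str:
    """Hypothetical guess: the first candidate found in the line wins."""
    for candidate in candidates:
        if candidate in line:
            return candidate
    return candidates[-1]


line = 'x, "y,y", z'

first = guess_delimiter(line, [", ", ","])   # ", " is tried first
second = guess_delimiter(line, [",", ", "])  # "," is tried first

print(split_line(line, first))   # ['x', 'y,y', 'z']
print(split_line(line, second))  # ['x', ' "y', 'y"', ' z']
```

With `, ` listed first, the quote sits at the start of the second field and is honoured; with `,` first, the leading space hides the quote and the cell is split.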
As an alternative, would something like an "ignore unquoted whitespace" flag be possible?
I believe that, in the context of a CSV, whitespace does not need to be respected unless it is explicitly placed inside a quoted string. A flag would keep the current behaviour available, without users having to guess all the possible delimiters found in unclean data. Leaving it disabled by default would give full backwards compatibility unless it is set to true.
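For comparison only, Python's standard `csv` module has a flag of this kind, `skipinitialspace`. The sketch below uses that module, not the extension or Papa Parse, to show what such a flag changes for a line with a space after each comma.

```python
import csv

line = 'x, "y,y", z'

# Default: the space after the comma starts the field, so the quote is
# treated as an ordinary character and the cell is split at its comma.
default_row = next(csv.reader([line]))

# With skipinitialspace=True, whitespace right after the delimiter is
# dropped, so the quote is seen at the start of the field and honoured.
trimmed_row = next(csv.reader([line], skipinitialspace=True))

print(default_row)  # ['x', ' "y', 'y"', ' z']
print(trimmed_row)  # ['x', 'y,y', 'z']
```

The flag is off by default in Python as well, which matches the backwards-compatible behaviour suggested above.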
Actually, whitespace is handled in the CSV RFC (see the issue in papaparse) and should not be ignored...
Such a flag might be interesting but there might be some implications on the parsing side. Maybe at some point.
I have added some reasons why this is not implemented at https://github.com/janisdd/vscode-edit-csv/blob/master/docs/quotes.md
Originally posted by @PixelKnot in https://github.com/janisdd/vscode-edit-csv/issues/58#issuecomment-1607777523
I added this as a comment on a closed issue, then figured it could probably do with being its own new one.
I am still seeing this exact behaviour on v0.7.6.
This CSV has 3 columns with the following values: `x`, `y,y`, `z`.
The plugin is ignoring `"` as a quote character and reads `x, "y,y", z` as having 4 columns.