Closed bryanboettcher closed 1 month ago
@bryanboettcher thanks again for the nice repo and for filing this issue. This behavior is by design. If you look in the README at https://github.com/nietras/Sep?tab=readme-ov-file#unescaping, this is covered by the "a"· case, and it is identical to CsvHelper's behavior with errors turned off. Note also that Sylvan simply throws in that case. The column is invalidly defined with that space after the quote, so there is no true answer here.
| Input | Valid | CsvHelper | CsvHelper¹ | Sylvan | Sep² |
|---|---|---|---|---|---|
| "a"· | False | EXCEPTION | a· | EXCEPTION | a· |
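The two outcomes in that row (keep the trailing space, or throw) are not specific to the .NET libraries listed. As a rough illustration only, here is the same input run through Python's standard `csv` module, which is a different parser and not part of the comparison above. The helper name `first_row` is made up for this sketch; `·` in the table stands for a space.

```python
import csv
import io


def first_row(text: str, strict: bool) -> list[str]:
    """Parse `text` with Python's csv module and return its first row."""
    reader = csv.reader(io.StringIO(text), strict=strict)
    return next(reader)


# A quoted column followed by one space before the line ends.
LINE = '"a" \n'

# Lenient mode (the default): the stray space is appended to the value.
print(repr(first_row(LINE, strict=False)[0]))  # 'a '

# Strict mode: the same input is rejected, much like the EXCEPTION cells.
try:
    first_row(LINE, strict=True)
except csv.Error as exc:
    print("rejected:", exc)
```

Neither result is "the" correct one, which matches the point that the input itself is invalid.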
Trimming before unescaping could help, and trimming is tracked in #74. It is not straightforward, though: some want trimming before unescaping, some want it after unescaping, some want both, and so on. And yes, trimming will likely impact perf.
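To make the ordering question concrete, here is a small Python sketch. It is not Sep's implementation: `unescape` is a toy stand-in that drops the enclosing quotes, collapses `""` to `"`, and keeps whatever follows the closing quote, and the three `trim_*` helpers are hypothetical names for the options being discussed.

```python
def unescape(col: str) -> str:
    """Toy unescape: strip enclosing quotes, keep text after the closing quote."""
    if not col.startswith('"'):
        return col
    out = []
    in_quotes = True
    i = 1  # skip the opening quote
    while i < len(col):
        ch = col[i]
        if ch == '"':
            # A doubled quote inside the quoted part is an escaped quote.
            if in_quotes and col[i + 1 : i + 2] == '"':
                out.append('"')
                i += 2
                continue
            in_quotes = not in_quotes  # drop the quote itself
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def trim_before(col: str) -> str:
    """Trim outer spaces first, then unescape: spaces inside the quotes survive."""
    return unescape(col.strip(" "))


def trim_after(col: str) -> str:
    """Unescape first, then trim: spaces inside the quotes are removed too."""
    return unescape(col).strip(" ")


def trim_both(col: str) -> str:
    """Trim on both sides of the unescape step."""
    return unescape(col.strip(" ")).strip(" ")


for raw in ['"ZIP" ', '" a " ']:
    print(repr(raw), repr(unescape(raw)), repr(trim_before(raw)),
          repr(trim_after(raw)), repr(trim_both(raw)))
```

For `"ZIP" ` all three trimming orders give `ZIP`, but for `" a " ` trimming before unescaping keeps the inner spaces while trimming after removes them, which is why a single "trim" option is hard to define.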
Would trimming solve your issue here? What kind of trimming would you prefer? How would you want the option for it to look?
Trimming would solve my issue here. My expectation is that if I have quote parsing and escaping on, that everything between the quotes would be considered the column name, regardless of whatever whitespace is outside the quotes.
What a mess. I don't envy you having to solve that. (but I'd like for you to! 😍)
Closing as tracked by #74 :)
Hi @nietras, me again 👋
We have a CSV with the following headers:
"FIRSTNAME","LASTNAME","ADDRESS","CITY","STATE","ZIP"
Look very carefully -- the row has a single space after the closing quote after "ZIP". When parsing, the header for the last column is named "ZIP ", with the trailing space. IMO, this behavior is incorrect, since the column value was correctly enclosed in quotes.
Minimal repro with output:
If you try and run this repro, make sure the FileData const has a single space after the closing quote on "ZIP". I'm not sure if GitHub markdown will strip that out or not.