Closed pyvandenbussche closed 10 years ago
What do you expect to happen and what do you see happening?
**Input line:** "Kips Bay Medical, Inc.",0001460198,3841,AYOKZPWQTYDPHIZQ4N06,KIPS
**Expected split:** Kips Bay Medical, Inc. 0001460198 3841 AYOKZPWQTYDPHIZQ4N06 KIPS
**Current split:** Kips Bay Medical Inc.",0001460198,3841,AYOKZPWQTYDPHIZQ4N06,KIPS [...until you have another starting quote...]
I'm not sure I understand the RFC correctly: http://tools.ietf.org/html/rfc4180
I see three possible explanations:
1. Once you use quotes to escape the content of one field, you have to quote all fields systematically.
2. A field needs doubled quotes to be escaped properly.
3. The program doesn't handle quote escaping properly.
I agree with your expected split, so it seems your third explanation applies. That is odd: we use a mature off-the-shelf CSV parser, and I would expect it to handle this correctly.
I will dig into this when I'm back in the office (Friday).
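For reference, the quoting behavior described in RFC 4180 can be checked independently of Tarql. The following is a minimal sketch using Python's standard `csv` module, which is not the parser Tarql uses; the helper name `split_csv_line` is made up for illustration. It suggests that explanations 1 and 2 do not hold: quoted and unquoted fields can be mixed in one record, and doubled quotes are only needed when a quote character appears inside a quoted field.

```python
import csv
import io

# The input line from the report, with only the first field quoted.
line = '"Kips Bay Medical, Inc.",0001460198,3841,AYOKZPWQTYDPHIZQ4N06,KIPS\n'


def split_csv_line(text):
    """Parse one CSV record using Python's csv module (hypothetical helper)."""
    return next(csv.reader(io.StringIO(text)))


fields = split_csv_line(line)
print(fields)
# ['Kips Bay Medical, Inc.', '0001460198', '3841', 'AYOKZPWQTYDPHIZQ4N06', 'KIPS']

# Doubled quotes are only required for a literal quote inside a quoted field.
print(split_csv_line('"a ""b"" c",d\n'))
# ['a "b" c', 'd']
```

If a conforming parser yields five fields here, a different split points at the parser configuration or at the input file itself (for example, an unbalanced quote on an earlier line) rather than at the CSV syntax of this line.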
I cannot reproduce this.
This is test.csv:
"Kips Bay Medical, Inc.",0001460198,3841,AYOKZPWQTYDPHIZQ4N06,KIPS
This is test.sparql:
CONSTRUCT {}
FROM <test.csv>
WHERE {}
This is the result of running bin/tarql --test test.sparql:
{ }
--------------------------------------------------------------------------------------
| a | b | c | d | e |
======================================================================================
| "Kips Bay Medical, Inc." | "0001460198" | "3841" | "AYOKZPWQTYDPHIZQ4N06" | "KIPS" |
--------------------------------------------------------------------------------------
Can you share the actual file and mapping with me (via email)?
Can't reproduce.
For instance, the following line will cause some trouble: "Kips Bay Medical, Inc.",0001460198,3841,AYOKZPWQTYDPHIZQ4N06,KIPS