ben-strasser / fast-cpp-csv-parser

fast-cpp-csv-parser
BSD 3-Clause "New" or "Revised" License
What do I need to Change? #81

Closed Icesythe7 closed 4 years ago

Icesythe7 commented 5 years ago

I have the following csv file https://pastebin.com/ZUKBNEr7

I am currently parsing it like so:

    io::CSVReader<2, io::trim_chars<' ', '\t'>, io::double_quote_escape<',', '"'>> in("mount30993.csv");
    in.read_header(io::ignore_extra_column, "Name_lang");
    std::string Name_lang;
    while (in.read_row(Name_lang)) {
        std::cout << Name_lang << "\n";
    }

However, it crashes after printing the name on line 373. The real file is much larger, and through a lot of trial and error I found that I can work around this by manually fixing the newlines after that point. That is cumbersome, though: the file updates weekly, and fixing all the newlines once a week is exhausting. Is there anything I could change in the source code, or an option I can use, so that the parser simply reads this file without me having to alter it?

ben-strasser commented 5 years ago

Hi,

when I run your program (after changing the 2 column parameter to a 1) it fails with the following message:

terminate called after throwing an instance of 'io::error::escaped_string_not_closed'
  what(): Escaped string was not closed in line 374 in file "my.csv"

The 374th line is

"Yu'lei, Daughter of Jade","|cFFFFD200Vendor: |rMistweaver Xia

where indeed a string is not closed.

It seems like you are escaping newlines using |n. However, as far as I can tell, a few newlines at this spot in the file were not escaped.

Best Regards Dr. Ben Strasser
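The diagnosis above (a line that ends while a double-quoted string is still open) can be checked without the parser. The following is only a minimal sketch: `first_unclosed_quote_line` is a hypothetical helper, not part of fast-cpp-csv-parser, and it assumes that `"` is the only quote character and that a doubled `""` is the escape for a literal quote.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical diagnostic helper (not part of fast-cpp-csv-parser).
// Returns the 1-based number of the first physical line that ends while a
// double-quoted string is still open, or 0 if every line is balanced.
// Each line is examined on its own, which is how a line-based reader sees it.
std::size_t first_unclosed_quote_line(std::istream& in) {
    std::string line;
    std::size_t line_no = 0;
    while (std::getline(in, line)) {
        ++line_no;
        bool in_quotes = false;
        for (char c : line) {
            // A doubled "" toggles twice, so it leaves the state unchanged.
            if (c == '"') in_quotes = !in_quotes;
        }
        if (in_quotes) return line_no;
    }
    return 0;
}
```

Run over a file opened with `std::ifstream`, this should point at the same kind of line as the error message, assuming the file follows the quoting convention stated above.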

Icesythe7 commented 5 years ago

Yes, and my question is: is there anything I can change in the source code of the parser so that it does not error here and continues as normal? The Lua parser at https://github.com/geoffleyland/lua-csv has no issue with this file as it is, and I am trying to find a C++ parser that can also read it without me having to modify the file.

The file is generated by the "download csv" button at https://wow.tools/dbc/?dbc=mount&build=8.2.0.30993#search=&page=1, in case you want to test the entire file. Also, thank you for replying so fast and taking the time to test this out.

Also, the |n is not an escape sequence. In column 2, for example, the string is literally |cFFFFD200Vendor: |rMilli Featherwhistle|n|cFFFFD200Zone: |rDun Morogh|n|cFFFFD200Cost: |r10|TINTERFACE\MONEYFRAME\UI-GOLDICON.BLP:0|t; it is an in-game link for a game. I pasted your response to the owner of that website, since his website generates the CSV, and he said the following:

|n is just WoW tooltip stuff. \n's inside of strings are fine in theory, but most parsers won't accept it.
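One possible way to read such a file without editing it by hand is to preprocess it in memory before parsing. The sketch below is an assumption-laden illustration, not a feature of fast-cpp-csv-parser: `join_quoted_newlines` is a hypothetical helper that merges physical lines until the double quotes balance, so that each output line holds one complete record. The choice of `|n` as the replacement for an embedded newline is also an assumption, picked only because the file already uses it in its tooltip strings.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of fast-cpp-csv-parser).
// Merges physical lines until double quotes are balanced, replacing each
// newline that falls inside a quoted field with `replacement`.
// Assumes '"' is the quote character and "" is the escaped quote.
std::string join_quoted_newlines(std::istream& in,
                                 const std::string& replacement = "|n") {
    std::string out;
    std::string record;
    std::string line;
    bool in_quotes = false;
    while (std::getline(in, line)) {
        // Drop a trailing carriage return from Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        for (char c : line) {
            if (c == '"') in_quotes = !in_quotes;
        }
        record += line;
        if (in_quotes) {
            // The field continues on the next physical line.
            record += replacement;
        } else {
            out += record;
            out += '\n';
            record.clear();
        }
    }
    // A quote left open at end of input: emit what was collected as-is.
    if (!record.empty()) {
        out += record;
        out += '\n';
    }
    return out;
}
```

The result could be written to a temporary file and opened with `io::CSVReader` as before. Whether the reader can also consume an in-memory stream directly is worth checking in the library's README rather than assuming it here.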