Closed yhager closed 7 years ago
Hi,
thanks for the bug report.
I agree that the library should SegFault. However, is
"unescaped "quotes"","1","2"
not illformed and an exception should be thrown?
Best Regards Ben Strasser
On 11/16/2016 04:19 PM, Yuval Hager wrote:
When using double_quotes_escape, an unescaped quote can lead to an overrun if it appears at the end of the string.
For example, with the following code(test.cpp):
|#include
#include "csv.h" int main(){ io::CSVReader<3, io::trim_chars<' ', '\t'>, io::double_quote_escape<',', '\"'>> in("test.csv"); in.read_header(io::ignore_extra_column, "vendor", "size", "speed"); std::string vendor; int size; double speed; while(in.read_row(vendor, size, speed)){ std::cout << "Vendor: " << vendor << ", size: " << size <<", speed: " << speed << std::endl; // do stuff with the data } } | And the following csv file(test.csv):
|"vendor","size","speed" "escaped ""quotes""","1","2" "unescaped "quotes"","1","2" |
I get:
|$ clang++ -std=c++11 test.cpp -lpthread -o test && ./test Vendor: escaped "quotes", size: 1, speed: 2 Segmentation fault (core dumped) |
With the proposed fix:
|$ clang++ -std=c++11 test.cpp -lpthread -o test && ./test Vendor: escaped "quotes", size: 1, speed: 2 Vendor: unescaped "quotes", size: 1, speed: 2 |
You can view, comment on, or merge this pull request online at:
https://github.com/ben-strasser/fast-cpp-csv-parser/pull/32
Commit Summary
do not overrun when quotes are unescaped
File Changes
M csv.h https://github.com/ben-strasser/fast-cpp-csv-parser/pull/32/files#diff-0 (2)
Patch Links:
- https://github.com/ben-strasser/fast-cpp-csv-parser/pull/32.patch
- https://github.com/ben-strasser/fast-cpp-csv-parser/pull/32.diff
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/ben-strasser/fast-cpp-csv-parser/pull/32, or mute the thread https://github.com/notifications/unsubscribe-auth/ALaAj6Cnfv6tMG8LycGLLmg4NFWWIuzqks5q-x8KgaJpZM4Kz_NK.
Possibly. But the fix was so easy, so might as well let it through (unless you see a problem with the code)
Hi,
the problem is that if I accept the patch as is now and change the behavior later to throwing an exception, then the later change will be a breaking change. :/
However, as I already said, a seg fault is not acceptable. Sometime has to change. I'm just not sure what yet.
Best Regards Ben
On 11/17/2016 01:45 PM, Yuval Hager wrote:
Possibly. But the fix was so easy, so might as well let it through (unless you see a problem with the code)
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/ben-strasser/fast-cpp-csv-parser/pull/32#issuecomment-261237874, or mute the thread https://github.com/notifications/unsubscribe-auth/ALaAjzMUiuq4C_eaDPQqxXtVJdUIDPXSks5q_Ex3gaJpZM4Kz_NK.
That's correct. Let me know if you want me to try and modify the patch so that it throws an exception instead. For my case, I need to keep parsing other lines, even if the current line is ill formed.
Hi,
I merged the patch.
The reason is that currently strings such as "aa"aa" is read as aa"aa even though the quotation mark is not escaped. One can argue that this should throw an exception. However, it currently does not and throwing an exception would therefore potentially break existing code. Throwing an exception is therefore of the table and your patch is the only remaining solution that I can see.
Best Regards & Thanks for the patch Ben Strasser
When using double_quotes_escape, an unescaped quote can lead to an overrun if it appears at the end of the string.
For example, with the following code(test.cpp):
And the following csv file(test.csv):
I get:
With the proposed fix: