Closed heldersilva closed 11 years ago
I encountered the same issue
Same issue here as well. smarter_csv breaks on any fields that contain commas that are within quotes in a given field.
yes, that is a bug! thank you for letting me know!
Just ran into this problem now, tried various combinations of formatting the csv file and changing the field separator for example :col_sep => "\t"
Any fix or patch available ?
I'm looking into it.
Note to self:
quote_char = '"' => """ get_quoted_strings = %r{#{quotechar}((?:.|.)?)#{quotechar}} => "((?:.|.)?)"
quoted_strings = line.scan(get_quoted_strings).flatten
Tilo, I don't know if it helps, but I think looking to the LibreOffice source code might help you come up with a regular expression to deal with this.
I'll also try to look into it, but I have limited time at the moment =/
I was having so much fun until I hit this wall. Pretty nasty bug! Any progress?
yes,i made some progress ... I'll update this weekend.
Hate to rush you :)
Sent you a PR on that commit...
@digerata unfortunately the fix does only work if all fields are quoted - but as per CSV specification, fields can be mixed - some quoted, some not quoted - this makes it a bit more difficult
Maybe that is useful but I'm not smart enough to figure out why. How old is that spec? What modern app does that these days? I can tell you right now that I'm seeing commas in quoted fields 100% more than I'm seeing mixed quoted fields.
You might as well include the patch until you figure out a solution to the spec because right now SmarterCSV does not work for a very common use case.
On May 17, 2013, at 4:04 AM, Tilo notifications@github.com wrote:
@digerata unfortunately the fix does only work if all fields are quoted - but as per CSV specification, fields can be mixed - some quoted, some not quoted - this makes it a bit more difficult
— Reply to this email directly or view it on GitHub.
apparently some versions of Excel quote only fields which need to be quoted -- and leaves other fields unquoted.
I agree with you. Let's just add your fix for now, and deal with the other case later.
fixed in 1.0.6
Thanks Tilo =D
I think the line 72 of smarter_csv.rb is wrong. If you have a separator inside a quoted text it will split the text incorrectly.
Maybe I'm seeing this wrong, but check the example below and see if what I'm saying makes any sense.
value, text, other_value 1,"sffdfd, sdfdfsf",3
the resulting Hash is going to be:
value => 1, text =>"\"sffdfd\"", other_value => "\"sdfdfsf\""
You can use this test using the default options.