Closed bradkrane closed 2 years ago
Could you share a CSV file that reproduces this problem?
I can, if there is a way to do so privately and I can trust you to destroy the information afterwards. Unfortunately, the CSV file is full of personally identifiable information: name, address, email, phone number, and some other info.
After some more testing, it could be the UTF-8 BOM: after running head file.csv > trunc.csv (as opposed to Save As in Notepad++), the error is preserved. I'll see if I can replace the personal information and get you a copy that still has the error.
Actually, I was able to reproduce the failure with the first line alone, using head -1 > trunc3.csv on the original file. Please see the attached.
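One way to confirm the BOM suspicion without relying on an editor is to look at the first three bytes of the file. This is only a minimal sketch: the helper name utf8_bom? and the constant UTF8_BOM are made up for illustration, and the check covers the UTF-8 BOM only (bytes EF BB BF), not the UTF-16 or UTF-32 variants.

```ruby
# The UTF-8 byte order mark as raw bytes.
UTF8_BOM = "\xEF\xBB\xBF".b

# Hypothetical helper: true when the file starts with a UTF-8 BOM.
# File.binread reads raw bytes, so no encoding conversion gets in the way.
def utf8_bom?(path)
  File.binread(path, 3) == UTF8_BOM
end
```

Calling utf8_bom?('trunc3.csv') on the truncated file should then tell whether head preserved the BOM.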
Thanks. Could you also provide a Ruby script that reproduces this case?
Microsoft Windows [Version 10.0.19042.1586]
(c) Microsoft Corporation. All rights reserved.
C:\Users\Brad Krane\Documents\src\csv-quote>ruby -v
ruby 3.1.1p18 (2022-02-18 revision 53f5fc4236) [x64-mingw-ucrt]
C:\Users\Brad Krane\Documents\src\csv-quote>irb -v
irb 1.4.1 (2021-12-25)
C:\Users\Brad Krane\Documents\src\csv-quote>irb
irb(main):001:0> require 'csv'
=> true
irb(main):002:0>
irb(main):003:0> File.open('trunc3.CSV') { |f| CSV.parse(f, ) }
C:/Ruby31-x64/lib/ruby/3.1.0/csv/parser.rb:955:in `parse_quotable_robust': Illegal quoting in line 1. (CSV::MalformedCSVError)
from C:/Ruby31-x64/lib/ruby/3.1.0/csv/parser.rb:894:in `block in parse_quotable_loose'
from C:/Ruby31-x64/lib/ruby/3.1.0/csv/parser.rb:53:in `block in each_line'
from C:/Ruby31-x64/lib/ruby/3.1.0/csv/parser.rb:50:in `each_line'
from C:/Ruby31-x64/lib/ruby/3.1.0/csv/parser.rb:50:in `each_line'
from C:/Ruby31-x64/lib/ruby/3.1.0/csv/parser.rb:855:in `parse_quotable_loose'
from C:/Ruby31-x64/lib/ruby/3.1.0/csv/parser.rb:338:in `parse'
from C:/Ruby31-x64/lib/ruby/3.1.0/csv.rb:2365:in `each'
from C:/Ruby31-x64/lib/ruby/3.1.0/csv.rb:2365:in `each'
from C:/Ruby31-x64/lib/ruby/3.1.0/csv.rb:2400:in `to_a'
from C:/Ruby31-x64/lib/ruby/3.1.0/csv.rb:2400:in `read'
from C:/Ruby31-x64/lib/ruby/3.1.0/csv.rb:1578:in `parse'
from (irb):3:in `block in <top (required)>'
from (irb):3:in `open'
from (irb):3:in `<main>'
from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/irb-1.4.1/exe/irb:11:in `<top (required)>'
from C:/Ruby31-x64/bin/irb:33:in `load'
... 1 levels...
irb(main):004:0>
Thanks.
Could you try File.open('trunc3.CSV', encoding: "BOM|UTF-8") {|f| CSV.parse(f)}?
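For anyone landing here with the same error, the suggestion can be wrapped up as a small sketch. The helper name parse_csv_stripping_bom is hypothetical; the File.open call is the one from the comment above. The "BOM|" prefix asks Ruby to consume a leading byte order mark, if present, before handing the data to the parser, so the first quoted field starts with its quote character rather than with the BOM. As far as I know, 'UTF-8-BOM' is Notepad++'s label for such files and not an encoding name Ruby recognizes.

```ruby
require "csv"

# Hypothetical helper: open the file with BOM detection enabled and
# parse it. Files without a BOM are read as plain UTF-8.
def parse_csv_stripping_bom(path)
  File.open(path, encoding: "BOM|UTF-8") { |f| CSV.parse(f) }
end
```

Usage would be rows = parse_csv_stripping_bom('trunc3.CSV'), which returns an array of row arrays.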
Hi,
Thanks, it works as expected! I thought that I was probably making a mistake in the encoding. I tried encoding: 'UTF-8-BOM' and looked around to no avail. Thanks for the correct string, very much appreciated!
Cheers,
On Sun, Apr 17, 2022 at 8:13 PM Sutou Kouhei wrote:
Thanks.
Could you try File.open('trunc3.CSV', encoding: "BOM|UTF-8") {|f| CSV.parse(f)}?
-- Brad
I'm trying to track down an issue with a CSV file. If I load the raw file as I received it from a PayPal download, I get an "Illegal quoting in line 1" error (CSV::MalformedCSVError). However, if I truncate the large file and load only the first 50 lines or so, I do not get this error and the contents load as expected.
I've also been able to load that file in LibreOffice and then save it as a CSV. That strips out many unnecessary quotes around fields, and the resulting CSV loads as expected without error. Besides the removed quotes, another difference between the two files is that the original CSV is UTF-8-BOM while the copy is UTF-8 (according to Notepad++). I've tried encoding: 'UTF-8-BOM' but get the same error.
I would like to figure out what the issue is, but I need some help tracking down the error. Is it a problem with the file or with the library? How can I find exactly which quote or other character in the original CSV file is causing the error?
Thanks for the help!
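On the question of finding the offending character, one possible approach is to rescue the error, read the line number it reports, and print the raw start of that line with inspect so that invisible bytes become visible. This is a sketch under assumptions: diagnose_csv is a hypothetical helper, not part of the csv library, and the reported line number is matched against physical lines of the file, which can be off when quoted fields contain embedded newlines.

```ruby
require "csv"

# Hypothetical helper: returns nil when the file parses cleanly,
# otherwise a Hash describing where the parser gave up.
def diagnose_csv(path)
  File.open(path) { |f| CSV.parse(f) }
  nil
rescue CSV::MalformedCSVError => e
  # Re-read the file as raw bytes and pick the physical line that the
  # error reports.
  raw_line = File.open(path, "rb") do |f|
    f.each_line.with_index(1).find { |_line, number| number == e.line_number }&.first
  end
  {
    message: e.message,
    line_number: e.line_number,
    # inspect escapes non-printable bytes, e.g. a BOM shows as \xEF\xBB\xBF
    line_start: raw_line && raw_line.byteslice(0, 40).inspect
  }
end
```

If line_start begins with escaped bytes before the first quote character, the parser is most likely tripping over those bytes and not over the quoting itself.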