ruby / csv

CSV Reading and Writing
https://ruby.github.io/csv/
BSD 2-Clause "Simplified" License

ArgumentError: unknown encoding name - iso-8859-1|utf-8 #254

Closed rstueven closed 2 years ago

rstueven commented 2 years ago

Ruby 2.7.6 / Rails 6.0.5

This line worked without error in Ruby 2.4.6: csv_rows = CSV.parse(file_contents, headers: true, encoding: 'iso-8859-1|utf-8')

It fails with this error in Ruby 2.7.6: ArgumentError: unknown encoding name - iso-8859-1|utf-8

I have also tried ISO-8859-1|UTF-8, iso-8859-1:utf-8, and ISO-8859-1:UTF-8, and they give the same error.

Both encodings exist, but it doesn't find the combination:

[3] pry(main)> Encoding.find("iso-8859-1|utf-8")
ArgumentError: unknown encoding name - iso-8859-1|utf-8
from (pry):3:in `find'
[4] pry(main)> Encoding.find("iso-8859-1")
=> #<Encoding:ISO-8859-1>
[5] pry(main)> Encoding.find("utf-8")
=> #<Encoding:UTF-8>
[6] pry(main)> Encoding.find("iso-8859-1:utf-8")
ArgumentError: unknown encoding name - iso-8859-1:utf-8
from (pry):7:in `find'

This was supposedly fixed in Issue #23, but I'm still getting the error.

kou commented 2 years ago

In this case, the encoding: option isn't used in Ruby 2.4.6; file_contents.encoding is used instead.

And iso-8859-1|utf-8 is invalid syntax. You need to use : instead of |: iso-8859-1:utf-8

Anyway, it's a backward incompatibility that csv in Ruby 2.7 raises an exception for iso-8859-1:utf-8. So I've added support for transcoding input data even when the input data is a String, because csv already transcodes input data with iso-8859-1:utf-8 when the input data is an IO.
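
For illustration, here is a minimal sketch of the IO case described above, where the external:internal pair is handled by Ruby's IO layer rather than by Encoding.find. The sample bytes and the temporary file are hypothetical, and the example uses File.open with a mode string instead of relying on any particular csv version's handling of encoding:.

```ruby
require "csv"
require "tempfile"

# Hypothetical sample data stored as ISO-8859-1 (é = 0xE9, á = 0xE1).
latin1_bytes = "name,city\nJos\xE9,M\xE1laga\n".b

table = Tempfile.create(["latin1", ".csv"]) do |tmp|
  tmp.binmode
  tmp.write(latin1_bytes)
  tmp.close

  # "r:ISO-8859-1:UTF-8" means external encoding ISO-8859-1 and internal
  # encoding UTF-8, so Ruby transcodes the data as it is read from the file.
  File.open(tmp.path, "r:ISO-8859-1:UTF-8") do |io|
    CSV.new(io, headers: true).read
  end
end

puts table[0]["name"]
puts table[0]["name"].encoding
```

The pair is split on the colon by the IO layer, which is why Encoding.find, which expects a single encoding name, rejects the combined string.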

In your case, could you use the following code? This will work with csv in old Ruby too.

csv_rows = CSV.parse(file_contents.encode("UTF-8", "ISO-8859-1"), headers: true)
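
As a rough sketch of what that line does, the two-argument form of String#encode treats the bytes as the source encoding (second argument) and converts them to the destination encoding (first argument). The file_contents value below is hypothetical sample data, assumed to be ISO-8859-1 bytes that Ruby has tagged as UTF-8:

```ruby
require "csv"

# Hypothetical input: ISO-8859-1 bytes (é = 0xE9, á = 0xE1) in a string
# that Ruby tags as UTF-8, so it is not valid UTF-8 yet.
file_contents = "name,city\nJos\xE9,M\xE1laga\n"
puts file_contents.valid_encoding?

# Interpret the bytes as ISO-8859-1 and transcode them to UTF-8.
utf8_contents = file_contents.encode("UTF-8", "ISO-8859-1")
puts utf8_contents.valid_encoding?

csv_rows = CSV.parse(utf8_contents, headers: true)
puts csv_rows[0]["city"]
```

Because the transcoding happens before CSV.parse is called, this approach does not depend on how a given csv version handles the encoding: option for String input.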

rstueven commented 2 years ago

> In your case, could you use the following code? This will work with csv in old Ruby too.
>
> csv_rows = CSV.parse(file_contents.encode("UTF-8", "ISO-8859-1"), headers: true)

It looks like that worked. Thanks!