rails-sqlserver / activerecord-sqlserver-adapter

SQL Server Adapter For Rails
MIT License

invalid byte sequence in UTF-8 issue #201

Closed drobazko closed 12 years ago

drobazko commented 12 years ago

Sorry for writing here again.

But my question at http://stackoverflow.com/questions/10570875/rails-activerecord-invalid-byte-sequence-in-utf-8-issue is still unanswered. When I try to insert a zip file:

    zip_file = IO.read(zip_path)

    new_row = @claim.docs.create(
        :title => 'Архів заявки',
        :ext => 'zip',
        :size => zip_file.length,
        :receive_date => Time.now,
        :efile => zip_file
    )

I get "invalid byte sequence in UTF-8".

I have spent tons of time on this but haven't found an answer :(

metaskills commented 12 years ago

I am not answering questions in two places. Since I am sure this is a local issue for you, I will help when I can on StackOverflow.

drobazko commented 12 years ago

Thanks a lot for the help, MetaSkills!

First, I tried Document.create :efile => File.read('sometiny.zip') in the console but got the same error message.

Second, the eFile field has the image datatype.

Third, I am using TinyTDS, but under Windows.

Finally, data.force_encoding "ASCII-8BIT" helped, but not quite. When I write the zip file to the DB and then read it back from the DB to the file system, the two files differ from each other by 1 byte.

It's strange.
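One way to narrow down a one-byte difference like this is to find the first offset where the original file and the round-tripped file diverge. The following is only a sketch: `first_difference` is a hypothetical helper, not part of the adapter or TinyTDS, and it compares raw bytes regardless of how the strings are tagged. A possible cause worth checking on Windows is text-mode newline conversion, which `force_encoding` cannot undo because it only relabels the string.

```ruby
# Hypothetical helper: returns the offset of the first differing byte,
# or nil when both strings contain exactly the same bytes.
def first_difference(a, b)
  a = a.b # binary copies, so the comparison ignores encoding tags
  b = b.b
  limit = [a.bytesize, b.bytesize].min
  index = (0...limit).find { |i| a.getbyte(i) != b.getbyte(i) }
  return index if index
  # One string is a prefix of the other: they diverge where the shorter ends.
  a.bytesize == b.bytesize ? nil : limit
end

# Example: a CRLF that was collapsed to LF shows up at offset 1.
first_difference("a\r\nb", "a\nb") # => 1
```

In practice the two arguments would be the bytes of the original zip and of the file written back out from the database, both read in binary mode.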

metaskills commented 12 years ago

1) Are you using the Windows gem that I distribute or did you install and link to FreeTDS on your own? If you are using the Windows gem, then you should be good with iconv compiled in correctly.

2) The image type is not a good choice. Technically, both it and text are deprecated by SQL Server, and I have only tested putting binary image data into [image] columns. Are you able to test varbinary(max)? Does your version of SQL Server have it?

3) Are you using Ruby 1.8 or 1.9? If 1.9, have you read up on Ruby's encodings, such as the default internal and external encodings? This link is a good article set: http://blog.grayproductions.net/articles/understanding_m17n The TL;DR is that, unlike Unix, where I have default ENV variables for encoding, you may not have those in the Windows world and may need to do a bit of legwork to match.
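To see what point 3 means in practice, it can help to print the defaults Ruby picked up and to observe that a plain `File.read` tags its result with the default external encoding. This is a rough sketch for Ruby 1.9 or later, not a recommended configuration: `with_default_external` is a hypothetical helper, and the two sample bytes are made up for illustration.

```ruby
require 'tempfile'

# Show the defaults Ruby derived from the environment.
puts "default_external: #{Encoding.default_external}"
puts "default_internal: #{Encoding.default_internal.inspect}"

# Hypothetical helper: temporarily switches the default external encoding
# and restores the previous value afterwards.
def with_default_external(encoding)
  previous = Encoding.default_external
  Encoding.default_external = encoding
  yield
ensure
  Encoding.default_external = previous
end

Tempfile.create('demo') do |f|
  f.binmode
  f.write("\xF5\xFE".b) # two bytes that are not valid UTF-8
  f.flush

  with_default_external(Encoding::UTF_8) do
    text = File.read(f.path)  # text mode: tagged with default_external
    puts text.encoding        # UTF-8
    puts text.valid_encoding? # false, the bytes are not valid UTF-8
  end
end
```

A string in this state is the kind of value that can raise "invalid byte sequence in UTF-8" once something tries to process it as text.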

metaskills commented 12 years ago

BTW, here is the test I have in place for TinyTDS for image/binary data into [image] columns. https://github.com/rails-sqlserver/tiny_tds/blob/master/test/schema_test.rb#L130

metaskills commented 12 years ago

Here is some console output using that 1px.gif from the TinyTDS project's test case zipped up on my Mac, using 1.9.3 console.

>> Encoding.default_internal
=> nil
>> Encoding.default_external
=> #<Encoding:UTF-8>
>> File.open('/Users/kencollins/Desktop/1px.gif.zip', 'rb:ASCII-8BIT') { |f| f.read.encoding }
=> #<Encoding:ASCII-8BIT>
>> File.open('/Users/kencollins/Desktop/1px.gif.zip', 'rb:ASCII-8BIT') { |f| f.read }
=> "PK\x03\x04\x14\x00\b\x00\b\x00\xCE\xB0\\@\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\a\x00\x10\x001px.gifUX\f\x00]\xCD\xAFO4\x96MO\xF5\x01\x14\x00s\xF7t\xB3\xB0Ldd`d\x98\xC8\xC0\xF0\x1F\f\xFE12100(\xFEda\x11a\xF8\xCF\xA0\x03d3\x80\xE4\x19\x98\x98\\\x18\x19\xAC\x01PK\a\b\xA8\xD4\xC5y,\x00\x00\x001\x00\x00\x00PK\x01\x02\x15\x03\x14\x00\b\x00\b\x00\xCE\xB0\\@\xA8\xD4\xC5y,\x00\x00\x001\x00\x00\x00\a\x00\f\x00\x00\x00\x00\x00\x00\x00\x00@\xED\x81\x00\x00\x00\x001px.gifUX\b\x00]\xCD\xAFO4\x96MOPK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x00A\x00\x00\x00q\x00\x00\x00\x00\x00"
>> File.open('/Users/kencollins/Desktop/1px.gif.zip') { |f| f.read }
=> "PK\u0003\u0004\u0014\u0000\b\u0000\b\u0000ΰ\\@\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\a\u0000\u0010\u00001px.gifUX\f\u0000]ͯO4\x96MO\xF5\u0001\u0014\u0000s\xF7t\xB3\xB0Ldd`d\x98\xC8\xC0\xF0\u001F\f\xFE12100(\xFEda\u0011a\xF8Ϡ\u0003d3\x80\xE4\u0019\x98\x98\\\u0018\u0019\xAC\u0001PK\a\b\xA8\xD4\xC5y,\u0000\u0000\u00001\u0000\u0000\u0000PK\u0001\u0002\u0015\u0003\u0014\u0000\b\u0000\b\u0000ΰ\\@\xA8\xD4\xC5y,\u0000\u0000\u00001\u0000\u0000\u0000\a\u0000\f\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000@\xED\x81\u0000\u0000\u0000\u00001px.gifUX\b\u0000]ͯO4\x96MOPK\u0005\u0006\u0000\u0000\u0000\u0000\u0001\u0000\u0001\u0000A\u0000\u0000\u0000q\u0000\u0000\u0000\u0000\u0000"
>> File.open('/Users/kencollins/Desktop/1px.gif.zip') { |f| f.read.encoding }
=> #<Encoding:UTF-8>

drobazko commented 12 years ago

Thank you, Ken, for paying attention to my problems. I have changed the column from image to varbinary(max) and passed the 'rb:ASCII-8BIT' mode to the open function... Everything works now :)
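For reference, the fix described here can be sketched roughly as follows. It assumes Ruby 1.9.3 or later; `read_zip_bytes` and `SAMPLE_ZIP_BYTES` are hypothetical names used only for illustration. `File.binread` is used as a shorthand for opening with 'rb', and `bytesize` is used for the size because `length` counts characters when a string is tagged as UTF-8.

```ruby
require 'tempfile'

# Hypothetical helper (not part of the adapter): reads a file as raw bytes.
# File.binread opens in binary mode, so the result is tagged ASCII-8BIT
# and no newline conversion is applied.
def read_zip_bytes(path)
  File.binread(path)
end

# Made-up payload standing in for a real zip file.
SAMPLE_ZIP_BYTES = "PK\x03\x04\xCE\xB0\r\n\xF5\xFE".b

Tempfile.create(['sample', '.zip']) do |f|
  f.binmode
  f.write(SAMPLE_ZIP_BYTES)
  f.flush

  data = read_zip_bytes(f.path)
  puts data.encoding            # ASCII-8BIT
  puts data == SAMPLE_ZIP_BYTES # true, every byte survived the read

  # With the model from the question, the call would then look roughly like:
  #   @claim.docs.create(:ext => 'zip', :size => data.bytesize, :efile => data)
end
```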

metaskills commented 12 years ago

NP. Please update the Stack Overflow question too and throw some points my way.