toy / dump

Rails app rake and capistrano tasks to create and restore dumps of database and assets
MIT License

Restore data on PostgreSQL returns : PG::Error: incomplete multibyte character #21

Open endersonmaia opened 8 years ago

endersonmaia commented 8 years ago

I'm using this to migrate a Rails database from SQL Server 2008 to PostgreSQL 9.4.

The error PG::Error: incomplete multibyte character is a Unicode encoding error.

All tables and fields restore fine, with no errors and no broken character sets, except for a Postgres bytea field that holds gzipped content.

endersonmaia commented 8 years ago

Using this simple script

#!/usr/bin/env ruby
require 'sequel'

DB_src = Sequel.connect('tinytds://HOST/redmine')
DB_dst = Sequel.connect('postgres://HOST/redmine')

src_wcv = DB_src[:wiki_content_versions]
dst_wcv = DB_dst[:wiki_content_versions]

src_wcv.each { |r| dst_wcv << r }

I get this error

PG::CharacterNotInRepertoire: ERROR:  invalid byte sequence for encoding "UTF8": 0xda 0xcb (Sequel::DatabaseError)

Since the field is of type bytea, I think it shouldn't complain about a UTF8 byte sequence.

Fetching the data in irb gives this output:

irb(main):016:0> wcv.first[:data]
=> "x\xDA\xCB0\xD4S\b\xCF\xCC\xCE\x04\x00\n)\x02|"
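For reference, here is a minimal standard-library sketch that inspects the bytes shown in the irb output above. It assumes the string was copied exactly as printed; the variable names are illustrative only. It checks whether the bytes are valid UTF-8 and tries to inflate them, since the leading 0x78 0xDA pair looks like a zlib stream header.

```ruby
require 'zlib'

# Bytes copied from the irb output above; .b marks the literal as binary (ASCII-8BIT).
data = "x\xDA\xCB0\xD4S\b\xCF\xCC\xCE\x04\x00\n)\x02|".b

# Reinterpreted as UTF-8 the string is invalid: 0xDA opens a two-byte
# sequence, but the following 0xCB is not a continuation byte.
utf8_ok = data.dup.force_encoding(Encoding::UTF_8).valid_encoding?

# 0x78 0xDA is a zlib stream header, so the payload can be inflated.
inflated = Zlib::Inflate.inflate(data)

puts "valid UTF-8: #{utf8_ok}"
puts "inflated:    #{inflated.inspect}"
```

The 0xda 0xcb pair in the PostgreSQL error matches the second and third bytes of this value, which suggests the string is being sent as text rather than as binary data. One option that may help, untested against this setup, is wrapping the value with Sequel.blob before inserting it so that Sequel treats it as a bytea literal.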

I know that Sequel and dump are different projects, but I think the errors are related, right?