seanharr11 / etlalchemy

Extract, Transform, Load: Any SQL Database in 4 lines of Code.
MIT License
555 stars 83 forks source link

Hi, when try migrate mysql to postgresql i have this error #9

Closed CARocha closed 7 years ago

CARocha commented 7 years ago

psycopg2.DataError: invalid byte sequence for encoding "UTF8": 0xf3 0x6e 0x20 0x50

my db mysql have a lot spanish words like ratón,colchón etc many words with accented(google translate this words :))

any help much appreciation

seanharr11 commented 7 years ago

@CARocha, is looks like your MySQL data is encoded in latin1. Are you running MySQL on a Windows machine, or on Linux?

This is likely an issue where PostgreSQL's COPY FROM command expects UTF-8 encoded characters, and MySQL is sending it Latin-1 encoded characters through ETLAlchemy.

You can change the encoding of the MySQL database using this StackOverflow answer, but as of now, ETLAlchemy expects utf8 encoded text.

CARocha commented 7 years ago

Hi, sorry for reply later, yes my MySQL running on Linux machine, and here is my table

captura de pantalla 2017-02-08 a las 10 03 28 p m

i see is utf8 all tables :/ what's wrong? you see @seanharr11

thank for you help :)

seanharr11 commented 7 years ago

@CARocha - are you able to send me that table? Or can you send me a few rows from the table in a mysqldump file?

CARocha commented 7 years ago

Hi @seanharr11 here is the dump table file https://mega.nz/#!h5d1jQzJ this es key !gzW1fpDyS-DyT6cHSmypML-E3N7H8C7yKdkiJUHExoQ

thank for you help

seanharr11 commented 7 years ago

@CARocha the hexadecimal "codepoints" ( 0xf3 0x6e 0x20 0x50 ) decode (via utf8) to ón P. This substring is found in many rows of the table, and I am not sure why this would be causing psycopg2 to choke.

Are you able to isolate the exact row that causes ETLAlchemy to fail? (There are 287 rows with this substring in your table).

seanharr11 commented 7 years ago

When trying to reproduce, I got a different error:

psycopg2.IntegrityError: null value in column "post_date_gmt" violates not-null constraint

DETAIL: Failing row contains (163947, 0, 2015-07-02 13:18:57, null, eyJ0aXRsZSI6Ik1pbmlhdHVyYXMgTmV4dEdFTiBCXHUwMGUxc2ljYXMiLCJlbnRp..., Miniaturas NextGEN Básicas, , draft, open, open, , , , , 2015-07-02 13:18:57, null, eyJ0aXRsZSI6Ik1pbmlhdHVyYXMgTmV4dEdFTiBCXHUwMGUxc2ljYXMiLCJlbnRp..., 0, http://www.tn8.tv/?post_type=display_type&p=163947, 0, display_type, , 0).

CONTEXT: COPY wp_xkim_posts, line 233195: "163947,0,2015-07-02 13:18:57,NULL,"eyJ0aXRsZSI6Ik1pbmlhdHVyYXMgTmV4dEdFTiBCXHUwMGUxc2ljYXMiLCJlbnRpd..."

CARocha commented 7 years ago

@seanharr11 yea i have many error launch pyscopg2 to migrate the database, thank for you all helps i think is imposible migrate this database to postgresql in this ways thanks