ricardochimal / taps

simple database import/export app
MIT License

invalid byte sequence for encoding "UTF8" #39

Open fedegl opened 14 years ago

fedegl commented 14 years ago

I am trying to push (export) a table to a Heroku app. Both databases are set to utf8, but I am still getting the invalid byte sequence error. I have tried making the encoding explicit with heroku db:push mysql://username:passw...@localhost/db_name?encoding=utf8, but it still doesn't work.

The error is thrown on a record with accents. The string is "Partido Acción Nacional"

!!! Caught Server Exception
HTTP CODE: 500
Taps Server Error: PGError: ERROR: invalid byte sequence for encoding "UTF8": 0xf36e204e
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
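The bytes quoted in the error are consistent with the string having been sent as Latin-1 rather than UTF-8: in Latin-1, "ó" is the single byte 0xf3, and the next three characters of "Acción Nacional" are "n", a space, and "N". Here is a minimal Python sketch to check this. The variable names are illustrative and not part of taps.

```python
# The string reported as failing in the issue ("\u00f3" is "ó").
text = "Partido Acci\u00f3n Nacional"

# Encode it as Latin-1 (ISO-8859-1), where "ó" is the single byte 0xf3.
latin1_bytes = text.encode("latin-1")
start = latin1_bytes.index(b"\xf3")
reported = latin1_bytes[start:start + 4].hex()
print(reported)  # f36e204e, the sequence quoted in the PGError

# The same bytes are not valid UTF-8: 0xf3 announces a four-byte
# sequence, but the bytes that follow are not continuation bytes.
try:
    latin1_bytes.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False
print(is_valid_utf8)  # False

# Encoded as UTF-8, "ó" becomes the two bytes 0xc3 0xb3 instead.
print("\u00f3".encode("utf-8").hex())  # c3b3
```

This suggests that somewhere between MySQL and the taps server the text was transferred as Latin-1 while being labelled as UTF-8. It does not show which component is responsible.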

ricardochimal commented 14 years ago

If possible, can you send me a sample mysqldump of your database so I can test against it?

fedegl commented 14 years ago

Dump of table parties

------------------------------------------------------------

LOCK TABLES parties WRITE;
/*!40000 ALTER TABLE parties DISABLE KEYS */;
INSERT INTO parties (id,name,abbr) VALUES
(1,'Partido Revolucionario Institucional','PRI'),
(2,'Partido Acción Nacional','PAN'),
(3,'Partido de la Revolución Democrática','PRD'),
(4,'Partido del Trabajo','PT'),
(5,'Partido Verde Ecologista de México','PVEM'),
(6,'Partido Convergencia','Convergencia'),
(7,'Partido Nueva Alianza','Nueva Alianza');
/*!40000 ALTER TABLE parties ENABLE KEYS */;
UNLOCK TABLES;

Dump of table states

------------------------------------------------------------

LOCK TABLES states WRITE;
/*!40000 ALTER TABLE states DISABLE KEYS */;
INSERT INTO states (id,name,abbr,short2,short3,region_id,subdomain) VALUES
(1,'Aguascalientes','Ags','AG','AGU',2,NULL),
(2,'Baja California','B.C.','BC','BCN',1,NULL),
(3,'Baja California Sur','B.C.S.','BS','BCS',1,NULL),
(4,'Campeche','Camp','CM','CAM',3,NULL),
(5,'Coahuila','Coah','CO','COA',2,NULL),
(6,'Colima','Col','CL','COL',5,NULL),
(7,'Chiapas','Chis','CS','CHP',3,NULL),
(8,'Chihuahua','Chih','CH','CHH',1,NULL),
(9,'Distrito Federal','D.F.','DF','DIF',4,NULL),
(10,'Durango','Dgo','DG','DUR',1,NULL),
(11,'Guanajuato','Gto','GT','GUA',2,NULL),
(12,'Guerrero','Gro','GR','GRO',4,NULL),
(13,'Hidalgo','Hgo','HG','HID',5,NULL),
(14,'Jalisco','Jal','JA','JAL',1,NULL),
(15,'México','Mex','ME','MEX',5,NULL),
(16,'Michoacán','Mich','MI','MIC',5,NULL),
(17,'Morelos','Mor','MO','MOR',4,NULL),
(18,'Nayarit','Nay','NA','NAY',1,NULL),
(19,'Nuevo León','N.L.','NL','NLE',2,NULL),
(20,'Oaxaca','Oax','OA','OAX',3,NULL),
(21,'Puebla','Pue','PB','PUE',4,NULL),
(22,'Querétaro','Qro','QT','QUE',2,'queretaro'),
(23,'Quintana Roo','Q. Roo','QR','ROO',3,NULL),
(24,'San Luís Potosí','S.L.P.','SL','SLP',2,NULL),
(25,'Sinaloa','Sin','SI','SIN',1,NULL),
(26,'Sonora','Son','SO','SON',1,NULL),
(27,'Tabasco','Tab','TB','TAB',3,NULL),
(28,'Tamaulipas','Tamps','TM','TAM',2,NULL),
(29,'Tlaxcala','Tlax','TL','TLA',4,NULL),
(30,'Veracruz','Ver','VE','VER',3,NULL),
(31,'Yucatán','Yuc','YU','YUC',3,NULL),
(32,'Zacatecas','Zac','ZA','ZAC',2,NULL);
/*!40000 ALTER TABLE states ENABLE KEYS */;
UNLOCK TABLES;

ricardochimal commented 14 years ago

Can you link to a file that I can download, so the character encoding is preserved? Thanks.

fedegl commented 14 years ago

http://docs.google.com/leaf?id=0B7KxNJE86T_2ZTJhYTIwZjktZjczMi00NTNkLTkzNjItYWM2ZTUyZDNjY2Y4&hl=en

ricardochimal commented 14 years ago

thanks, I'll take a look at it.

ricardochimal commented 14 years ago

Can you try updating to taps 0.3.9? It should pull in sequel 0.12.1, which includes a fix for what I think is the problem.

Let me know if that fixes it.

alexreisner commented 13 years ago

For anyone else that is having this problem transferring from a MySQL database with non-ASCII characters, make sure your MySQL client is set for UTF8 by default. That is, in your my.cnf, in the [client] section you should have:

default-character-set = utf8

I was about to confirm the above bug when I realized I had this problem.
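To check that setting without opening the file by hand, here is a rough Python sketch using the standard-library configparser. The sample file contents and the helper name client_charset are assumptions made for illustration. Real my.cnf files can contain directives such as !includedir, which this simple parser may not handle.

```python
import configparser

# Hypothetical my.cnf contents, used only for illustration.
SAMPLE_MY_CNF = """
[client]
default-character-set = utf8

[mysqld]
skip-networking
"""


def client_charset(cnf_text):
    """Return the [client] default-character-set value, or None if unset."""
    parser = configparser.ConfigParser(
        allow_no_value=True,  # my.cnf allows bare flags such as skip-networking
        strict=False,         # tolerate repeated sections or options
        interpolation=None,   # treat "%" in values literally
    )
    parser.read_string(cnf_text)
    if not parser.has_section("client"):
        return None
    return parser.get("client", "default-character-set", fallback=None)


print(client_charset(SAMPLE_MY_CNF))  # utf8
```

To inspect a real file, read its text and pass it to client_charset. A result of None means the [client] section does not set a default character set.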

jerome commented 13 years ago

Even with a true utf8 configuration for both mysqld and the client, I get the same error when pushing a huge database to Heroku. Manually deleting the offending rows (found by reading the last_fetched ids in push_xxx.dat) isn't a solution, since there are too many of them. Thanks for your help.
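When there are too many bad rows to find by hand, one option is to scan a dump for byte sequences that are not valid UTF-8. The sketch below is an assumption about how one might do that in Python. It works on raw lines of a dump file, not on taps' own push_xxx.dat format, and find_invalid_utf8 is a hypothetical helper name.

```python
def find_invalid_utf8(lines):
    """Yield (line_number, byte_offset, context) for lines that are not valid UTF-8."""
    for number, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first undecodable byte.
            yield number, exc.start, raw[exc.start:exc.start + 4]


# Illustrative data: the second row is deliberately encoded as Latin-1.
sample = [
    "(1,'Partido Revolucionario Institucional','PRI')".encode("utf-8"),
    "(2,'Partido Acci\u00f3n Nacional','PAN')".encode("latin-1"),
    "(5,'Partido Verde Ecologista de M\u00e9xico','PVEM')".encode("utf-8"),
]

hits = list(find_invalid_utf8(sample))
for hit in hits:
    print(hit)
```

For a real dump, open the file in binary mode (for example open(path, "rb")) and pass the file object as lines. Each hit gives the line number and the first bytes that failed to decode.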

jsmpereira commented 13 years ago

I had this problem recently and using

heroku db:push mysql://username:pass@localhost/db_name?encoding=utf8

worked for me.