dimitri / pgloader

Migrate to PostgreSQL in a single command!
http://pgloader.io
Other
5.45k stars 548 forks source link

Fatal error in "pgloader": The value 123 is not of the expected type CHARACTER #106

Closed keithgabryelski closed 10 years ago

keithgabryelski commented 10 years ago

what does this error mean. here is context:

2014-08-14T19:33:06.171120Z DATA < #("5198414" "24044" "2012-06-08" "listing" "2" "0" "42493")
Fatal error in "pgloader" : 2014-08-14T19:33:06.171236Z DATA < #("5198415" "12765" "2012-06-08" "listing" "2" "0" "32319")
The value 123 is not of the expected type CHARACTER.
2014-08-14T19:33:06.171347Z DATA < #("5198416" "208" "2012-06-08" "listing" "2" "0" "12246")
2014-08-14T19:33:06.171458Z DATA < #("5198417" "19241" "2012-06-08" "listing" "2" "0" "32351")

the table it says it is accessing does not seem to be the trouble -- the source MYSQL table:

mysql> describe analytics_stats;
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| id         | int(11)      | NO   | PRI | NULL    | auto_increment |
| user_id    | int(11)      | NO   | MUL | NULL    |                |
| date       | date         | NO   | MUL | NULL    |                |
| action     | varchar(255) | NO   | MUL | NULL    |                |
| hits       | int(11)      | NO   |     | 0       |                |
| unique     | int(11)      | NO   |     | 0       |                |
| listing_id | int(11)      | YES  | MUL | NULL    |                |
+------------+--------------+------+-----+---------+----------------+
7 rows in set (0.02 sec)

mysql> select * from analytics_stats where id = 5198415;
+---------+---------+------------+---------+------+--------+------------+
| id      | user_id | date       | action  | hits | unique | listing_id |
+---------+---------+------------+---------+------+--------+------------+
| 5198415 |   12765 | 2012-06-08 | listing |    2 |      0 |      32319 |
+---------+---------+------------+---------+------+--------+------------+
1 row in set (0.08 sec)

the destination POSTGRESQL machine has the row in question:

keithpgdevelopment=> select * from analytics_stats where id = 5198415;
   id    | user_id |    date    | action  | hits | unique | listing_id 
---------+---------+------------+---------+------+--------+------------
 5198415 |   12765 | 2012-06-08 | listing |    2 |      0 |      32319
(1 row)

here is the end of the output:

 (7F9D04C861F0) : 0 (PRINT-CALL-HISTORY :CONTEXT NIL :PROCESS NIL :ORIGIN NIL :DETAILED-P NIL :COUNT 1152921504606846975 :START-FRAME-NUMBER 0 :STREAM #<SYNONYM-ST\
REAM to *TERMINAL-IO* #x3020010B0C8D> :PRINT-LEVEL 2 :PRINT-LENGTH 5 :SHOW-INTERNAL-FRAMES NIL :FORMAT :TRADITIONAL) 853                                            
 (7F9D04C86358) : 1 (PRINT-BACKTRACE-TO-STREAM #<SYNONYM-STREAM to *TERMINAL-IO* #x3020010B0C8D>) 85                                                                
 (7F9D04C86388) : 2 (PRINT-BACKTRACE #<LPARALLEL.KERNEL:TASK-KILLED-ERROR #x302696C113ED> :OUTPUT #<SYNONYM-STREAM to *TERMINAL-IO* #x3020010B0C8D> :IF-EXISTS :APP\
END :VERBOSE T) 821                                                                                                                                                 
 (7F9D04C86420) : 3 (SIGNAL #<LPARALLEL.KERNEL:TASK-KILLED-ERROR #x302696C113ED>) 981                                                                               
 (7F9D04C86478) : 4 (%ERROR #<LPARALLEL.KERNEL:TASK-KILLED-ERROR #x302696C113ED> NIL 17539045854356) 117                                                            
 (7F9D04C864A0) : 5 (UNWRAP-RESULT #S(LPARALLEL.KERNEL::WRAPPED-ERROR :VALUE #<LPARALLEL.KERNEL:TASK-KILLED-ERROR #x302696C113ED>)) 77                              
 (7F9D04C864B8) : 6 (FUNCALL #'#<#<STANDARD-METHOD PGLOADER.SOURCES:COPY-FROM (PGLOADER.MYSQL:COPY-MYSQL)>> #<PGLOADER.MYSQL:COPY-MYSQL #x302696B3DCFD> :KERNEL #<K\
ERNEL :NAME "lparallel" :WORKER-COUNT 2 :USE-CALLER NIL :ALIVE T #x3020010C11BD> :TRUNCATE NIL) 869                                                                 
 (7F9D04C86570) : 7 (%%CHECK-KEYWORDS #(1 #(:KERNEL :TRUNCATE) #<METHOD-FUNCTION PGLOADER.SOURCES:COPY-FROM #>) 17539045854393) 229                                 
 (7F9D04C865F8) : 8 (FUNCALL #'#<#<STANDARD-METHOD PGLOADER.SOURCES:COPY-DATABASE (PGLOADER.MYSQL:COPY-MYSQL)>> #<PGLOADER.MYSQL:COPY-MYSQL #x3020010C1C4D> :STATE-\
BEFORE #S(PGLOADER.UTILS::PGSTATE :TABLES #<HASH-TABLE :TEST EQUAL size 3/60 #x3020010BBCBD> :TABNAMES ("create, drop" "fetch meta data" "before load") ...) :STATE\
-AFTER #S(PGLOADER.UTILS::PGSTATE :TABLES #<HASH-TABLE :TEST EQUAL size 0/60 #x3020010BAF1D> :TABNAMES NIL ...) :STATE-INDEXES #S(PGLOADER.UTILS::PGSTATE :TABLES #\
<HASH-TABLE :TEST EQUAL size 1/60 #x3020010BB54D> :TABNAMES ("Create Indexes") ...) :TRUNCATE NIL :DATA-ONLY NIL :SCHEMA-ONLY NIL :CREATE-TABLES T :INCLUDE-DROP T \
:CREATE-INDEXES T :RESET-SEQUENCES T :FOREIGN-KEYS T :IDENTIFIER-CASE :DOWNCASE :ONLY-TABLES NIL :INCLUDING NIL :EXCLUDING NIL :DECODING-AS NIL :MATERIALIZE-VIEWS \
NIL) 2517                                                                                                                                                           
 (7F9D04C867E8) : 9 (%%CHECK-KEYWORDS #(1 #(:MATERIALIZE-VIEWS :DECODING-AS :EXCLUDING :INCLUDING :IDENTIFIER-CASE ...) #<METHOD-FUNCTION PGLOADER.SOURCES:COPY-DAT\
ABASE #>) 17539045854472) 229                                                                                                                                       
 (7F9D04C86920) : 10 (FUNCALL #'#<Anonymous Function #x3020010BC2BF>) 2477                                                                                          
 (7F9D04C86B00) : 11 (RUN-COMMANDS #P"/home/ubuntu/mysql-psql.pgloader" :START-LOGGER NIL :LOG-FILENAME #P"/tmp/pgloader/pgloader.log" :LOG-MIN-MESSAGES :DATA :CLI\
ENT-MIN-MESSAGES :DATA) 2805                                                                                                                                        
 (7F9D04C86BC0) : 12 (MAIN ("sandbox/pgloader-3.0.99/build/bin/pgloader" "-v" "-d" "mysql-psql.pgloader")) 3141                                                     
 (7F9D04C86D70) : 13 (FUNCALL #'#<Anonymous Function #x300002C000DF>) 389                                                                                           
 (7F9D04C86DB0) : 14 (FUNCALL #'#<(:INTERNAL CCL::%SAVE-APPLICATION-INTERNAL)>) 85                                                                                  
 (7F9D04C86DC8) : 15 (FUNCALL #'#<(:INTERNAL CCL::MAKE-MCL-LISTENER-PROCESS)>) 645                                                                                  
 (7F9D04C86E60) : 16 (RUN-PROCESS-INITIAL-FORM #<PROCESS toplevel(2) [Active] #x3020010AFB7D> (#<COMPILED-LEXICAL-CLOSURE # #x3020010AF6AF>)) 709                   
 (7F9D04C86EF0) : 17 (FUNCALL #'#<(:INTERNAL (CCL::%PROCESS-PRESET-INTERNAL (PROCESS)))> #<PROCESS toplevel(2) [Active] #x3020010AFB7D> (#<COMPILED-LEXICAL-CLOSURE\
 # #x3020010AF6AF>)) 573                                                                                                                                            
 (7F9D04C86F98) : 18 (FUNCALL #'#<(:INTERNAL CCL::THREAD-MAKE-STARTUP-FUNCTION)>) 277                                                                               
> Debug: The task was killed.                                                                                                                                       
> While executing: PGLOADER::MAIN, in process toplevel(2).                                                                                                          
> Type :POP to abort, :R for a list of available restarts.                                                                                                          
> Type :? for other options.  

the loader file:

load database
     from      mysql://REDACTED
     into postgresql://REDACTED

 WITH include drop, create tables, no truncate,
      create indexes, reset sequences, foreign keys, batch rows = 100000, batch size = 500MB, batch concurrency = 10

  SET maintenance_work_mem to '250MB', work_mem to '250MB', search_path to 'public'

 CAST type datetime to timestamp
                drop default drop not null using zero-dates-to-null,
      type date drop not null drop default using zero-dates-to-null

 DECODING TABLE NAMES MATCHING ~/.*/ AS utf8

 BEFORE LOAD DO
 $$ create schema if not exists public; $$;

I'm currently trying to figure out what table it is actually complaining about. Any ideas how to get the actual problem table in the error output?

And is it really complaining that 123 (open brace) is not a character?

dimitri commented 10 years ago

What happens if you try with an SBCL compiler pgloader over only that table, and over the whole thing?

keithgabryelski commented 10 years ago

this small example fails:

mysql> show create table x;
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                           |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| x     | CREATE TABLE `x` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `html_body` text COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=687 DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

mysql> select * from x;
+-----+-----------+
| id  | html_body |
+-----+-----------+
| 685 | {         |
+-----+-----------+
1 row in set (0.00 sec)

mysql> 

with a load file of:

load database
     from      mysql://REDACTED
     into postgresql://REDACTED

WITH include drop, create tables, no truncate,
      create indexes, reset sequences, foreign keys, batch rows = 1, batch size = 1MB, batch concurrency = 10

 SET maintenance_work_mem to '250MB', work_mem to '250MB', search_path to 'public'

 CAST type datetime to timestamp
                drop default drop not null using zero-dates-to-null,
      type date drop not null drop default using zero-dates-to-null

 INCLUDING ONLY TABLE NAMES MATCHING 'x'

 BEFORE LOAD DO
 $$ create schema if not exists public; $$;
keithgabryelski commented 10 years ago

with SBC I see something similar:

dbtrans$ pgloader -v -d findit.pgloader
STYLE-WARNING: Undefined alien: "SSLv2_client_method"
sb-impl::*default-external-format* :UTF-8
tmpdir: #P"/tmp/pgloader/"
2014-08-15T22:32:43.038000+04:00 LOG Starting pgloader, log system is ready.
2014-08-15T22:32:43.059000+04:00 INFO Starting monitor
2014-08-15T22:32:43.071000+04:00 LOG Main logs in '/tmp/pgloader/pgloader.log'
2014-08-15T22:32:43.071000+04:00 LOG Data errors in '/tmp/pgloader/'
2014-08-15T22:32:43.071000+04:00 LOG Parsing commands from file #P"/home/ubuntu/findit.pgloader"
2014-08-15T22:32:43.271000+04:00 DEBUG CONNECT
2014-08-15T22:32:43.272000+04:00 DEBUG SET maintenance_work_mem TO '250MB'
2014-08-15T22:32:43.272000+04:00 DEBUG SET work_mem TO '250MB'
2014-08-15T22:32:43.272000+04:00 DEBUG SET search_path TO 'public'
2014-08-15T22:32:43.272000+04:00 DEBUG BEGIN
2014-08-15T22:32:43.272000+04:00 DEBUG CONNECT
2014-08-15T22:32:43.272000+04:00 DEBUG SET maintenance_work_mem TO '250MB'
2014-08-15T22:32:43.272000+04:00 DEBUG SET work_mem TO '250MB'
2014-08-15T22:32:43.272000+04:00 DEBUG SET search_path TO 'public'
2014-08-15T22:32:43.272000+04:00 DEBUG BEGIN
2014-08-15T22:32:43.272000+04:00 NOTICE create schema if not exists public;

debugger invoked on a TYPE-ERROR in thread #<THREAD "lparallel" RUNNING {10079DE233}>: The value #(123) is not of type STRING.
2014-08-15T22:33:05.085000+04:00 NOTICE DROP then CREATE TABLES
2014-08-15T22:33:05.085000+04:00 DEBUG drop then create 1 tables with 1 indexes.
2014-08-15T22:33:05.085000+04:00 DEBUG CONNECT
2014-08-15T22:33:05.085000+04:00 DEBUG SET maintenance_work_mem TO '250MB'
2014-08-15T22:33:05.085000+04:00 DEBUG SET work_mem TO '250MB'
2014-08-15T22:33:05.085000+04:00 DEBUG SET search_path TO 'public'
2014-08-15T22:33:05.085000+04:00 DEBUG BEGIN
2014-08-15T22:33:05.085000+04:00 INFO DROP TABLE IF EXISTS x;
2014-08-15T22:33:05.086000+04:00 WARNING Postgres warning: table "x" does not exist, skipping
2014-08-15T22:33:05.086000+04:00 INFO CREATE TABLE x 
(
  id                     bigserial not null,
  html_body              text not null
);
2014-08-15T22:33:05.086000+04:00 DEBUG TARGET: x
2014-08-15T22:33:05.086000+04:00 NOTICE COPY x
2014-08-15T22:33:05.086000+04:00 DEBUG pgsql:copy-from-queue: x NIL
2014-08-15T22:33:05.086000+04:00 DEBUG CONNECT (REDACTED)
2014-08-15T22:33:05.086000+04:00 DEBUG SET maintenance_work_mem TO '250MB'
2014-08-15T22:33:05.086000+04:00 DEBUG SET work_mem TO '250MB'
2014-08-15T22:33:05.086000+04:00 DEBUG SET search_path TO 'public'
2014-08-15T22:33:05.086000+04:00 DATA < #("685" #(123))

and it hangs (which I've seen a lot with this build -- this is why I switched to CCL):

dbtrans$ sudo strace -p 16857
Process 16857 attached - interrupt to quit
futex(0x100a7893c8, FUTEX_WAIT_PRIVATE, 110112691, NULL^C <unfinished ...>
Process 16857 detached

grabbed this build from here:

debian wheezy, amd64    pgloader_3.0.99-1_amd64.deb
keithgabryelski commented 10 years ago

btw, hitting ^C gives me this -- no idea if this is interesting:

2014-08-15T22:33:05.086000+04:00 DATA < #("685" #(123))
^CAn unhandled error condition has been signalled:
   Interactive interrupt at #x7FFFF71D6957.

Date/time: 2014-08-15-18:39An unhandled error condition has been signalled:
                              Interactive interrupt at #x7FFFF71D6957.

Backtrace for: #<SB-THREAD:THREAD "main thread" RUNNING {1006902FB3}>
0: ("bogus stack frame")
1: ((FLET #:WITHOUT-INTERRUPTS-BODY-576 :IN SB-THREAD:CONDITION-WAIT))
2: (SB-THREAD:CONDITION-WAIT #<SB-THREAD:WAITQUEUE Anonymous condition variable {100A7893B3}> #<SB-THREAD:MUTEX "Anonymous lock" (free)> :TIMEOUT NIL)
3: (LPARALLEL.THREAD-UTIL:CONDITION-WAIT #<SB-THREAD:WAITQUEUE Anonymous condition variable {100A7893B3}> #<SB-THREAD:MUTEX "Anonymous lock" (free)> :TIMEOUT NIL)
4: (LPARALLEL.CONS-QUEUE:POP-CONS-QUEUE/NO-LOCK #<unavailable argument>)
5: ((FLET #:WITHOUT-INTERRUPTS-BODY-537 :IN SB-THREAD::CALL-WITH-MUTEX))
6: (SB-THREAD::CALL-WITH-MUTEX #<CLOSURE (FLET SB-THREAD::WITH-MUTEX-THUNK :IN LPARALLEL.CONS-QUEUE:POP-CONS-QUEUE) {7FFFF6C9747B}> #<SB-THREAD:MUTEX "Anonymous lock" (free)> NIL T NIL)
7: (LPARALLEL.CONS-QUEUE:POP-CONS-QUEUE #<unavailable argument>)
8: (LPARALLEL.KERNEL:RECEIVE-RESULT #S(LPARALLEL.KERNEL:CHANNEL :QUEUE #S(LPARALLEL.CONS-QUEUE:CONS-QUEUE :IMPL #S(LPARALLEL.RAW-QUEUE:RAW-QUEUE :HEAD NIL :TAIL NIL) :LOCK #<SB-THREAD:MUTEX "Anonymous lock" (free)> :CVAR #<SB-THREAD:WAITQUEUE Anonymous condition variable {100A7893B3}>) :KERNEL #<LPARALLEL.KERNEL:KERNEL :NAME "lparallel" :WORKER-COUNT 2 :USE-CALLER NIL :ALIVE T :SPIN-COUNT 2000 {10079DD763}>))
9: ((:METHOD PGLOADER.SOURCES:COPY-FROM (PGLOADER.MYSQL:COPY-MYSQL)) #<PGLOADER.MYSQL:COPY-MYSQL {100A5F5EA3}> :KERNEL #<LPARALLEL.KERNEL:KERNEL :NAME "lparallel" :WORKER-COUNT 2 :USE-CALLER NIL :ALIVE T :SPIN-COUNT 2000 {10079DD763}> :TRUNCATE NIL) [fast-method]
10: ((:METHOD PGLOADER.SOURCES:COPY-DATABASE (PGLOADER.MYSQL:COPY-MYSQL)) #<PGLOADER.MYSQL:COPY-MYSQL {100783E0B3}> :STATE-BEFORE #S(PGLOADER.UTILS::PGSTATE :TABLES #<HASH-TABLE :TEST EQUAL :COUNT 3 {1006EA3893}> :TABNAMES ("create, drop" "fetch meta data" "before load") :READ 3 :ROWS 5 :ERRS 0 :SECS 21.930998) :STATE-AFTER #S(PGLOADER.UTILS::PGSTATE :TABLES #<HASH-TABLE :TEST EQUAL :COUNT 0 {1006EA4613}> :TABNAMES NIL :READ 0 :ROWS 0 :ERRS 0 :SECS 0.0) :STATE-INDEXES #S(PGLOADER.UTILS::PGSTATE :TABLES #<HASH-TABLE :TEST EQUAL :COUNT 0 {1006EA4193}> :TABNAMES NIL :READ 0 :ROWS 0 :ERRS 0 :SECS 0.0) :TRUNCATE NIL :DATA-ONLY NIL :SCHEMA-ONLY NIL :CREATE-TABLES T :INCLUDE-DROP T :CREATE-INDEXES T :RESET-SEQUENCES T :FOREIGN-KEYS T :IDENTIFIER-CASE :DOWNCASE :ONLY-TABLES NIL :INCLUDING ("x") :EXCLUDING NIL :DECODING-AS NIL :MATERIALIZE-VIEWS NIL) [fast-method]
11: ((LAMBDA ()))
12: (PGLOADER.PARSER:RUN-COMMANDS #P"/home/ubuntu/findit.pgloader" :START-LOGGER NIL :LOG-FILENAME NIL :LOG-MIN-MESSAGES NIL :CLIENT-MIN-MESSAGES NIL)
13: ((FLET #:CLEANUP-FUN-111 :IN PGLOADER::MAIN)) [cleanup]
14: (PGLOADER::MAIN ("pgloader" "-v" "-d" "findit.pgloader"))
15: ((LAMBDA NIL :IN "/tmp/pgloader/dumper-2SKVI5f7.lisp"))
16: ((FLET #:WITHOUT-INTERRUPTS-BODY-54 :IN SAVE-LISP-AND-DIE))
17: ((LABELS SB-IMPL::RESTART-LISP :IN SAVE-LISP-AND-DIE))

debugger invoked on a SB-SYS:INTERACTIVE-INTERRUPT in thread
#<THREAD "main thread" RUNNING {1006902FB3}>:
  Interactive interrupt at #x7FFFF71D6957.

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit application

debugger invoked on a TYPE-ERROR in thread
#<THREAD "main thread" RUNNING {1006902FB3}>:
  The value INVOKE-DEBUGGER is not of type SB-DI:FRAME.

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit application

(SB-DEBUG::FRAME-CALL INVOKE-DEBUGGER :METHOD-FRAME-STYLE :NORMAL :REPLACE-DYNAMIC-EXTENT-OBJECTS NIL)
0[2] 2014-08-15T22:39:29.508000+04:00 FATAL We have a situation here.
keithgabryelski commented 10 years ago

and finally i just built from master, the error is not fatal, but still the data does not get transferred:

dbtrans$ sandbox/pgloader/build/bin/pgloader -v -d findit.pgloader
tmpdir: #P"/tmp/pgloader/"
2014-08-15T19:49:46.000590Z LOG Starting pgloader, log system is ready.
2014-08-15T19:49:46.001291Z INFO Starting monitor
2014-08-15T19:49:46.001448Z LOG Main logs in '/tmp/pgloader/pgloader.log'
2014-08-15T19:49:46.001631Z LOG Data errors in '/tmp/pgloader/'
2014-08-15T19:49:46.001778Z LOG Parsing commands from file #P"/home/ubuntu/findit.pgloader"
2014-08-15T19:49:46.204867Z DEBUG CONNECT
2014-08-15T19:49:46.205014Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-15T19:49:46.205153Z DEBUG SET work_mem TO '250MB'
2014-08-15T19:49:46.205267Z DEBUG SET search_path TO 'public'
2014-08-15T19:49:46.205380Z DEBUG BEGIN
2014-08-15T19:49:46.205492Z DEBUG CONNECT
2014-08-15T19:49:46.205605Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-15T19:49:46.205730Z DEBUG SET work_mem TO '250MB'
2014-08-15T19:49:46.205842Z DEBUG SET search_path TO 'public'
2014-08-15T19:49:46.205956Z DEBUG BEGIN
2014-08-15T19:49:46.206076Z NOTICE create schema if not exists public;

                    table name       read   imported     errors            time
------------------------------  ---------  ---------  ---------  --------------
                   before load          1          1          0          0.027s
               fetch meta data          2          2          0         23.156s
                  create, drop          0          2          0          0.060s
------------------------------  ---------  ---------  ---------  --------------
                             x          1          0          0          0.023s
        Index Build Completion          0          0          0          0.025s
------------------------------  ---------  ---------  ---------  --------------
                Create Indexes          1          1          0          0.025s
               Reset Sequences          0          1          0          0.057s
                  Foreign Keys          0          0          0          0.000s
------------------------------  ---------  ---------  ---------  --------------
             Total import time          1          0          0         23.347s

2014-08-15T19:50:09.424186Z NOTICE DROP then CREATE TABLES
2014-08-15T19:50:09.424407Z DEBUG drop then create 1 tables with 1 indexes.
2014-08-15T19:50:09.424553Z DEBUG CONNECT
2014-08-15T19:50:09.424690Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-15T19:50:09.424845Z DEBUG SET work_mem TO '250MB'
2014-08-15T19:50:09.424999Z DEBUG SET search_path TO 'public'
2014-08-15T19:50:09.425131Z DEBUG BEGIN
2014-08-15T19:50:09.425285Z INFO DROP TABLE IF EXISTS x;
2014-08-15T19:50:09.425649Z WARNING Postgres warning: table "x" does not exist, skipping
2014-08-15T19:50:09.425806Z INFO CREATE TABLE x 
(
  id                     bigserial not null,
  html_body              text not null
);
2014-08-15T19:50:09.425977Z DEBUG TARGET: x
2014-08-15T19:50:09.426126Z NOTICE COPY x
2014-08-15T19:50:09.426308Z DEBUG pgsql:copy-from-queue: x NIL
2014-08-15T19:50:09.426511Z DEBUG CONNECT (REDACTED)
2014-08-15T19:50:09.426665Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-15T19:50:09.426807Z DEBUG SET work_mem TO '250MB'
2014-08-15T19:50:09.426934Z DEBUG SET search_path TO 'public'
2014-08-15T19:50:09.427145Z DATA < #("685" #(123))
2014-08-15T19:50:09.427416Z ERROR The value 123 is not of the expected type CHARACTER.
2014-08-15T19:50:09.427575Z INFO COPY x done.
2014-08-15T19:50:09.427717Z NOTICE ALTER TABLE x ADD PRIMARY KEY (id);
2014-08-15T19:50:09.427864Z DEBUG CONNECT
2014-08-15T19:50:09.428000Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-15T19:50:09.428151Z DEBUG SET work_mem TO '250MB'
2014-08-15T19:50:09.428288Z DEBUG SET search_path TO 'public'
2014-08-15T19:50:09.428423Z DEBUG BEGIN
2014-08-15T19:50:09.428561Z NOTICE Reset sequences
2014-08-15T19:50:09.428734Z DEBUG CONNECT (REDACTED)
2014-08-15T19:50:09.428887Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-15T19:50:09.429067Z DEBUG SET work_mem TO '250MB'
2014-08-15T19:50:09.429206Z DEBUG SET search_path TO 'public'
2014-08-15T19:50:09.429345Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-15T19:50:09.429500Z DEBUG SET work_mem TO '250MB'
2014-08-15T19:50:09.429633Z DEBUG SET search_path TO 'public'
2014-08-15T19:50:09.429771Z INFO Stopping monitor
keithgabryelski commented 10 years ago

the problem is with columns marked COLLATE utf8_bin -- they seem to be translated as an array of numbers instead of characters (that is a guess).

keithgabryelski commented 10 years ago

is there a way to get pgloader to ignore collation?

keithgabryelski commented 10 years ago
    ;;                                                                                                                                                                    
    ;; cl-mysql returns binary values as a simple-array of bytes (as in                                                                                                   
    ;; ‘(UNSIGNED-BYTE 8)), that we then need to represent as proper                                                                                                      
    ;; PostgreSQL bytea input.                                                                                                                                            
    ;;                                                                                                                                                                    

I would guess this is somehow happening... even in the case of non-binary data?

keithgabryelski commented 10 years ago

it looks like pgloader must just know that utf8_bin collation columns are returned in octets (look for mysql-cs-coll-utf8-binary in the following code)

I don't particularly understand why -- the collation doesn't actually define any new information about the data (right - i'm unskilled in this area -- it seems like it just defines how comparisons should be made, not encoding)

anyway -- i'm stuck -- if you know how to proceed, please tell me.

(defun parse-text-protocol-result-column-as-text (octets column-definition)
  "Refrain from parsing data into lisp types, some application will only use                                                                                              
   the text form anyway"
  (let ((column-type (column-definition-type column-definition)))
    (cond ((= column-type +mysql-type-null+)
           nil)

          ((member column-type (list +mysql-type-bit+
                                     +mysql-type-tiny-blob+
                                     +mysql-type-medium-blob+
                                     +mysql-type-long-blob+
                                     +mysql-type-blob+
                                     +mysql-type-enum+
                                     +mysql-type-set+
                                     +mysql-type-geometry+)
                   :test #'=)
           (if (member (column-definition-v41-packet-cs-coll column-definition)
                       (list +mysql-cs-coll-utf8-binary+
                             +mysql-cs-coll-ascii-binary+
                             +mysql-cs-coll-binary+))
               octets
               (let ((encoding (column-definition-encoding column-definition)))
                 (decode-octets-to-string octets encoding))))

          (t
           (let ((encoding (column-definition-encoding column-definition)))
             (decode-octets-to-string octets encoding))))))
dimitri commented 10 years ago

Here it seems that the COLLATE has an saying on the data being considered as binary by the driver, not just the ENCODING. I will have to investigate about that, meanwhile you might be able to use the DECODING clause to force pgloader to consider this table as being just utf8?

 DECODING TABLE NAMES MATCHING 'x' AS utf-8

Another way to do it would be to force the casting machinery into considering the column as a bytea, and using the byte-vector-to-bytea transformation function on it.

keithgabryelski commented 10 years ago

when decoding table names is added the same trouble is seen:

dbtrans$ sandbox/pgloader/build/bin/pgloader -v -d findit.pgloader
tmpdir: #P"/tmp/pgloader/"
2014-08-16T18:24:02.000589Z LOG Starting pgloader, log system is ready.
2014-08-16T18:24:02.001274Z INFO Starting monitor
2014-08-16T18:24:02.001458Z LOG Main logs in '/tmp/pgloader/pgloader.log'
2014-08-16T18:24:02.001651Z LOG Data errors in '/tmp/pgloader/'
2014-08-16T18:24:02.001804Z LOG Parsing commands from file #P"/home/ubuntu/findit.pgloader"
2014-08-16T18:24:02.205243Z DEBUG CONNECT
2014-08-16T18:24:02.205424Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-16T18:24:02.205563Z DEBUG SET work_mem TO '250MB'
2014-08-16T18:24:02.205676Z DEBUG SET search_path TO 'public'
2014-08-16T18:24:02.205788Z DEBUG BEGIN
2014-08-16T18:24:02.205900Z DEBUG CONNECT
2014-08-16T18:24:02.206011Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-16T18:24:02.206133Z DEBUG SET work_mem TO '250MB'
2014-08-16T18:24:02.206244Z DEBUG SET search_path TO 'public'
2014-08-16T18:24:02.206356Z DEBUG BEGIN
2014-08-16T18:24:02.206475Z NOTICE create schema if not exists public;
2014-08-16T18:24:21.423161Z NOTICE DROP then CREATE TABLES
2014-08-16T18:24:21.423355Z DEBUG drop then create 1 tables with 6 indexes.
2014-08-16T18:24:21.423540Z DEBUG CONNECT
2014-08-16T18:24:21.423660Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-16T18:24:21.423790Z DEBUG SET work_mem TO '250MB'
2014-08-16T18:24:21.423904Z DEBUG SET search_path TO 'public'
2014-08-16T18:24:21.424017Z DEBUG BEGIN
2014-08-16T18:24:21.424140Z INFO DROP TABLE IF EXISTS x;
2014-08-16T18:24:21.424490Z WARNING Postgres warning: table "x" does not exist, skipping
2014-08-16T18:24:21.424634Z INFO CREATE TABLE x 
(
  id                     bigserial not null,
  created_at             timestamp,
  updated_at             timestamp,
  email_template_id      bigint not null,
  is_published           boolean not null default 'f',
  version                bigint not null default '1',
  opaque_identifier      varchar(36) not null,
  from_address           text not null,
  from_title             text not null,
  subject                text not null,
  html_body              text not null,
  text_body              text not null,
  email_template_group_id   bigint not null,
  email_group_id         bigint not null,
  email_graphical_treatment_id   bigint not null
);
2014-08-16T18:24:21.632216Z DEBUG TARGET: x
2014-08-16T18:24:21.632359Z NOTICE COPY x
2014-08-16T18:24:21.632475Z DEBUG pgsql:copy-from-queue: x NIL
2014-08-16T18:24:21.632591Z NOTICE Force encoding to UTF-8 for x
2014-08-16T18:24:21.632761Z DEBUG CONNECT (REDACTED)
2014-08-16T18:24:21.632874Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-16T18:24:21.632980Z DEBUG SET work_mem TO '250MB'
2014-08-16T18:24:21.633089Z DEBUG SET search_path TO 'public'
2014-08-16T18:24:21.636295Z DATA < #("14" "2014-08-12 19:31:29" NIL "14" "1" "1" "41e1382d-2257-11e4-9958-22000b4502bd" "support@custommade.com" "CustomMade" "Welcome to CustomMade!" #(123 37 32 108 111 97 100 32 117 114 108 32 102 114 111 109 32 102 117 116 117 114 101 32 37 125 10 123 37 32 108 111 97 100 32 99 108 101 97 110 117 112 95 116 97 103 115 32 37 125 13 10 60 104 49 32 115 116 121 108 101 61 34 102 111 110 116 45 115 105 122 101 58 32 49 53 112 120 59 32 108 105 110 101 45 104 101 105 103 104 116 58 32 50 48 112 120 59 32 109 97 114 103 105 110 58 32 48 59 32 112 97 100 100 105 110 103 58 32 48 32 49 50 112 120 32 56 112 120 59 34 62 72 105 32 123 123 32 102 105 114 115 116 95 110 97 109 101 124 101 115 99 97 112 101 95 102 111 114 95 101 109 97 105 108 32 125 125 44 60 47 104 49 62 13 10 13 10 60 112 32 115 116 121 108 101 61 34 109 97 114 103 105 110 58 32 48 59 32 112 97 100 100 105 110 103 58 32 48 32 49 50 112 120 32 49 52 112 120 59 34 62 87 101 108 99 111 109 101 32 116 111 32 67 117 115 116 111 109 77 97 100 101 33 32 87 101 226 128 153 114 101 32 101 120 99 105 116 101 100 32 116 111 32 115 104 97 114 101 32 121 111 117 114 32 112 111 114 116 102 111 108 105 111 32 119 105 116 104 32 116 104 101 32 116 104 111 117 115 97 110 100 115 32 111 102 32 99 117 115 116 111 109 101 114 115 32 99 111 109 105 110 103 32 116 111 32 111 117 114 32 115 105 116 101 32 100 97 105 108 121 46 32 32 87 101 226 128 153 114 101 32 104 101 114 101 32 116 111 32 104 101 108 112 32 121 111 117 32 116 97 107 101 32 102 117 108 108 32 97 100 118 97 110 116 97 103 101 32 111 102 32 101 118 101 114 121 32 111 112 112 111 114 116 117 110 105 116 121 32 103 97 105 110 32 110 101 119 32 98 117 115 105 110 101 115 115 32 116 104 114 111 117 103 104 32 99 117 115 116 111 109 109 97 100 101 46 99 111 109 46 32 60 47 112 62 13 10 13 10 60 112 32 115 116 121 108 101 61 34 109 97 114 103 105 110 58 32 48 59 32 112 97 100 100 105 110 103 58 32 48 32 49 50 112 120 32 49 52 112 120 59 34 62 84 111 32 104 101 108 112 32 121 111 117 32 103 101 116 32 115 116 97 114 116 101 100 44 32 119 101 39 118 101 32 99 114 101 97 116 101 100 32 97 32 115 112 101 99 105 97 108 32 115 101 99 116 105 111 110 32 116 104 97 116 32 111 117 116 108 105 110 101 115 32 111 117 114 32 109 111 115 116 32 102 114 101 113 117 101 110 116 108 121 32 97 115 107 101 100 32 113 117 101 115 116 105 111 110 115 32 97 115 32 119 101 108 108 32 97 115 32 116 105 112 115 32 102 114 111 109 32 111 117 114 32 109 111 115 116 32 115 117 99 99 101 115 115 102 117 108 32 67 117 115 116 111 109 77 97 100 101 32 86 101 114 105 102 105 101 100 32 109 97 107 101 114 115 58 32 60 98 114 62 13 10 13 10 60 97 32 104 114 101 102 61 34 104 116 116 112 58 47 47 104 101 108 112 46 99 117 115 116 111 109 109 97 100 101 46 99 111 109 47 109 97 107 101 114 47 103 101 116 116 105 110 103 45 115 116 97 114 116 101 100 47 34 32 115 116 121 108 101 61 34 99 111 108 111 114 58 32 35 53 50 97 54 99 100 59 34 32 116 97 114 103 101 116 61 34 95 98 108 97 110 107 34 62 104 101 108 112 46 99 117 115 116 111 109 109 97 100 101 38 35 49 55 51 59 46 99 111 109 47 109 97 107 101 114 47 103 101 116 116 105 110 103 45 115 116 97 114 116 101 100 47 60 47 97 62 60 47 112 62 13 10 13 10 123 37 32 105 102 32 105 115 95 98 97 99 107 101 110 100 32 37 125 13 10 60 112 32 115 116 121 108 101 61 34 109 97 114 103 105 110 58 32 48 59 32 112 97 100 100 105 110 103 58 32 48 32 49 50 112 120 32 49 52 112 120 59 34 62 89 111 117 114 32 85 115 101 114 110 97 109 101 32 105 115 58 32 123 123 32 101 109 97 105 108 32 125 125 60 47 112 62 13 10 13 10 60 112 32 115 116 121 108 101 61 34 109 97 114 103 105 110 58 32 48 59 32 112 97 100 100 105 110 103 58 32 48 32 49 50 112 120 32 49 52 112 120 59 34 62 89 111 117 32 104 97 118 101 32 98 101 101 110 32 103 105 118 101 110 32 97 32 116 101 109 112 111 114 97 114 121 32 112 97 115 115 119 111 114 100 58 32 123 123 32 112 119 100 32 125 125 60 47 112 62 13 10 13 10 60 112 32 115 116 121 108 101 61 34 109 97 114 103 105 110 58 32 48 59 32 112 97 100 100 105 110 103 58 32 48 32 49 50 112 120 32 49 52 112 120 59 34 62 80 108 101 97 115 101 32 117 115 101 32 105 116 32 116 111 32 108 111 103 105 110 32 97 110 100 32 99 114 101 97 116 101 32 97 32 112 97 115 115 119 111 114 100 32 121 111 117 32 97 114 101 32 109 111 114 101 32 99 111 109 102 111 114 116 97 98 108 101 32 119 105 116 104 58 32 119 119 119 46 99 117 115 116 111 109 109 97 100 101 38 35 49 55 51 59 46 99 111 109 47 108 111 103 105 110 47 63 110 101 120 116 61 47 112 97 115 115 119 111 114 100 47 108 111 103 103 101 100 45 105 110 45 99 104 97 110 103 101 47 60 47 112 62 13 10 13 10 123 37 32 101 108 115 101 32 37 125 13 10 60 112 32 115 116 121 108 101 61 34 109 97 114 103 105 110 58 32 48 59 32 112 97 100 100 105 110 103 58 32 48 32 49 50 112 120 32 49 52 112 120 59 34 62 87 104 101 110 32 121 111 117 39 114 101 32 114 101 97 100 121 32 116 111 32 115 116 97 114 116 32 99 114 101 97 116 105 110 103 32 121 111 117 114 32 112 114 111 102 105 108 101 44 32 115 105 109 112 108 121 32 117 115 101 32 121 111 117 114 32 101 109 97 105 108 32 97 100 100 114 101 115 115 32 97 110 100 32 116 104 101 32 112 97 115 115 119 111 114 100 32 99 114 101 97 116 101 100 32 100 117 114 105 110 103 32 114 101 103 105 115 116 114 97 116 105 111 110 32 116 111 32 108 111 103 32 105 110 32 104 101 114 101 58 60 98 114 62 13 10 13 10 60 97 32 104 114 101 102 61 34 104 116 116 112 58 47 47 119 119 119 46 99 117 115 116 111 109 109 97 100 101 46 99 111 109 123 37 32 117 114 108 32 39 97 117 116 104 45 108 111 103 105 110 39 32 37 125 34 32 115 116 121 108 101 61 34 99 111 108 111 114 58 32 35 53 50 97 54 99 100 59 34 32 116 97 114 103 101 116 61 34 95 98 108 97 110 107 34 62 119 119 119 46 99 117 115 116 111 109 109 97 100 101 38 35 49 55 51 59 46 99 111 109 123 37 32 117 114 108 32 39 97 117 116 104 45 108 111 103 105 110 39 32 37 125 60 47 97 62 13 10 60 47 112 62 13 10 13 10 123 37 32 101 110 100 105 102 32 37 125 13 10 60 112 32 115 116 121 108 101 61 34 109 97 114 103 105 110 58 32 48 59 32 112 97 100 100 105 110 103 58 32 48 32 49 50 112 120 32 49 52 112 120 59 34 62 73 102 32 121 111 117 32 110 101 101 100 32 97 100 100 105 116 105 111 110 97 108 32 115 117 112 112 111 114 116 44 32 119 101 226 128 153 114 101 32 104 101 114 101 32 116 111 32 104 101 108 112 32 226 128 147 32 121 111 117 32 99 97 110 32 99 111 110 116 97 99 116 32 111 117 114 32 115 117 112 112 111 114 116 32 116 101 97 109 32 118 105 97 32 101 109 97 105 108 32 97 116 32 115 117 112 112 111 114 116 60 115 112 97 110 62 64 60 47 115 112 97 110 62 99 117 115 116 111 109 109 97 100 101 46 99 111 109 46 60 47 112 62 13 10 13 10 60 112 32 115 116 121 108 101 61 34 109 97 114 103 105 110 58 32 48 59 32 112 97 100 100 105 110 103 58 32 48 32 49 50 112 120 32 49 52 112 120 59 34 62 84 104 97 110 107 32 121 111 117 32 97 103 97 105 110 32 102 111 114 32 106 111 105 110 105 110 103 32 67 117 115 116 111 109 77 97 100 101 46 32 87 101 226 128 153 114 101 32 101 120 99 105 116 101 100 32 116 111 32 104 101 108 112 32 121 111 117 32 103 97 105 110 32 97 99 99 101 115 115 32 116 111 32 109 111 114 101 32 99 117 115 116 111 109 101 114 115 32 97 110 100 32 103 114 111 119 32 121 111 117 114 32 98 117 115 105 110 101 115 115 46 60 47 112 62 13 10 13 10 60 112 32 115 116 121 108 101 61 34 109 97 114 103 105 110 58 32 48 59 32 112 97 100 100 105 110 103 58 32 48 32 49 50 112 120 32 49 52 112 120 59 34 62 66 101 115 116 44 60 98 114 62 13 10 13 10 74 111 101 32 75 46 60 98 114 62 13 10 13 10 77 97 107 101 114 32 83 117 99 99 101 115 115 60 47 112 62) #(123 37 32 108 111 97 100 32 117 114 108 32 102 114 111 109 32 102 117 116 117 114 101 32 37 125 10 123 37 32 108 111 97 100 32 99 108 101 97 110 117 112 95 116 97 103 115 32 37 125 13 10 72 105 32 123 123 32 102 105 114 115 116 95 110 97 109 101 124 101 115 99 97 112 101 95 102 111 114 95 101 109 97 105 108 32 125 125 44 13 10 13 10 87 101 108 99 111 109 101 32 116 111 32 67 117 115 116 111 109 77 97 100 101 33 32 87 101 226 128 153 114 101 32 101 120 99 105 116 101 100 32 116 111 32 115 104 97 114 101 32 121 111 117 114 32 112 111 114 116 102 111 108 105 111 32 119 105 116 104 32 116 104 101 32 116 104 111 117 115 97 110 100 115 32 111 102 32 99 117 115 116 111 109 101 114 115 32 99 111 109 105 110 103 32 116 111 32 111 117 114 32 115 105 116 101 32 100 97 105 108 121 46 32 32 87 101 226 128 153 114 101 32 104 101 114 101 32 116 111 32 104 101 108 112 32 121 111 117 32 116 97 107 101 32 102 117 108 108 32 97 100 118 97 110 116 97 103 101 32 111 102 32 101 118 101 114 121 32 111 112 112 111 114 116 117 110 105 116 121 32 103 97 105 110 32 110 101 119 32 98 117 115 105 110 101 115 115 32 116 104 114 111 117 103 104 32 99 117 115 116 111 109 109 97 100 101 46 99 111 109 46 32 13 10 13 10 84 111 32 104 101 108 112 32 121 111 117 32 103 101 116 32 115 116 97 114 116 101 100 44 32 119 101 39 118 101 32 99 114 101 97 116 101 100 32 97 32 115 112 101 99 105 97 108 32 115 101 99 116 105 111 110 32 116 104 97 116 32 111 117 116 108 105 110 101 115 32 111 117 114 32 109 111 115 116 32 102 114 101 113 117 101 110 116 108 121 32 97 115 107 101 100 32 113 117 101 115 116 105 111 110 115 32 97 115 32 119 101 108 108 32 97 115 32 116 105 112 115 32 102 114 111 109 32 111 117 114 32 109 111 115 116 32 115 117 99 99 101 115 115 102 117 108 32 67 117 115 116 111 109 77 97 100 101 32 86 101 114 105 102 105 101 100 32 109 97 107 101 114 115 58 32 13 10 104 116 116 112 58 47 47 104 101 108 112 46 99 117 115 116 111 109 109 97 100 101 46 99 111 109 47 109 97 107 101 114 47 103 101 116 116 105 110 103 45 115 116 97 114 116 101 100 47 13 10 13 10 123 37 32 105 102 32 105 115 95 98 97 99 107 101 110 100 32 37 125 13 10 45 89 111 117 114 32 85 115 101 114 110 97 109 101 32 105 115 58 32 123 123 32 101 109 97 105 108 32 125 125 13 10 13 10 89 111 117 32 104 97 118 101 32 98 101 101 110 32 103 105 118 101 110 32 97 32 116 101 109 112 111 114 97 114 121 32 112 97 115 115 119 111 114 100 58 32 123 123 32 112 119 100 32 125 125 13 10 13 10 80 108 101 97 115 101 32 117 115 101 32 105 116 32 116 111 32 108 111 103 105 110 32 97 110 100 32 99 114 101 97 116 101 32 97 32 112 97 115 115 119 111 114 100 32 121 111 117 32 97 114 101 32 109 111 114 101 32 99 111 109 102 111 114 116 97 98 108 101 32 119 105 116 104 58 32 104 116 116 112 58 47 47 119 119 119 46 99 117 115 116 111 109 109 97 100 101 46 99 111 109 47 108 111 103 105 110 47 63 110 101 120 116 61 47 112 97 115 115 119 111 114 100 47 108 111 103 103 101 100 45 105 110 45 99 104 97 110 103 101 47 13 10 123 37 32 101 108 115 101 32 37 125 13 10 87 104 101 110 32 121 111 117 39 114 101 32 114 101 97 100 121 32 116 111 32 115 116 97 114 116 32 99 114 101 97 116 105 110 103 32 121 111 117 114 32 112 114 111 102 105 108 101 44 32 115 105 109 112 108 121 32 117 115 101 32 121 111 117 114 32 101 109 97 105 108 32 97 100 100 114 101 115 115 32 97 110 100 32 116 104 101 32 112 97 115 115 119 111 114 100 32 99 114 101 97 116 101 100 32 100 117 114 105 110 103 32 114 101 103 105 115 116 114 97 116 105 111 110 32 116 111 32 108 111 103 32 105 110 32 104 101 114 101 58 13 10 104 116 116 112 58 47 47 119 119 119 46 99 117 115 116 111 109 109 97 100 101 46 99 111 109 123 37 32 117 114 108 32 39 97 117 116 104 45 108 111 103 105 110 39 32 37 125 13 10 123 37 32 101 110 100 105 102 32 37 125 13 10 13 10 73 102 32 121 111 117 32 110 101 101 100 32 97 100 100 105 116 105 111 110 97 108 32 115 117 112 112 111 114 116 44 32 119 101 226 128 153 114 101 32 104 101 114 101 32 116 111 32 104 101 108 112 32 226 128 147 32 121 111 117 32 99 97 110 32 99 111 110 116 97 99 116 32 111 117 114 32 115 117 112 112 111 114 116 32 116 101 97 109 32 118 105 97 32 101 109 97 105 108 32 97 116 32 115 117 112 112 111 114 116 64 99 117 115 116 111 109 109 97 100 101 46 99 111 109 46 13 10 13 10 84 104 97 110 107 32 121 111 117 32 97 103 97 105 110 32 102 111 114 32 106 111 105 110 105 110 103 32 67 117 115 116 111 109 77 97 100 101 46 32 87 101 226 128 153 114 101 32 101 120 99 105 116 101 100 32 116 111 32 104 101 108 112 32 121 111 117 32 103 97 105 110 32 97 99 99 101 115 115 32 116 111 32 109 111 114 101 32 99 117 115 116 111 109 101 114 115 32 97 110 100 32 103 114 111 119 32 121 111 117 114 32 98 117 115 105 110 101 115 115 46 13 10 13 10 66 101 115 116 44 13 10 13 10 74 111 101 32 75 46 13 10 77 97 107 101 114 32 83 117 99 99 101 115 115 13 10) "23" "11" "16")
2014-08-16T18:24:21.638302Z ERROR The value 123 is not of the expected type CHARACTER.

(each row errors like the above)

2014-08-16T18:24:24.669165Z INFO COPY x done.
2014-08-16T18:24:24.669289Z NOTICE CREATE INDEX idx_3253440_emailtemplate_etid_pub_idx ON x (email_template_id, is_published);
2014-08-16T18:24:24.669422Z NOTICE CREATE INDEX idx_3253440_emailtemplate_opaque_identifier_idx ON x (opaque_identifier);
2014-08-16T18:24:24.669554Z NOTICE CREATE INDEX idx_3253440_emailtemplate_published_idx ON x (is_published);
2014-08-16T18:24:24.669686Z NOTICE CREATE UNIQUE INDEX idx_3253440_email_template_id ON x (email_template_id, version);
2014-08-16T18:24:24.669828Z NOTICE CREATE UNIQUE INDEX idx_3253440_opaque_identifier ON x (opaque_identifier);
2014-08-16T18:24:24.669963Z NOTICE ALTER TABLE x ADD PRIMARY KEY (id);
2014-08-16T18:24:24.670080Z DEBUG CONNECT
2014-08-16T18:24:24.670191Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-16T18:24:24.670310Z DEBUG CONNECT
2014-08-16T18:24:24.670421Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-16T18:24:24.670539Z DEBUG CONNECT
2014-08-16T18:24:24.670656Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-16T18:24:24.670774Z DEBUG SET work_mem TO '250MB'
2014-08-16T18:24:24.670886Z DEBUG SET work_mem TO '250MB'
2014-08-16T18:24:24.671010Z DEBUG SET work_mem TO '250MB'
2014-08-16T18:24:24.671121Z DEBUG SET search_path TO 'public'
2014-08-16T18:24:24.671232Z DEBUG SET search_path TO 'public'
2014-08-16T18:24:24.671345Z DEBUG SET search_path TO 'public'
2014-08-16T18:24:24.671457Z DEBUG CONNECT
2014-08-16T18:24:24.671568Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-16T18:24:24.671685Z DEBUG BEGIN
2014-08-16T18:24:24.671796Z DEBUG BEGIN
2014-08-16T18:24:24.671907Z DEBUG CONNECT
2014-08-16T18:24:24.672030Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-16T18:24:24.672148Z DEBUG BEGIN
2014-08-16T18:24:24.672259Z DEBUG SET work_mem TO '250MB'
2014-08-16T18:24:24.672370Z DEBUG SET work_mem TO '250MB'
2014-08-16T18:24:24.672481Z DEBUG SET search_path TO 'public'
2014-08-16T18:24:24.672592Z DEBUG SET search_path TO 'public'
2014-08-16T18:24:24.672706Z DEBUG BEGIN
2014-08-16T18:24:24.672817Z DEBUG BEGIN
2014-08-16T18:24:24.672928Z DEBUG CONNECT
2014-08-16T18:24:24.673039Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-16T18:24:24.673157Z DEBUG SET work_mem TO '250MB'
2014-08-16T18:24:24.673268Z NOTICE Reset sequences
2014-08-16T18:24:24.673382Z DEBUG SET search_path TO 'public'
2014-08-16T18:24:24.673493Z DEBUG BEGIN
2014-08-16T18:24:24.673628Z DEBUG CONNECT (REDACTED)
2014-08-16T18:24:24.673754Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-16T18:24:24.673871Z DEBUG SET work_mem TO '250MB'
2014-08-16T18:24:24.673982Z DEBUG SET search_path TO 'public'
2014-08-16T18:24:24.674095Z DEBUG SET maintenance_work_mem TO '250MB'
2014-08-16T18:24:24.674222Z DEBUG SET work_mem TO '250MB'
2014-08-16T18:24:24.674333Z DEBUG SET search_path TO 'public'
2014-08-16T18:24:24.674444Z INFO Stopping monitor
keithgabryelski commented 10 years ago

Here it seems that the COLLATE has an saying on the data being considered as binary by the driver, not just the ENCODING

From my reading of the documentation of collation, this is an incorrect assumption.

from: http://dev.mysql.com/doc/refman/5.1/en/charset-binary-collations.html

This section describes how _bin collations for nonbinary strings differ from the binary “collation” for binary strings. and Binary strings (as stored in the BINARY, VARBINARY, and BLOB data types) have no character set or collation in the sense that nonbinary strings do.

The above declares 1) binary strings are never in text fields 2) collation is a thing for text fields (or at least text interpreted fields).

lastly, _bin is used as a convention to name collations that work with multibyte characters (like utf8):

Binary strings are sequences of bytes. Sorting and comparison is always based on numeric byte values. Nonbinary strings are sequences of characters, which might be multibyte. Collations for nonbinary stringa define an ordering of the character values for sorting and comparison. For the _bin collation, this ordering is based solely on binary code values of the characters (which is similar to ordering for binary strings except that a _bin collation must take into account that a character might contain multiple bytes). For other collations, character ordering might take additional factors such as lettercase into account.

http://dev.mysql.com/doc/refman/5.0/en/charset-mysql.html

There is a convention for collation names: They start with the name of the character set with which they are associated, they usually include a language name, and they end with _ci (case insensitive), _cs (case sensitive), or _bin (binary).

In the above case, "binary" just means -- byte for byte translation.

more here:

http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-sets.html

keithgabryelski commented 10 years ago

so, by my reading -- it is a bug in the driver.

dimitri commented 10 years ago

I guess I'll never get used to the level of random gross hackery and WFT'ery MySQL is capable of providing. In today's session they are conflating the field's collation with the only way to separate out BLOB and TEXT data received by using the special character set value 63 only when the data actually is binary (some BLOB data type) rather than TEXT.

Making everybody's life simpler by having another set of data types ids for TEXT surely wasn't simple enough for them...

Please check that the bug is fixed for you by updating your local copy of the MySQL driver then build pgloader again:

git clone https://github.com:qitab/qmynd.git ./build/quicklisp/local-projects/qmynd
keithgabryelski commented 10 years ago

seems to work, thank you.

dimitri commented 10 years ago

Great! Have fun.