dimitri / pgloader

Migrate to PostgreSQL in a single command!
http://pgloader.io
Other
5.42k stars 546 forks source link

Strange error while using "quote identifiers." #489

Closed hickeroar closed 7 years ago

hickeroar commented 7 years ago

Moving data from a mysql database to a pgsql one. I have case sensitive fields in pgsql so I have to use the quote identifiers option. The tables already exist in both databases.

Here's the .load file I'm using:

LOAD DATABASE
     FROM      <mysql connection string>
     INTO      <pgsql connection string>

WITH include no drop, create no tables, create no indexes, quote identifiers

SET maintenance_work_mem to '128MB',
    work_mem to '12MB',
    search_path to 'reporting'

CAST type datetime to timestamptz using zero-dates-to-null,
     type datetime when default '0000-00-00 00:00:00' to timestamp using zero-dates-to-null,
     type datetime to timestamp drop default drop not null using zero-dates-to-null,
     type tinyint to boolean using tinyint-to-boolean,
     type int with extra auto_increment to integer drop typemod keep default keep not null,
     type int to integer drop typemod keep default keep not null,
     type decimal to numeric,
     type bigint to bigint

INCLUDING ONLY TABLE NAMES MATCHING 'subscriptions', 'transactions', 'recycling'
EXCLUDING TABLE NAMES MATCHING 'history'
;

When I run pgloader, I get this:

$ pgloader reporting.load
2016-12-08T05:56:19.022000Z LOG Main logs in '/tmp/pgloader/pgloader.log'
2016-12-08T05:56:19.033000Z LOG Data errors in '/tmp/pgloader/'
2016-12-08T05:56:19.037000Z LOG Parsing commands from file #P"reporting.load"
2016-12-08T05:56:22.451000Z ERROR pgloader failed to find target table for source NIL."recycling" with name "\"recycling\"" in target catalog
2016-12-08T05:56:22.452000Z ERROR pgloader failed to find target table for source NIL."subscriptions" with name "\"subscriptions\"" in target catalog
2016-12-08T05:56:22.453000Z ERROR pgloader failed to find target table for source NIL."transactions" with name "\"transactions\"" in target catalog
2016-12-08T05:56:22.453000Z LOG Skipping "recycling"
2016-12-08T05:56:22.454000Z LOG Skipping "subscriptions"
2016-12-08T05:56:22.454000Z LOG Skipping "transactions"
             table name       read   imported     errors      total time       read      write
-----------------------  ---------  ---------  ---------  --------------  ---------  ---------
        fetch meta data          5          5          0          0.123s
      Drop Foreign Keys          0          0          0          0.000s
           Drop Indexes          0          0          0          0.000s
-----------------------  ---------  ---------  ---------  --------------  ---------  ---------
-----------------------  ---------  ---------  ---------  --------------  ---------  ---------
COPY Threads Completion          0          0          0          0.002s
        Reset Sequences          0          9          0          2.072s
           Primary Keys          1          1          0          0.000s
    Create Foreign Keys          0          0          0          0.000s
       Install Comments          0          0          0          0.000s

I can't find this error or this behavior anywhere else. Anyone have a clue what's going on?

Using version 3.3.1

dimitri commented 7 years ago

I just tried to reproduce with current version and failed. Is is possible for you to try with 3.3.2 (currently same as a built from sources).

hickeroar commented 7 years ago

Just compiled and tried it on 3.3.2. Same exact behavior. Gonna pull my hair out. :-P

dimitri commented 7 years ago

Can you send me a dump of your database schema (without data, seems not to be needed here), so that I may reproduce the bug locally then fix it?

hickeroar commented 7 years ago

Here's the transactions table schema for you. I figure that should be enough.

Here's the pgsql create code:

CREATE TABLE reporting.transactions
(
    id integer NOT NULL DEFAULT nextval('transactions_id_seq'::regclass),
    trans_time timestamp with time zone NOT NULL,
    function character varying(255) COLLATE "default".pg_catalog NOT NULL,
    user_id integer NOT NULL,
    order_id integer NOT NULL,
    amount numeric(10, 2) NOT NULL,
    "reportGroup" character varying(255) COLLATE "default".pg_catalog NOT NULL,
    transaction_id bigint NOT NULL,
    response integer NOT NULL,
    message character varying(50) COLLATE "default".pg_catalog NOT NULL,
    token bigint NOT NULL,
    card_expdate character varying(50) COLLATE "default".pg_catalog NOT NULL,
    bank_code integer NOT NULL,
    gateway character varying(50) COLLATE "default".pg_catalog NOT NULL,
    created_at timestamp with time zone,
    modified_at timestamp with time zone
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

ALTER TABLE reporting.transactions
    OWNER to reporting;

And the mysql create code:

CREATE TABLE `transactions` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `trans_time` DATETIME NOT NULL,
    `function` VARCHAR(255) NOT NULL,
    `order_id` INT(11) NOT NULL,
    `amount` DECIMAL(10,2) NOT NULL,
    `reportGroup` VARCHAR(255) NOT NULL,
    `transaction_id` BIGINT(20) NOT NULL,
    `response` INT(11) NOT NULL,
    `message` VARCHAR(50) NOT NULL,
    `token` BIGINT(20) NOT NULL,
    `card_expdate` VARCHAR(50) NOT NULL,
    `bank_code` INT(11) NOT NULL,
    `user_id` INT(11) NOT NULL,
    `created_at` DATETIME NULL DEFAULT NULL,
    `modified_at` DATETIME NULL DEFAULT NULL,
    `gateway` VARCHAR(50) NOT NULL,
    PRIMARY KEY (`id`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=234491
;

Mysql version is 5.7.16-0ubuntu0.16.04.1. Pgsql version is whatever the latest in amazon RDB is.

dimitri commented 7 years ago

Can you please try with a fresh build of pgloader and adding a line like the following in your load file:

ALTER SCHEMA 'mydbname' RENAME TO 'reporting'
hickeroar commented 7 years ago

Sure I'll give that a try here in a few minutes and report back.

hickeroar commented 7 years ago

I added this line (source database is tmom_reports): ALTER SCHEMA 't_reports' RENAME TO 'reporting'

These are the errors I'm getting now:

2016-12-19T04:21:34.083000Z ERROR pgloader failed to find target table for source "t_reports"."recycling" with name "\"recycling\"" in target catalog
2016-12-19T04:21:34.083000Z ERROR pgloader failed to find target table for source "t_reports"."subscriptions" with name "\"subscriptions\"" in target catalog
2016-12-19T04:21:34.084000Z ERROR pgloader failed to find target table for source "t_reports"."transactions" with name "\"transactions\"" in target catalog

I found a really wonky workaround though, and I'm not sure why it works:

...
ALTER SCHEMA 't_reports' RENAME TO 'reporting'
ALTER TABLE NAMES MATCHING 'subscriptions' RENAME TO 'subscriptions'
ALTER TABLE NAMES MATCHING 'transactions' RENAME TO 'transactions'
ALTER TABLE NAMES MATCHING 'recycling' RENAME TO 'recycling'

If I "rename" each table to its own name, it works...

dimitri commented 7 years ago

I worked today at reproducing the errors here but couldn't. The load file is following. Closing the issue!

LOAD DATABASE
     FROM mysql://root@localhost/db489
     INTO pgsql:///db489

ALTER SCHEMA 'db489' RENAME TO 'reporting'

WITH include no drop, create no tables, create no indexes, quote identifiers

SET maintenance_work_mem to '128MB',
    work_mem to '12MB',
    search_path to 'reporting'

CAST type datetime to timestamptz using zero-dates-to-null,
     type datetime when default '0000-00-00 00:00:00' to timestamp using zero-dates-to-null,
     type datetime to timestamp drop default drop not null using zero-dates-to-null,
     type tinyint to boolean using tinyint-to-boolean,
     type int with extra auto_increment to integer drop typemod keep default keep not null,
     type int to integer drop typemod keep default keep not null,
     type decimal to numeric,
     type bigint to bigint

INCLUDING ONLY TABLE NAMES MATCHING 'subscriptions', 'transactions', 'recycling'
EXCLUDING TABLE NAMES MATCHING 'history';