dimitri / pgloader

Migrate to PostgreSQL in a single command!
http://pgloader.io
Other
5.45k stars 548 forks source link

MySQL to Postgres Data Only with Truncate target table not truncating table before loading. #1591

Open hcpatel opened 4 months ago

hcpatel commented 4 months ago
pgloader version "3.6.9"
compiled with SBCL 2.3.8

No.

Yes.

Load file with:

 LOAD DATABASE
    FROM mysql://<connection string>
    INTO postgresql://<connection string>

 WITH data only, truncate, disable triggers, downcase identifiers, reset sequences,
      workers = 8, concurrency = 1, prefetch rows = 500, batch rows = 500,
      single reader per thread, on error stop
2024-06-26T12:12:52.888381-04:00 LOG report summary reset
                                         table name     errors       read   imported      bytes      total time       read      write
---------------------------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
                                    fetch meta data          0        109        109                     0.201s
                                  Drop Foreign Keys          0          0          0                     0.000s
                                           Truncate          0         53         53                     0.255s
---------------------------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------

I see it indicated 'Truncate' , but I suspect table is not being truncated as subsequently the COPY fails with 'Duplicate Primary key constraint violation'. After the run I still see the old rows in the table.

I am looking to migrate data only, the script failed, on re-running was expecting it to 'detect there is data in the target table on postgres and expecting it to truncate the table prior to running the COPY of the data.

Is there a way to DUMP/Log the SQL pgloader runs for 'truncate'? so I can verify the '53 truncates' it indicates above?

hcpatel commented 4 months ago

I did a test with just this one table in MySQL. I am using a MATERIALIZE VIEWS to copy the data and finding it does not 'truncate' the data in target Postgres table when I use MATERIALIZE VIEWS. My Load file has:

MATERIALIZE VIEWS view_name AS $$ <select from original_table_table> $$
EXCLUDING TABLE NAMES MATCHING 'original_table_name'
INCLUDING ONLY TABLE NAMES MATCHING 'view_name'
 ALTER TABLE NAMES MATCHING 'view_name' RENAME TO 'pg_table_name'

The result indicate no Truncate run:

2024-06-27T05:28:24.744574-04:00 LOG report summary reset
             table name     errors       read   imported      bytes      total time       read      write
-----------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
        fetch meta data          0          1          1                     0.094s
      Drop Foreign Keys          0          0          0                     0.000s
               Truncate          0          0          0                     0.000s
-----------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
   public.pg_table_name         1          2          0                     0.029s     0.012s     0.000s
-----------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
COPY Threads Completion          0          8          8                     0.029s
        Reset Sequences          0        203        203                     0.394s
    Create Foreign Keys          0          0          0                     0.000s
       Install Comments          0          0          0                     0.000s
-----------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
      Total import time          1          2          0                     0.423s

So how can I specify the 'target table should be truncated' (e.g. pg_table_name) when using MATERIALIZE VIEW with renaming tables from source to target?