dimitri / pgloader

Migrate to PostgreSQL in a single command!
http://pgloader.io
Other
5.45k stars 548 forks source link

Duration at the summary page is wrong #1561

Open JP95Git opened 9 months ago

JP95Git commented 9 months ago

I use pgloader to migrate a huge database from MariaDB to PostgreSQL. Based on the summary page, the migration takes too long, so I tried to adjust the parameters. Then I noticed that either the duration in the summary page or the times in the NOTICE lines are wrong.

The summary shows almost 8 hours, but the job took only 2,5 hours.

I can always reproduce this, even on different machines. All machines are virtual, running in rather big data centres at the office or at my customer.

Also reproducable with the other compiler (forget the name).

Started with

pgloader --client-min-messages=notice database.load

database.load

LOAD DATABASE
    FROM mysql://root:password@host/database
    INTO pgsql://root:password@localhost:5432/database

    WITH
        no truncate,
        create tables,
        include drop,
        create indexes,
        reset sequences,
        foreign keys,
        downcase identifiers,
        uniquify index names,
        prefetch rows = 10000,
        rows per range = 10000,
        batch rows = 10000,
        workers = 4,
        concurrency = 1

    SET MySQL PARAMETERS
        net_read_timeout = '500',
        net_write_timeout = '500'
;

I can't attach a database dump, because my database is over 200 GB. Using an empty database is to fast to compare the times.

I removed information about the tables because they are private.

JP95Git commented 9 months ago
[user@host programs]$ /path/to/pgloader/bin/pgloader --client-min-messages=notice database.load
2024-02-22T10:43:40.001437+01:00 NOTICE Starting pgloader, log system is ready.
2024-02-22T10:43:40.007689+01:00 LOG pgloader version "3.6.af8c3c1"
[...]
2024-02-23T02:34:37.652045+01:00 LOG report summary reset
                        table name     errors       read   imported      bytes      total time       read      write
----------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
                   fetch meta data          0        211        211                     0.494s
                    Create Schemas          0          0          0                     0.001s
                  Create SQL Types          0          0          0                     0.000s
                     Create tables          0        100        100                     0.360s
                    Set Table OIDs          0         50         50                     0.002s
----------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
[...]
----------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
           COPY Threads Completion          0          8          8              14h37m57.750s
                    Create Indexes          0        111        111                4h47m9.918s
            Index Build Completion          0        111        111                 48m26.715s
                   Reset Sequences          0         27         27                     0.870s
                      Primary Keys          0         49         49                     1.044s
               Create Foreign Keys          0         50         50                 24m11.139s
                   Create Triggers          0          0          0                     0.000s
                  Install Comments          0          0          0                     0.000s
                        after load          0         23         23                     0.364s
----------------------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
                 Total import time          ✓  4449186294  4449186294   227.9 GB   20h37m47.805s

Output from the virtual server at my customer.

Duration based on the log lines = ~ 15.5 hours Duration reported by pgLoader = 20.5 hours

hvanoch commented 9 months ago

We are having the same issues. On our end it seems that the "Create Indexes" part makes no sense. As an example, this is a summary which only migrates the schema. Real time it took was 14 sec:

            table name     errors       rows      bytes      total time
----------------------  ---------  ---------  ---------  --------------
       fetch meta data          0       3405                     0.293s
        Create Schemas          0          0                     0.437s
      Create SQL Types          0          0                     0.004s
         Create tables          0       1062                     3.530s
        Set Table OIDs          0        531                     0.047s
----------------------  ---------  ---------  ---------  --------------
----------------------  ---------  ---------  ---------  --------------
Index Build Completion          0       1906                     8.285s
        Create Indexes          0       1906                  1m38.173s
       Reset Sequences          0        310                     0.144s
          Primary Keys          0        521                     0.282s
   Create Foreign Keys          0        968                     0.785s
       Create Triggers          0          0                     0.000s
      Install Comments          0        664                     0.274s
            after load          0          2                     0.013s
----------------------  ---------  ---------  ---------  --------------
     Total import time          ✓          0                  1m47.957s