darold / pgbadger

A fast PostgreSQL Log Analyzer
http://pgbadger.darold.net/
PostgreSQL License
3.49k stars 349 forks source link

Generation slow in Pgbadger 11.8 #753

Closed phlaluna closed 1 year ago

phlaluna commented 1 year ago

Hi, I upgraded the pgbadger 11.3 to 11.8, but the generate reports is very slow.

Pgbadger 11.3, take about 3 minutes:

[ec2-user@monitor-teste ~]$ zcat ras-rds-2-postgresql.log.2022-09-09-12-1662728400018.log.gz | pgbadger-11.3 --timezone -03 --format stderr --verbose --prefix '%t:%r:%u@%d:[%p]:' -o pgbadger-11-3.bin -
DEBUG: Output 'binary' reports will be written to pgbadger-11-3.bin
DEBUG: Can not autodetect log format, assuming stderr.
DEBUG: Starting progressbar writer process
DEBUG: Processing log file: -
DEBUG: Starting reading file "-"...
DEBUG: Start parsing postgresql log at offset 0 of file "-" to -1
[+] Parsed 799823492 bytes, queries: 90232157, events: 0
DEBUG: the log statistics gathering took:147 wallclock secs ( 1.75 usr  0.20 sys + 141.04 cusr  1.47 csys = 144.46 CPU)
DEBUG: Output 'binary' reports will be written to pgbadger-11-3.bin
LOG: Ok, generating binary report...
DEBUG: building reports took:  1 wallclock secs ( 0.50 usr +  0.07 sys =  0.57 CPU)
DEBUG: the total execution time took: 148 wallclock secs ( 2.25 usr  0.27 sys + 141.04 cusr  1.47 csys = 145.03 CPU)
[ec2-user@monitor-teste ~]$ ls -lh pgbadger-11-3.bin 
-rw-rw-r-- 1 ec2-user ec2-user 83M set 13 10:52 pgbadger-11-3.bin

Pgbadger 11.8, it's hours in the same stage:

[ec2-user@monitor-teste ~]$ zcat ras-rds-2-postgresql.log.2022-09-09-12-1662728400018.log.gz | pgbadger --last-parsed pgbadger_last_state_file --timezone -03 --verbose --prefix '%t:%r:%u@%d:[%p]:' -o pgbadger-11-8.bin -
DEBUG: pgBadger version 11.8.
DEBUG: Output 'binary' reports will be written to pgbadger-11-8.bin
DEBUG: waiting for log entries from stdin.
DEBUG: Can not autodetect log format, assuming stderr.
DEBUG: Starting progressbar writer process
DEBUG: Processing log file: -
DEBUG: Starting reading file "-"...
DEBUG: Start parsing postgresql log at offset 0 of file "-" to -1
DEBUG: setting current position in log to 799823492: 0
[+] Parsed 799823492 bytes, queries: 206523, events: 0

Pgbadger 11.7 with problems too:

[ec2-user@monitor-teste ~]$ pgbadger --timezone -03 --verbose --prefix '%t:%r:%u@%d:[%p]:' -o pgbadger-11-7.bin ras-rds-2-postgresql.log.2022-09-09-12-1662728400018.log.gz
DEBUG: pgBadger version 11.7.
DEBUG: Output 'binary' reports will be written to pgbadger-11-7.bin
DEBUG: Looking for file size using command: gunzip -l "ras-rds-2-postgresql.log.2022-09-09-12-1662728400018.log.gz" | grep -E '^\s*[0-9]+' | awk '{print $2}'
DEBUG: Starting progressbar writer process
DEBUG: Compressed log file, will use command: gunzip -c "ras-rds-2-postgresql.log.2022-09-09-12-1662728400018.log.gz"
DEBUG: Autodetected log format 'rds' from ras-rds-2-postgresql.log.2022-09-09-12-1662728400018.log.gz
DEBUG: Autodetected UTC timestamp
DEBUG: pgBadger will use log format rds to parse ras-rds-2-postgresql.log.2022-09-09-12-1662728400018.log.gz.
DEBUG: Processing log file: ras-rds-2-postgresql.log.2022-09-09-12-1662728400018.log.gz
DEBUG: Compressed log file, will use command: gunzip -c "ras-rds-2-postgresql.log.2022-09-09-12-1662728400018.log.gz"
DEBUG: Starting reading file "ras-rds-2-postgresql.log.2022-09-09-12-1662728400018.log.gz"...0
DEBUG: Start parsing postgresql log at offset 0 of file "ras-rds-2-postgresql.log.2022-09-09-12-1662728400018.log.gz" to 799823492
[========================>] Parsed 799823492 bytes of 799823492 (100.00%), queries: 206523, events: 0

This log is from a replica database of an AWS Aurora Postgresql 11.15.

Am I doing something wrong with newer versions of pgbadger?

phlaluna commented 1 year ago

In my test, until the pgbadger 11.6 worked, after 11.7 is't work:

[ec2-user@monitor-teste ~]$ zcat ras-rds-2-postgresql.log.2022-09-09-12-1662728400018.log.gz | pgbadger-11-6 --timezone -03 --verbose --prefix '%t:%r:%u@%d:[%p]:' -o pgbadger-11-6.bin -
DEBUG: pgBadger version 11.6.
DEBUG: Output 'binary' reports will be written to pgbadger-11-6.bin
DEBUG: Can not autodetect log format, assuming stderr.
DEBUG: Starting progressbar writer process
DEBUG: Processing log file: -
DEBUG: Starting reading file "-"...
DEBUG: Start parsing postgresql log at offset 0 of file "-" to -1
[+] Parsed 799823492 bytes, queries: 90232157, events: 0
DEBUG: the log statistics gathering took:160 wallclock secs ( 1.76 usr  0.23 sys + 155.15 cusr  1.49 csys = 158.63 CPU)
DEBUG: Output 'binary' reports will be written to pgbadger-11-6.bin
LOG: Ok, generating binary report...
DEBUG: building reports took:  1 wallclock secs ( 0.51 usr +  0.10 sys =  0.61 CPU)
DEBUG: the total execution time took: 161 wallclock secs ( 2.27 usr  0.33 sys + 155.15 cusr  1.49 csys = 159.24 CPU)
darold commented 1 year ago

Please use pgbadger version 12.0 published this week a major performances enhancement have been done on bind parameters replacement. This could be the source of the problem here.

Please let us know the result.

phlaluna commented 1 year ago

Hi, thanks for answering

The pgbadger version 12.0 is also having problem.

[ec2-user@monitor-teste ~]$ zcat ras-rds-2-postgresql.log.2022-09-09-12-1662728400018.log.gz | pgbadger-12 --timezone -03 --verbose --prefix '%t:%r:%u@%d:[%p]:' -o ras-rds-2-pgbadger-12.bin -
DEBUG: pgBadger version 12.0.
DEBUG: Output 'binary' reports will be written to ras-rds-2-pgbadger-12.bin
DEBUG: Can not autodetect log format, assuming stderr.
DEBUG: Starting progressbar writer process
DEBUG: Processing log file: -
DEBUG: Starting reading file "-"...
DEBUG: Start parsing postgresql log at offset 0 of file "-" to -1
[+] Parsed 799823492 bytes, queries: 206523, events: 0
phlaluna commented 1 year ago

I tested it with the master database log (writer), it worked correctly. My database is a Cluster AWS Aurora PostgreSQL 11.15 with two instance (writer and reader).

[ec2-user@monitor-teste ~]$ zcat ras-rds-1-postgresql.log.2022-09-14-17-1663178400026.log.gz | pgbadger-12 --timezone -03 --verbose --prefix '%t:%r:%u@%d:[%p]:' -o ras-rds-1-pgbadger-12.bin -
DEBUG: pgBadger version 12.0.
DEBUG: Output 'binary' reports will be written to ras-rds-1-pgbadger-12.bin
DEBUG: Can not autodetect log format, assuming stderr.
DEBUG: Starting progressbar writer process
DEBUG: Processing log file: -
DEBUG: Starting reading file "-"...
DEBUG: Start parsing postgresql log at offset 0 of file "-" to -1
[+] Parsed 2158634035 bytes, queries: 1946371, events: 19676
DEBUG: the log statistics gathering took:739 wallclock secs ( 0.63 usr  0.07 sys + 726.83 cusr  4.43 csys = 731.96 CPU)
DEBUG: Output 'binary' reports will be written to ras-rds-1-pgbadger-12.bin
LOG: Ok, generating binary report...
DEBUG: building reports took:  1 wallclock secs ( 0.18 usr +  0.01 sys =  0.19 CPU)
DEBUG: the total execution time took: 740 wallclock secs ( 0.81 usr  0.08 sys + 726.83 cusr  4.43 csys = 732.15 CPU)
darold commented 1 year ago

Good to know, that was the fix. Thanks.

phlaluna commented 1 year ago

No, that was not the fix. I want to create a report from the read database. I'm informing that pgbadger is correctly creating the report for the writer (master) database instance but not creating the report for the reader (slave) database instance.

darold commented 1 year ago

Ah ok, sorry I have not understood that your assertion "I tested it with the master database log (writer), it worked correctly." was only for one server.

If possible send to my private email gilles AT darold DOT net the log file from the secondary (readonly) database instance otherwise I will not able to reproduce the problem.

Thanks.

phlaluna commented 1 year ago

Resolved: The reason is commit 9d6a627a introduced in v11.7 that add storage the bind/prepare queries. Before that the statistics of bind/prepare was lost when using binary or incremental output.

Thanks a lot Gilles!