darold / pgbadger

A fast PostgreSQL Log Analyzer
http://pgbadger.darold.net/
PostgreSQL License
3.49k stars 349 forks source link

pgbouncer parsing does not work from STDIN #748

Closed aleszeleny closed 1 year ago

aleszeleny commented 2 years ago

Testes on the master branch commit 2a99d51d149d3aa82dcc598ce1e49847b993e443:

Parsing file works like a charm (pgBouncer pgbouncer-1.12.0-1.rhel7.x86_64 - format pgbopuncer3 correctly detected):

zelenya@alesnbbs:~/var/pgBadger$ pgbadger/pgbadger -O /home/zelenya/var/pgBadger/rep -v pgb_work/pgb.172.25.0.231.pgbouncer.log.7.gz
DEBUG: pgBadger version 11.8.
DEBUG: Output 'html' reports will be written to out.html
DEBUG: Looking for file size using command: gunzip -l "pgb_work/pgb.172.25.0.231.pgbouncer.log.7.gz" | grep -E '^\s*[0-9]+' | awk '{print $2}'
DEBUG: Starting progressbar writer process
DEBUG: Compressed log file, will use command: gunzip -c "pgb_work/pgb.172.25.0.231.pgbouncer.log.7.gz"
DEBUG: Autodetected log format 'pgbouncer3' from pgb_work/pgb.172.25.0.231.pgbouncer.log.7.gz
DEBUG: pgBadger will use log format pgbouncer3 to parse pgb_work/pgb.172.25.0.231.pgbouncer.log.7.gz.
DEBUG: timezone not specified, using -7200 seconds
DEBUG: Processing log file: pgb_work/pgb.172.25.0.231.pgbouncer.log.7.gz
DEBUG: Compressed log file, will use command: gunzip -c "pgb_work/pgb.172.25.0.231.pgbouncer.log.7.gz"
DEBUG: Starting reading file "pgb_work/pgb.172.25.0.231.pgbouncer.log.7.gz"...
DEBUG: Start parsing pgbouncer log at offset 0 of file pgb_work/pgb.172.25.0.231.pgbouncer.log.7.gz to 101713701
DEBUG: Unknown pgbouncer line format: 2022-08-15 03:25:01.933 CEST [1078] LOG got SIGHUP, re-reading config
[========================>] Parsed 101713701 bytes of 101713701 (100.00%), stats: -45706, events: 0
DEBUG: the log statistics gathering took:13 wallclock secs ( 0.21 usr  0.03 sys + 12.54 cusr  0.26 csys = 13.04 CPU)
DEBUG: Output 'html' reports will be written to out.html
LOG: Ok, generating html report...
DEBUG: building reports took:  0 wallclock secs ( 0.38 usr +  0.00 sys =  0.38 CPU)
DEBUG: the total execution time took: 13 wallclock secs ( 0.59 usr  0.03 sys + 12.54 cusr  0.26 csys = 13.42 CPU)

From STDIN with specified format option, it reports "queries" instead of "stats" and nothing is parsed:

zelenya@alesnbbs:~/var/pgBadger$ gzip -dc pgb_work/pgb.172.25.0.231.pgbouncer.log.7.gz | pgbadger/pgbadger --format pgbadger3 -O /home/zelenya/var/pgBadger/rep -v -
DEBUG: pgBadger version 11.8.
DEBUG: Output 'html' reports will be written to out.html
DEBUG: Can not autodetect log format, assuming pgbadger3.
DEBUG: timezone not specified, using -7200 seconds
DEBUG: Starting progressbar writer process
DEBUG: Processing log file: -
DEBUG: Starting reading file "-"...
DEBUG: Start parsing postgresql log at offset 0 of file "-" to -1
[+] Parsed 101713701 bytes, queries: 0, events: 0
DEBUG: the log statistics gathering took: 2 wallclock secs ( 0.00 usr  0.00 sys +  0.88 cusr  0.01 csys =  0.89 CPU)
DEBUG: Output 'html' reports will be written to out.html
LOG: Ok, generating html report...
DEBUG: building reports took:  0 wallclock secs ( 0.00 usr +  0.01 sys =  0.01 CPU)
DEBUG: the total execution time took:  2 wallclock secs ( 0.00 usr  0.01 sys +  0.88 cusr  0.01 csys =  0.90 CPU)

STDIN with autodetect behaves similar way except it assumes the format is stderr:

zelenya@alesnbbs:~/var/pgBadger$ gzip -dc pgb_work/pgb.172.25.0.231.pgbouncer.log.7.gz | pgbadger/pgbadger -O /home/zelenya/var/pgBadger/rep -v -
DEBUG: pgBadger version 11.8.
DEBUG: Output 'html' reports will be written to out.html
DEBUG: Can not autodetect log format, assuming stderr.
DEBUG: timezone not specified, using -7200 seconds
DEBUG: Starting progressbar writer process
DEBUG: Processing log file: -
DEBUG: Starting reading file "-"...
DEBUG: Start parsing postgresql log at offset 0 of file "-" to -1
[+] Parsed 101713701 bytes, queries: 0, events: 0
DEBUG: the log statistics gathering took: 7 wallclock secs ( 0.00 usr  0.01 sys +  5.29 cusr  0.04 csys =  5.34 CPU)
DEBUG: Output 'html' reports will be written to out.html
LOG: Ok, generating html report...
DEBUG: building reports took:  0 wallclock secs ( 0.00 usr +  0.00 sys =  0.00 CPU)
DEBUG: the total execution time took:  7 wallclock secs ( 0.00 usr  0.01 sys +  5.29 cusr  0.04 csys =  5.34 CPU)
darold commented 2 years ago

In your second example there is a type, you use --format pgbadger3 how about if you use --format pgbouncer3?

aleszeleny commented 2 years ago

Oh, sorry for the mistake, with an explicitly defined format it works even from pipe:

zelenya@alesnbbs:~/var/pgBadger$ gzip -dc pgb_work/pgb.172.25.0.231.pgbouncer.log.1.gz | pgbadger/pgbadger --format pgbouncer3 -v -
DEBUG: pgBadger version 11.8.
DEBUG: Output 'html' reports will be written to out.html
DEBUG: Can not autodetect log format, assuming pgbouncer3.
DEBUG: Starting progressbar writer process
DEBUG: timezone not specified, using -7200 seconds
DEBUG: Processing log file: -
DEBUG: Starting reading file "-"...
DEBUG: Start parsing pgbouncer log at offset 0 of file - to -1
DEBUG: Unknown pgbouncer line format: 2022-08-22 03:33:01.802 CEST [1078] LOG got SIGHUP, re-reading config
DEBUG: Unknown pgbouncer line format: 2022-08-22 14:39:46.619 CEST [1078] LOG got SIGINT, shutting down
DEBUG: Unknown pgbouncer line format: 2022-08-22 14:44:46.713 CEST [10253] LOG listening on 0.0.0.0:5432
DEBUG: Unknown pgbouncer line format: 2022-08-22 14:44:46.713 CEST [10253] LOG listening on unix:/tmp/.s.PGSQL.5432
[+] Parsed 253080038 bytes, stats: 986976, events: 14082
DEBUG: the log statistics gathering took:30 wallclock secs ( 0.20 usr  0.01 sys + 28.78 cusr  0.12 csys = 29.11 CPU)
DEBUG: Output 'html' reports will be written to out.html
LOG: Ok, generating html report...
DEBUG: building reports took:  0 wallclock secs ( 0.37 usr +  0.01 sys =  0.38 CPU)
DEBUG: the total execution time took: 30 wallclock secs ( 0.57 usr  0.02 sys + 28.78 cusr  0.12 csys = 29.49 CPU)

I spot a difference between default parsing - like the above and incremental - STATS counter shows negative numbers in the incremental mode, but the report is created:

zelenya@alesnbbs:~/var/pgBadger$ gzip -dc pgb_work/pgb.172.25.0.231.pgbouncer.log.1.gz | pgbadger/pgbadger --format pgbouncer3 -I -O /home/zelenya/var/pgBadger/rep -v -
DEBUG: pgBadger version 11.8.
DEBUG: Output 'html' reports will be written to out.html
DEBUG: waiting for log entries from stdin.
DEBUG: Can not autodetect log format, assuming pgbouncer3.
DEBUG: timezone not specified, using -7200 seconds
DEBUG: Starting progressbar writer process
DEBUG: Processing log file: -
DEBUG: Starting reading file "-"...
DEBUG: Start parsing pgbouncer log at offset 0 of file - to -1
DEBUG: Unknown pgbouncer line format: 2022-08-22 03:33:01.802 CEST [1078] LOG got SIGHUP, re-reading config
DEBUG: Unknown pgbouncer line format: 2022-08-22 14:39:46.619 CEST [1078] LOG got SIGINT, shutting down
DEBUG: Unknown pgbouncer line format: 2022-08-22 14:44:46.713 CEST [10253] LOG listening on 0.0.0.0:5432
DEBUG: Unknown pgbouncer line format: 2022-08-22 14:44:46.713 CEST [10253] LOG listening on unix:/tmp/.s.PGSQL.5432
DEBUG: Saving pgbouncer last parsed line into /tmp/tmp_LAST_PARSED (2022-08-23 03:22:01 253080038)
DEBUG: Reading temporary last parsed line from /tmp/tmp_LAST_PARSED

DEBUG: Saving last parsed line into /home/zelenya/var/pgBadger/rep/LAST_PARSED
DEBUG: the log statistics gathering took:34 wallclock secs ( 0.01 usr  0.00 sys + 32.41 cusr  0.17 csys = 32.59 CPU)
DEBUG: Building incremental report for 2022-08-22
LOG: Ok, generating HTML daily report into /home/zelenya/var/pgBadger/rep/2022/08/22/...
DEBUG: Building incremental report for 2022-08-23
LOG: Ok, generating HTML daily report into /home/zelenya/var/pgBadger/rep/2022/08/23/...
LOG: Ok, generating HTML weekly report into /home/zelenya/var/pgBadger/rep/2022/week-35/...
LOG: Ok, generating global index to access incremental reports...
DEBUG: writing global index into /home/zelenya/var/pgBadger/rep/index.html
DEBUG: building reports took:  1 wallclock secs ( 1.18 usr +  0.02 sys =  1.20 CPU)
DEBUG: the total execution time took: 35 wallclock secs ( 1.19 usr  0.02 sys + 32.41 cusr  0.17 csys = 33.79 CPU)

So the problem is only in autodetect pgnouncer3 (haven't tested other formats) log format from a pipe.

BTW, it might help to add some note to the readme option --format logtype description that there are more pgbouncer formats (2, 3), since when the default is used it fails (but in can be found in the code):

zelenya@alesnbbs:~/var/pgBadger$ gzip -dc pgb_work/pgb.172.25.0.231.pgbouncer.log.1.gz | pgbadger/pgbadger --format pgbouncer -I -O /home/zelenya/var/pgBadger/rep -v - |& head -20
DEBUG: pgBadger version 11.8.
DEBUG: Output 'html' reports will be written to out.html
DEBUG: Found log offset 253080038 in file /home/zelenya/var/pgBadger/rep/LAST_PARSED
DEBUG: waiting for log entries from stdin.
DEBUG: Can not autodetect log format, assuming pgbouncer.
DEBUG: timezone not specified, using -7200 seconds
DEBUG: Starting progressbar writer process
DEBUG: Processing log file: -
DEBUG: Starting reading file "-"...
DEBUG: Start parsing pgbouncer log at offset 0 of file - to -1
DEBUG: Unknown pgbouncer line format: 2022-08-22 03:33:01.802 CEST [1078] LOG got SIGHUP, re-reading config
DEBUG: Unknown pgbouncer line format: 2022-08-22 03:33:02.022 CEST [1078] LOG C-0x213c258: XXX/YYY@172.25.0.139:52336 login attempt: db=....
DEBUG: Unknown pgbouncer line format: 2022-08-22 03:33:02.027 CEST [1078] LOG C-0x2164438: XXX/YYY@172.25.0.139:52338 login attempt: db=....

Kind regards Ales Zeleny

darold commented 1 year ago

Commit 186679d fixes the negative value when parsing pgbouncer file. I have decide to not talk about pgbouncer2 or pgbouncer3 logformat in the doc because they are internal. I should review the way pgbadger can autodetect the real format even if pgbouncer format is chosen.