darold / pgbadger

A fast PostgreSQL Log Analyzer
http://pgbadger.darold.net/
PostgreSQL License
3.51k stars 349 forks source link

Problem with incremental reports #697

Closed hansgv closed 2 years ago

hansgv commented 2 years ago

I'm struggling with not being able to produce incremental reports. It's quite possible I have misunderstood something instead of this being a bug. In any case, I'm stuck and would appreciate help very much.

First of all, I ingest Postgres logs from RDS hourly and store them in 1 hour increment files (e.g. postgresql.log.2021-10-07-16, postgresql.log.2021-10-07-17, ...)

I process the logs with pgbadger also hourly using the incremental mode:

pgbadger --dbname mydb --format rds --incremental --extra-files --outdir /my-pgbadger-reports --html-outdir /my-pgbadger-reports /my-hourly-postgreslogs/postgresql.log*

In the initial pgbadger run, everything works as expected. All .bin files and HTML files get generated into /my-pgbadger-reports directory and its year/month/week/day subdirectories. I also get a /my-pgbadger-reports/LAST_PARSED which correctly includes the last processed log line.

After the initial pgbadger run I archive the processed logs from /my-postgreslogs before running pgbadger again. Hence, in subsequent pgbadger runs I only process logs ingested since the initial run.

In the subsequent runs pgbadger generates new .bin files of the newly processed logs, regenerates the global index HTML file, and updates the LAST_PARSED file correctly. However, it doesn't re-generate the index file of the ongoing day/week or generate new HTML files for subsequent days/weeks. Basically, HTML file generation stops at the initial run.

Subsequently executing:

pgbadget -X -I -O --rebuild /my-pgbadger-reports

Appears to re-generate the weekly HTML files but not daily.

darold commented 2 years ago

Please use latest development code, there was a bug with incremental mode and filter on database. It has been fixed in commit 489ac52. Or if you prefer to wait for an official release I will push a new one in the next few days.

hansgv commented 2 years ago

@darold I forgot to mention I was using 11.5.

I built and installed 11.6 from the latest master branch and retried my above process. I don't see any difference in results.

Here's an example output from the initial run:

Parsed 954880389 bytes of 954880389 (100.00%), queries: 77567, events: 2733
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/09/20/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/09/21/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/09/22/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/09/23/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/09/24/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/09/25/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/09/26/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/09/27/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/09/28/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/09/29/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/09/30/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/10/01/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/10/02/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/10/03/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/10/04/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/10/05/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/10/06/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/10/07/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/10/08/...
LOG: Ok, generating HTML weekly report into /my-pgbadger-reports/2021/week-38/...
LOG: Ok, generating HTML weekly report into /my-pgbadger-reports/2021/week-39/...
LOG: Ok, generating HTML weekly report into /my-pgbadger-reports/2021/week-40/...
LOG: Ok, generating global index to access incremental reports...`

..and one of a subsequent run an hour later:

Parsed 1377396 bytes of 1377396 (100.00%), queries: 142, events: 0
LOG: Ok, generating global index to access incremental reports...`

I assume /2021/10/08/index.html and /2021/week-40/index.html should have gotten re-generated as well.

hansgv commented 2 years ago

I discovered something interesting today, perhaps related. There appears to be a difference between how pgbadger executes in shell and cron. For some reason, when executed by cron, no HTML files are generated even when the outdir and report directory are initially empty.

darold commented 2 years ago

I don't see any reason outside that the cron line is not well interpreted. Can you post you pgbadger cron line too?

hansgv commented 2 years ago

@darold My cron line runs the same shell script I can successfully run manually from the shell. Running it from cron produces the .bin files and the global HTML index file but not the daily/weekly HTML files.

# 50 4 * * * /home/me/pgbadger/temp_pgbadger.sh

#! /bin/bash
ionice -c3 pgbadger \
--dbname k2 \
--jobs 2 \
--format rds \
--dns-resolv \
--incremental \
--extra-files \
--exclude-user myadmin \
--outdir '/my-pgbadger-reports' \
--html-outdir '/my-pgbadger-reports' \
/my-hourly-postgreslogs/postgresql.log*
darold commented 2 years ago

I can not reproduce the issue. You should check the result of the cron execution, file /home/me/pgbadger/temp_pgbadger.sh is probably not executable (see /var/spool/mail/me if cron has sent you a local email with the error message). I had this this problem and checking my local email cron was reporting error: /bin/sh: 1: /home/git/pgbadger/cron_pgbadger.sh: Permission denied

Closing.

hansgv commented 2 years ago

@darold My shell script is executable by cron (run as root). Cron execution doesn't produce errors. When executed by cron, pgbadger produces its usual output but generates only the global HTML index file. The daily/weekly HTML files are not generated when running the script over cron but they are generated when running the same script direct from shell.

darold commented 2 years ago

You should use latest development code, the issue is not reproducible. to download latest code use:

wget https://github.com/darold/pgbadger/archive/refs/heads/master.zip

or use git clone

hansgv commented 2 years ago

@darold First make sure I have the latest code:

wget https://github.com/darold/pgbadger/archive/refs/heads/master.zip
unzip master-zip
cd pgbadger-master
perl Makefile.PL
make
make install

Then run my whole script to start pgbadger report generation from scratch:

cp /my-hourly-postgreslogs/error/compressed/error/*.gz /my-hourly-postgreslogs/error/ && echo "Previously archived logs copied to working directory"
gunzip /my-hourly-postgreslogs/error/*.gz && echo "Archived logs in working directory gunzipped"
rm -R /my-pgbadger-reports/* && echo "Reports directory cleared"
pgbadger \
--dbname k2 \
--jobs 2 \
--format rds \
--dns-resolv \
--incremental \
--extra-files \
--exclude-user mysuperuser \
--outdir '/my-pgbadger-reports' \
--html-outdir '/my-pgbadger-reports' \
/my-hourly-postgreslog/error/postgresql.log*

Pgbadger output produced, when script executed by cron:

[========================>] Parsed 713254462 bytes of 713254462 (100.00%), queries: 70023, events: 1673
LOG: Ok, generating global index to access incremental reports...

HTML files generated, when script executed by cron:

find /my-pgbadger-reports -name "*.html"

-> No files

Pgbadger output produced, when script executed from bash shell (using the exact same postgres logs):

[========================>] Parsed 713254462 bytes of 713254462 (100.00%), queries: 61776, events: 1673
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/03/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/04/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/05/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/06/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/07/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/08/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/09/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/10/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/11/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/12/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/13/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/14/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/15/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/16/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/17/...
LOG: Ok, generating HTML daily report into /my-pgbadger-reports/2021/11/18/...
LOG: Ok, generating HTML weekly report into /my-pgbadger-reports/2021/week-45/...
LOG: Ok, generating HTML weekly report into /my-pgbadger-reports/2021/week-46/...
LOG: Ok, generating HTML weekly report into /my-pgbadger-reports/2021/week-47/...
LOG: Ok, generating global index to access incremental reports...

HTML files generated when script executed from shell:

/my-pgbadger-reports/index.html
/my-pgbadger-reports/2021/week-45/index.html
/my-pgbadger-reports/2021/11/05/index.html
/my-pgbadger-reports/2021/11/09/index.html
/my-pgbadger-reports/2021/11/03/index.html
/my-pgbadger-reports/2021/11/08/index.html
/my-pgbadger-reports/2021/11/11/index.html
/my-pgbadger-reports/2021/11/04/index.html
/my-pgbadger-reports/2021/11/15/index.html
/my-pgbadger-reports/2021/11/12/index.html
/my-pgbadger-reports/2021/11/07/index.html
/my-pgbadger-reports/2021/11/06/index.html
/my-pgbadger-reports/2021/11/13/index.html
/my-pgbadger-reports/2021/11/10/index.html
/my-pgbadger-reports/2021/11/14/index.html
/my-pgbadger-reports/2021/11/18/index.html
/my-pgbadger-reports/2021/11/16/index.html
/my-pgbadger-reports/2021/11/17/index.html
/my-pgbadger-reports/2021/week-46/index.html
/my-pgbadger-reports/2021/week-47/index.html
hansgv commented 2 years ago

@darold I finally figured out why I got different results when executing pgbadger from shell and via cron.

While the default shell in Ubuntu is bash the default shell for cron is sh. Apparently that does not play well when executing pgbadger.

Changing my crontab to /bin/bash -c "/my/home/my_pgbadger_script.sh" fixed it. :)