darold / pgbadger

A fast PostgreSQL Log Analyzer
http://pgbadger.darold.net/
PostgreSQL License

can not load incompatible binary data, binary file is from version < 4.0. #770

Open bagafoot opened 1 year ago

bagafoot commented 1 year ago

Hello, I'm stuck on this problem. What am I doing wrong?

pgbadger version 12.1 (installed from source)
log_destination: csvlog
log_line_prefix: t=%m p=%p a=%a h=%h db=%d u=%u x=%x
log_filename: test_pg12db01-%Y.%m.%d.log

Parse command:

pgbadger --prefix "t=%m p=%p a=%a h=%h db=%d u=%u x=%x " -v -a 1 -f csv -s 2 -T "test58" -q -I -O /var/lib/pgsql/12/pgbadger/ --exclude-query="^(COPY|refresh materialized VIEW|VACUUM)" -o test58.html /pg01/mounts/pg12/pg_log/log/test_pg12db-$(date --date='yesterday' +'%Y.%m.%d' ).csv.gz

Max. recursion depth with nested structures exceeded at /usr/lib64/perl5/vendor_perl/Storable.pm line 328, <$lfile> line 723144, at /bin/pgbadger line 15472.
DEBUG: Reading temporary last parsed line from /tmp/tmp_LAST_PARSED
DEBUG: the log statistics gathering took:42 wallclock secs ( 0.00 usr  0.00 sys + 41.26 cusr  0.37 csys = 41.63 CPU)
DEBUG: Building incremental report for 2023-03-20
can not load incompatible binary data, binary file is from version < 4.0.
can not load incompatible binary data, binary file is from version < 4.0.
LOG: Ok, generating HTML daily report into /var/lib/pgsql/12/pgbadger//2023/03/20/...
can not load incompatible binary data, binary file is from version < 4.0.
can not load incompatible binary data, binary file is from version < 4.0.
LOG: Ok, generating HTML weekly report into /var/lib/pgsql/12/pgbadger//2023/week-13/...
Wide character in print at /bin/pgbadger line 11634.
Wide character in print at /bin/pgbadger line 11668.
Wide character in print at /bin/pgbadger line 11668.
Wide character in print at /bin/pgbadger line 11668.
Wide character in print at /bin/pgbadger line 11668.
Wide character in print at /bin/pgbadger line 11668.
Wide character in print at /bin/pgbadger line 11668.
Wide character in print at /bin/pgbadger line 11732.
Wide character in print at /bin/pgbadger line 11732.
Wide character in print at /bin/pgbadger line 11732.
Wide character in print at /bin/pgbadger line 11732.
Wide character in print at /bin/pgbadger line 11732.
Wide character in print at /bin/pgbadger line 11732.
Wide character in print at /bin/pgbadger line 11732.
Wide character in print at /bin/pgbadger line 11732.
Wide character in print at /bin/pgbadger line 11732.
Wide character in print at /bin/pgbadger line 11732.
Wide character in print at /bin/pgbadger line 11732.
Wide character in print at /bin/pgbadger line 11732.
Wide character in print at /bin/pgbadger line 12272.
Wide character in print at /bin/pgbadger line 12272.
Wide character in print at /bin/pgbadger line 12272.
Wide character in print at /bin/pgbadger line 12272.
LOG: Ok, generating global index to access incremental reports...
DEBUG: writing global index into /var/lib/pgsql/12/pgbadger//index.html
DEBUG: building reports took:  9 wallclock secs ( 8.70 usr +  0.12 sys =  8.82 CPU)
DEBUG: the total execution time took: 51 wallclock secs ( 8.70 usr  0.12 sys + 41.26 cusr  0.37 csys = 50.45 CPU)
darold commented 1 year ago

Hi,

Has the content of the /var/lib/pgsql/12/pgbadger/ directory been copied from another server before running pgbadger? If not, can you report the result of the following command:

find  /var/lib/pgsql/12/pgbadger/2023/03/ -name '*.bin' | xargs -i ls -la {}
bagafoot commented 1 year ago

No, but I changed these settings:

log_destination: csvlog
log_line_prefix: t=%m p=%p a=%a h=%h db=%d u=%u x=%x
log_filename: test_pg12db01-%Y.%m.%d.log

[postgres@POSTGRESQL06F001 ~]$ find  /var/lib/pgsql/12/pgbadger/2023/03/ -name '*.bin' | xargs -i ls -la {}
-rw-r--r--. 1 postgres postgres 4135073 Mar 20 00:01 /var/lib/pgsql/12/pgbadger/2023/03/19/2023-03-19-2310819.bin
-rw-r--r--. 1 postgres postgres 2526 Mar 20 00:01 /var/lib/pgsql/12/pgbadger/2023/03/20/2023-03-20-2310819.bin
-rw-r--r--. 1 postgres postgres 16301062 Mar 21 00:01 /var/lib/pgsql/12/pgbadger/2023/03/20/2023-03-20-314234.bin
-rw-r--r--. 1 postgres postgres 18134890 Mar 21 14:46 /var/lib/pgsql/12/pgbadger/2023/03/20/2023-03-20-3292169.bin
-rw-r--r--. 1 postgres postgres 13774076 Mar 21 14:54 /var/lib/pgsql/12/pgbadger/2023/03/20/2023-03-20-3314737.bin
-rw-r--r--. 1 postgres postgres 17419948 Mar 21 15:21 /var/lib/pgsql/12/pgbadger/2023/03/20/2023-03-20-3382075.bin
-rw-r--r--. 1 postgres postgres 16464563 Mar 21 15:36 /var/lib/pgsql/12/pgbadger/2023/03/20/2023-03-20-3493256.bin
-rw-r--r--. 1 postgres postgres 18569315 Mar 23 11:18 /var/lib/pgsql/12/pgbadger/2023/03/22/2023-03-22-2941647.bin
-rw-r--r--. 1 postgres postgres 14461883 Mar 23 11:29 /var/lib/pgsql/12/pgbadger/2023/03/22/2023-03-22-2975206.bin
-rw-r--r--. 1 postgres postgres 18564003 Mar 23 11:35 /var/lib/pgsql/12/pgbadger/2023/03/22/2023-03-22-2997205.bin
-rw-r--r--. 1 postgres postgres 12738676 Mar 23 11:40 /var/lib/pgsql/12/pgbadger/2023/03/22/2023-03-22-3017334.bin
bagafoot commented 1 year ago

I removed all files from the pgbadger directory, but I am still getting the same error, and it generates a completely empty HTML report.

pgbadger -f csv -I -v -a 1 -s 2 -T "test58" -O /var/lib/pgsql/12/pgbadger --exclude-query="^(COPY|refresh materialized VIEW|VACUUM)" -o cbs58.html /pg01/mounts/pg12/pg_log/log/test_pg12db01-2023.03.20.csv.gz

DEBUG: pgBadger version 12.0.
DEBUG: Output 'html' reports will be written to /var/lib/pgsql/12/pgbadger/cbs58.html
DEBUG: Looking for file size using command: gunzip -l "/pg01/mounts/pg12/pg_log/log/test_pg12db01-2023.03.20.csv.gz" | grep -E '^\s*[0-9]+' | awk '{print $2}'
DEBUG: Starting progressbar writer process
DEBUG: pgBadger will use log format csv to parse /pg01/mounts/pg12/pg_log/log/test_pg12db01-2023.03.20.csv.gz.
DEBUG: timezone not specified, using -10800 seconds
DEBUG: Processing log file: /pg01/mounts/pg12/pg_log/log/test_pg12db01-2023.03.20.csv.gz
DEBUG: Compressed log file, will use command: gunzip -c "/pg01/mounts/pg12/pg_log/log/test_pg12db01-2023.03.20.csv.gz"
DEBUG: Starting reading file "/pg01/mounts/pg12/pg_log/log/test_pg12db01-2023.03.20.csv.gz"...
Max. recursion depth with nested structures exceeded at /usr/lib64/perl5/vendor_perl/Storable.pm line 328, <$lfile> line 723144, at /usr/local/bin/pgbadger line 15695.
DEBUG: Reading temporary last parsed line from /tmp/tmp_LAST_PARSED

DEBUG: the log statistics gathering took:71 wallclock secs ( 0.00 usr  0.01 sys + 69.36 cusr  0.27 csys = 69.64 CPU)
DEBUG: Building incremental report for 2023-03-20
can not load incompatible binary data, binary file is from version < 4.0.
LOG: Ok, generating HTML daily report into /var/lib/pgsql/12/pgbadger/2023/03/20/...
can not load incompatible binary data, binary file is from version < 4.0.
LOG: Ok, generating HTML weekly report into /var/lib/pgsql/12/pgbadger/2023/week-13/...
LOG: Ok, generating global index to access incremental reports...
DEBUG: writing global index into /var/lib/pgsql/12/pgbadger/index.html
DEBUG: building reports took:  0 wallclock secs ( 0.09 usr +  0.03 sys =  0.12 CPU)
DEBUG: the total execution time took: 71 wallclock secs ( 0.09 usr  0.04 sys + 69.36 cusr  0.27 csys = 69.76 CPU)
darold commented 1 year ago

I think the binary file generated for day 2023-03-20 was created with some corruption: /var/lib/pgsql/12/pgbadger/2023/03/20/2023-03-20-2310819.bin. If you skip this day in the pgbadger analysis, you should not encounter the problem again.

If possible, send the file /var/lib/pgsql/12/pgbadger/2023/03/20/2023-03-20-2310819.bin to my private email (gilles AT darold DOT net) and I will try to find the cause.

harukat commented 2 months ago

Hello,

I have encountered a similar problem. If I generate a large binary file (>500 MB) from a large log file (>1.5 GB) and try to generate a report from that binary file, it fails with the same error. I was able to work around the problem by splitting the original log file to create smaller binary files (about 300 MB each) and then generating a report from the multiple binary files.

Regarding binary file generation, I would like the documentation to clearly state the maximum size limit that can be safely used.
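For reference, the workaround described above can be sketched roughly as follows. This is a hypothetical example, not the exact commands used; the file name, chunk size, and options are illustrative, and the pgbadger steps assume pgbadger is in PATH:

```shell
# Split the large log at a line boundary so each pgbadger binary file
# stays smaller. File names and chunk size here are illustrative.
LOG=postgresql-2024-04-25_04.log

# Split into numbered chunks of 1,500,000 lines each
# (produces $LOG.part-00, $LOG.part-01, ...):
split -l 1500000 -d "$LOG" "$LOG.part-"

# Then (requires pgbadger installed; reuse the parsing options from the
# failing run):
#   for part in "$LOG".part-*; do
#       pgbadger -f stderr --maxlength 50000 -o "$part.bin" "$part"
#   done
#   pgbadger "$LOG".part-*.bin -o pgbadger_2024-04-25-04.html
```

pgbadger accepts multiple binary files as input, so the final report still covers the whole period even though the parsing was done in chunks.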

darold commented 2 months ago

From https://perldoc.perl.org/Storable

huge objects

    On 64bit systems some data structures may exceed the 2G (i.e. I32_MAX) limit. On 32bit systems also strings between I32 and U32 (2G-4G). Since Storable 3.00 (not in perl5 core) we are able to store and retrieve these objects, even if perl5 itself is not able to handle them. These are strings longer then 4G, arrays with more then 2G elements and hashes with more then 2G elements. cperl forbids hashes with more than 2G elements, but this fail in cperl then. perl5 itself at least until 5.26 allows it, but cannot iterate over them. Note that creating those objects might cause out of memory exceptions by the operating system before perl has a chance to abort.

I don't know whether you are hitting this case. What was the size of the original PostgreSQL log file, and how many days of logs does it contain?

harukat commented 2 months ago

The text log file I have for the failing case contains logs from 2024-04-25 04:00:05.233 to 2024-04-25 05:00:00.010 and is 3 M lines / 1.7 GB. Splitting it into two parts, the first 1.5 M lines (851 MB) and the rest, produced correct binary files of 331 MB and 289 MB, respectively. When generating the binary file, --exclude-query filtering is applied and --maxlength is 50000. And $Storable::VERSION is 3.25.

(updated)

darold commented 2 months ago

I guess this is a one-day log file. I regularly process log files of 10 GB and more without any issue, so I think the size is not the real source of the problem. Please post the pgbadger command line you are using and one or two lines of your log file.

harukat commented 2 months ago

Sorry, I made a copy-and-paste mistake in the previous post. The correct end time is 2024-04-25 05:00:00.010; it is a one-hour log file. I have also updated the previous post.

harukat commented 2 months ago

Here are the first log line, the last log line, and the pgbadger (v12.4) command line. Since these are production-system SQL statements, the identifiers have been rewritten.

2024-04-25 04:00:05.233 JST [2951961]: [4-1] db=LDXXX0101,user=XXX_ZZZKSxxx,AppName=[unknown],client=10.1.2.30 [0] LOG:  duration: 0.127 ms  parse <unnamed>: SET extra_float_digits = 3
2024-04-25 05:00:00.010 JST [2951845]: [1495975-1] db=XXXXA0201,user=XXX_ZZZZZkQw,AppName=PostgreSQL JDBC Driver,client=10.1.2.24 [244542076] LOG:  duration: 14.513 ms  execute <unnamed>: SELECT A.xxxx_id, A.name, B.ttttt_3 as ttttt, C.sssss_data, C.sssss_xxxxx_data, COALESCE(C.option,0) AS option, COALESCE(C.xxxxxx2,0) AS xxxxxx2 FROM tbl_2dx31_xx_data A, tbl_2dx31_xxxxx_dp_ttttt B, tbl_2dx31_xxxxx_28th_dp_sssss C WHERE A.iidx_id=B.xxxx_id AND A.xxxx_id=C.xxxx_id AND A.max_xxxxx_id_dp=16 AND B.xxxxx_id=28082 AND B.xxxxx_id=C.xxxxx_id AND B.xxxxx_3=2579.0 AND C.class_id=8
/usr/local/bin/pgbadger \
  --exclude-query="^(SELECT 1|COPY|BEGIN|ROLLBACK|COMMIT)" \
  --prefix="%m [%p]: [%l-1] db=%d,user=%u,AppName=%a,client=%h [%x]" \
  -j 8 \
  --no-progressbar \
  -q \
  -O ./log \
  -o pgbadger_2024-04-25-04.bin \
  -f stderr \
  -Z +9 \
  --maxlength 50000 \
  --verbose \
  --exclude-appname "psql,pgBackRest,vacuumdb" \
  --exclude-db "template0,template1,postgres" \
  --include-time "2024-04-25 04.*" \
  ./log/postgresql-2024-04-25_04.log 2> ./log/pgbadger.log
log_line_prefix = '%m [%p]: [%l-1] db=%d,user=%u,AppName=%a,client=%h [%x] '    
darold commented 2 months ago

Well, I don't see any reason why you would have this problem, and I cannot reproduce it. If by chance you can send a link to download the log file to my private email (gilles AT darold DOT net), I will be able to see what happens. Otherwise I'm afraid I cannot do more on this issue except wait for it to happen to me.

harukat commented 1 month ago

Thank you for your research. We will talk to the person in charge of the production system about providing the log file, and will also try to create an artificial reproduction case.