weird number interpretation

sandroden commented 1 month ago

I'm starting using pg_filedump and it's possible what I report is out of my ignorance rather than a bug, sorry in that case.

I have a simple db, single table, that is a stripped version of real data:

CREATE TABLE public."rdebug" (
    job_code text,
    cons text,
    data date,
    tipo text,
    qta numeric(10,2),
    unita text,
    p_a text,
    descr text,
    cambio double precision,
    att_code text,
    spunta boolean,
    id integer NOT NULL,
    giorno_id integer,
    phase integer
);

COPY public."rdebug" (job_code, cons, data, tipo, qta, unita, p_a, descr, cambio, att_code, spunta, id, giorno_id, phase) FROM stdin;
SW/FOSS sd  2000-01-31  O   3.00    h   \N  \N  1       t   274 463 \N
SW/FOSS sd  2000-01-31  O   3.00    h   \N  \N  1       t   274 463 \N
SW/FOSS sd  2007-01-10  O   3.00    h   \N  sito    99989   code    \N  3783    2045    \N
SW/FOSS \N  \N  \N  8.00    \N  \N  sqlkit  \N  \N  \N  4962    2317    \N
\.

I inspect if with pg_filedump:

root@bluffx:/rescue/postgres# pg_filedump -D "varchar,varchar,date,varchar,numeric,varchar,varchar,varchar,numeric,varchar,bool,int,int,int" 16/dbg/base/5/16384

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: 16/dbg/base/5/16384
* Options used: -D 
*******************************************************************

Block    0 ********************************************************
<Header> -----
 Block Offset: 0x00000000         Offsets: Lower      40 (0x0028)
 Block: Size 8192  Version    4            Upper    7864 (0x1eb8)
 LSN:  logid      0 recoff 0x015094d0      Special  8192 (0x2000)
 Items:    4                      Free Space: 7824
 Checksum: 0x0000  Prune XID: 0x00000000  Flags: 0x0000 ()
 Length (including item array): 40

<Data> -----
 Item   1 -- Length:   84  Offset: 8104 (0x1fa8)  Flags: NORMAL
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS sd  2000-01-31  O   3.00    h   \N  \N
 Item   2 -- Length:   84  Offset: 8016 (0x1f50)  Flags: NORMAL
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS sd  2000-01-31  O   3.E9    h   \N  \N
 Item   3 -- Length:   88  Offset: 7928 (0x1ef8)  Flags: NORMAL
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS sd  2007-01-10  O   3.E9    h   \N  sito
 Item   4 -- Length:   60  Offset: 7864 (0x1eb8)  Flags: NORMAL
COPY: SW/FOSS   \N  \N  \N  8.E9    \N  \N  sqlkit  \N  \N  \N  4962    2317    \N

*** End of File Encountered. Last Block Read: 0 ***

quantity ( qta numeric(10,2),) should be 3.00 in the first 3 rows but I read 3.00, 3.E9, 3.E9

if I rerun it w/o any modification I get that number incremented:

root@bluffx:/rescue/postgres# pg_filedump -D "varchar,varchar,date,varchar,numeric,varchar,varchar,varchar,numeric,varchar,bool,int,int,int" 16/dbg/base/5/16384|grep SW/
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS sd  2000-01-31  O   3.00    h   \N  \N
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS sd  2000-01-31  O   3.I2    h   \N  \N
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS sd  2007-01-10  O   3.I2    h   \N  sito
COPY: SW/FOSS   \N  \N  \N  8.I2    \N  \N  sqlkit  \N  \N  \N  4962    2317    \N
root@bluffx:/rescue/postgres# pg_filedump -D "varchar,varchar,date,varchar,numeric,varchar,varchar,varchar,numeric,varchar,bool,int,int,int" 16/dbg/base/5/16384|grep SW/
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS sd  2000-01-31  O   3.00    h   \N  \N
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS sd  2000-01-31  O   3.F0    h   \N  \N
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS sd  2007-01-10  O   3.F0    h   \N  sito
COPY: SW/FOSS   \N  \N  \N  8.F0    \N  \N  sqlkit  \N  \N  \N  4962    2317    \N

Am I wrong in creating the attr list? What is the tuple that is not decoding correctly?

When I'll be able to get consistent outputs I should apply it to the real data that was dump with pg12, should I re-compile it with headers from PostgreSQL 12 or can I just use the pg_filedump I compiled on mu Ubuntu 24:04 with headers form pg16?

Thanks for any hints

df7cb commented 1 month ago

Confirmed, there is something wrong in CopyAppendNumeric().

df7cb commented 1 month ago

Since the on-disk format didn't change for decades (or else pg_upgrade wouldn't work), you can use the 16 pg_filedump also with version 12. The bug will also be there, though.

sandroden commented 1 month ago

Thanks @df7cb, is this something will be addresses in a reasonable time or I'm better off not relying on this fix? Please don't take me wrong, this is just to plan my strategies (or surrender... :-)

df7cb commented 1 month ago

I'll be looking at it, but I can't promise any deadlines.

GetsuDer commented 2 weeks ago

Probably worth mentioning that "double precision" type != "numeric" type - although in docs (https://www.postgresql.org/docs/current/datatype-numeric.html) they both are in category Numeric. As for now, pg_filedump does not mention double precision format in the list of supported types. (Although the value changing problem still persists if using "~" to skip the rest of the tuple. Hadn't figure out why yet.)

df7cb / pg_filedump

weird number interpretation #31