timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/

ERROR: insufficient data left in message on PG15 #4937

Open · sb230132 opened this issue 1 year ago

sb230132 commented 1 year ago

What type of bug is this?

Incorrect result

What subsystems and features are affected?

Query executor

What happened?

make installcheck TESTS="partialize_finalize"

The test above fails with the error "insufficient data left in message". After a COPY operation loads binary partial-aggregate data into a table, a SELECT that applies an aggregate function to the numeric partials returns this error.

TimescaleDB version affected

2.9.0

PostgreSQL version used

15.0

What operating system did you use?

Mac OS X

What installation method did you use?

Source

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

No response

How can we reproduce the bug?

-- Reproduce: load serialized numeric partials, then finalize them.
drop table t1 cascade;
create table t1 (a integer, b numeric, partialb bytea, partialc bytea, partiald bytea, partiale bytea, partialf bytea);
\COPY t1 FROM /tsl/test/sql/data/partialize_finalize_data.csv WITH CSV HEADER
-- On PG15 this fails with: ERROR: insufficient data left in message
select _timescaledb_internal.finalize_agg('sum(numeric)', null, null, null, partialb, null::numeric) sumb from t1;

sb230132 commented 1 year ago

The commit below, which first appears in PG15, introduced incompatible changes in the serialization of numeric aggregates.

commit f025f2390e13d7da69da595086fb982bbaf7f329
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Date:   Mon Jul 5 10:16:42 2021 +0100

    Prevent numeric overflows in parallel numeric aggregates.

    Formerly various numeric aggregate functions supported parallel
    aggregation by having each worker convert partial aggregate values to
    Numeric and use numeric_send() as part of serializing their state.
    That's problematic, since the range of Numeric is smaller than that of
    NumericVar, so it's possible for it to overflow (on either side of the
    decimal point) in cases that would succeed in non-parallel mode.

    Fix by serializing NumericVars instead, to avoid the overflow risk and
    ensure that parallel and non-parallel modes work the same.

    A side benefit is that this improves the efficiency of the
    serialization/deserialization code, which can make a noticeable
    difference to performance with large numbers of parallel workers.

    No back-patch due to risk from changing the binary format of the
    aggregate serialization states, as well as lack of prior field
    complaints and low probability of such overflows in practice.

    Patch by me. Thanks to David Rowley for review and performance
    testing, and Ranier Vilela for an additional suggestion.

    Discussion: https://postgr.es/m/CAEZATCUmeFWCrq2dNzZpRj5+6LfN85jYiDoqm+ucSXhb9U2TbA@mail.gmail.com

One of the comments added by the above commit states:

/*
 * numericvar_serialize - serialize NumericVar to binary format
 *
 * At variable level, no checks are performed on the weight or dscale, allowing
 * us to pass around intermediate values with higher precision than supported
 * by the numeric type.  Note: this is incompatible with numeric_send/recv(),
 * which use 16-bit integers for these fields.
 */

Because of the above change, when parsing the old binary data the length is read as 65536 bytes instead of 1. This causes the error "insufficient data left in message".

The code that reads the length in PG14:

len = (uint16) pq_getmsgint(buf, sizeof(uint16));

and in PG15:

len = pq_getmsgint(buf, sizeof(int32));
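
To make the mismatch concrete, here is a minimal standalone sketch (not PostgreSQL or TimescaleDB source; it assumes a PG14-style big-endian header whose first two 16-bit fields are ndigits = 1 and weight = 0) showing how a 16-bit length consumed by a 32-bit read becomes 65536:

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Assumed PG14-format prefix: ndigits = 1, weight = 0, big-endian. */
    const uint8_t buf[4] = {0x00, 0x01, 0x00, 0x00};

    /* PG14 reader: pq_getmsgint(buf, sizeof(uint16)) consumes 2 bytes. */
    unsigned len16 = ((unsigned) buf[0] << 8) | buf[1];

    /* PG15 reader: pq_getmsgint(buf, sizeof(int32)) consumes 4 bytes,
     * swallowing the adjacent weight field into the low half of the length. */
    unsigned len32 = ((unsigned) buf[0] << 24) | ((unsigned) buf[1] << 16) |
                     ((unsigned) buf[2] << 8) | (unsigned) buf[3];

    printf("PG14-style read: %u\n", len16); /* prints 1 */
    printf("PG15-style read: %u\n", len32); /* prints 65536 */
    return 0;
}
```

The bogus length then makes the reader try to consume far more bytes than remain in the buffer, which is exactly the "insufficient data left in message" error raised by the pqformat.c routines.
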
mfundul commented 1 year ago

One proposal is the following:

Vendor some of the PG13 numeric serialization/deserialization functions into our own timescaledb repository in the import/ folder and copy the logic of https://github.com/timescale/timescaledb/pull/4451/files

Then do a try/catch fallback in the following manner (a minimal sketch follows the list):

* first try to use the current PostgreSQL code for numeric serialization/deserialization (PG15 in this scenario)

* if there is an exception, try to use the vendored PG14 functions

* if there is an exception again, try to repair the numeric partial state on the fly as done in https://github.com/timescale/timescaledb/pull/4451/files

cc @horzsolt @fabriziomello @mkindahl
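
A minimal sketch of that fallback chain, using PostgreSQL's PG_TRY/PG_CATCH error-handling macros. The three decoder helpers are hypothetical names invented for illustration; only the control flow is the point here:

```c
#include "postgres.h"

/* Hypothetical decoders; none of these names exist in either code base. */
extern Datum pg15_numeric_deserialize(bytea *partial);
extern Datum pg14_numeric_deserialize(bytea *partial); /* vendored copy */
extern Datum repair_numeric_partial(bytea *partial);   /* as in PR #4451 */

static Datum
deserialize_partial_with_fallback(bytea *partial)
{
    MemoryContext oldcontext = CurrentMemoryContext;
    volatile Datum result = (Datum) 0;
    volatile bool decoded = false;

    /* 1. Try the running server's own format first (PG15 here). */
    PG_TRY();
    {
        result = pg15_numeric_deserialize(partial);
        decoded = true;
    }
    PG_CATCH();
    {
        MemoryContextSwitchTo(oldcontext); /* recover from the error ... */
        FlushErrorState();                 /* ... and fall through */
    }
    PG_END_TRY();

    /* 2. Fall back to the vendored PG14 decoder. */
    if (!decoded)
    {
        PG_TRY();
        {
            result = pg14_numeric_deserialize(partial);
            decoded = true;
        }
        PG_CATCH();
        {
            MemoryContextSwitchTo(oldcontext);
            FlushErrorState();
        }
        PG_END_TRY();
    }

    /* 3. Last resort: repair the partial state on the fly. */
    if (!decoded)
        result = repair_numeric_partial(partial);

    return result;
}
```

As the next comment points out, the weakness of any try-in-turn scheme is that bytes in one format may happen to decode without error under another format's reader and silently produce a wrong value.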

mkindahl commented 1 year ago

Even though this seems like a workable solution, I am slightly worried that we could accidentally generate incorrect data. In most cases we should catch problems during decoding, but the more alternative encodings there are, the higher the chance that a wrong result happens to decode successfully. Even if unlikely, the mere risk of getting back a bad result reduces trust in the system.

Didn't we add version information for the continuous aggregates in the metadata when we fixed this the last time? Can't we leverage that to ensure that we are decoding PG14 partials? When we discussed this before I proposed adding it, but I am not sure what became of it. I would have expected something like this:

| Version | Description       |
|---------|-------------------|
| NULL    | pre-PG14 partials |
| 1       | PG14 partials     |
| 1.1     | PG15 partials     |
| 2       | Finals            |
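
If such a version tag existed, the decoder could dispatch on it instead of guessing. A hypothetical sketch (the enum, its values, and the helper names are all invented; as noted in the next comment, no such tag is stored today):

```c
#include "postgres.h"

/* Hypothetical tags mirroring the table above. */
typedef enum PartialFormat
{
    PARTIAL_PRE_PG14, /* metadata version NULL */
    PARTIAL_PG14,     /* version 1 */
    PARTIAL_PG15,     /* version 1.1 */
    PARTIAL_FINAL     /* version 2: finalized values, nothing to decode */
} PartialFormat;

extern Datum pre_pg14_numeric_deserialize(bytea *partial); /* hypothetical */
extern Datum pg14_numeric_deserialize(bytea *partial);     /* hypothetical */
extern Datum pg15_numeric_deserialize(bytea *partial);     /* hypothetical */

/* Choose the right decoder directly instead of trying them in turn. */
static Datum
decode_partial(bytea *partial, PartialFormat format)
{
    switch (format)
    {
        case PARTIAL_PRE_PG14:
            return pre_pg14_numeric_deserialize(partial);
        case PARTIAL_PG14:
            return pg14_numeric_deserialize(partial);
        case PARTIAL_PG15:
            return pg15_numeric_deserialize(partial);
        case PARTIAL_FINAL:
            elog(ERROR, "final values are not partials and need no decoding");
    }
    pg_unreachable();
}
```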

If we do not have version information, a better and safer approach, if possible, would be dedicated functions that rewrite the partials from PG13 -> PG14 and from PG14 -> PG15. Then the user knows that the data was upgraded from a previous version, and there is a very low risk of making a mistake (basically, only if you forget that you already called the function, call it again, and are very unlucky).
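
For the PG14 -> PG15 step, such a rewrite would essentially widen the four 16-bit NumericVar header fields (ndigits, weight, sign, dscale) to the 32-bit fields the PG15 reader expects, leaving the 16-bit digit array untouched. A byte-level sketch under that assumption; real partials embed the NumericVar inside a larger aggregate-state blob, so this is illustrative, not a drop-in fix:

```c
#include <stdint.h>
#include <string.h>

/*
 * Illustrative only: rewrite a serialized NumericVar header from a
 * PG14-style layout (four big-endian 16-bit fields) to a PG15-style layout
 * (four big-endian 32-bit fields). Returns bytes written to out, or 0 on
 * bad input. out must have room for 16 + 2 * ndigits bytes.
 */
static size_t
widen_numericvar(const uint8_t *in, size_t in_len, uint8_t *out)
{
    size_t ndigits, i;

    if (in_len < 8)
        return 0;
    ndigits = ((size_t) in[0] << 8) | in[1];
    if (in_len < 8 + ndigits * 2)
        return 0;

    /* Header fields in order: ndigits, weight, sign, dscale. */
    for (i = 0; i < 4; i++)
    {
        uint16_t v16 = (uint16_t) (((uint16_t) in[2 * i] << 8) | in[2 * i + 1]);
        /* weight (field 1) is signed and must be sign-extended */
        uint32_t u32 = (i == 1) ? (uint32_t) (int32_t) (int16_t) v16
                                : (uint32_t) v16;

        out[4 * i + 0] = (uint8_t) (u32 >> 24);
        out[4 * i + 1] = (uint8_t) (u32 >> 16);
        out[4 * i + 2] = (uint8_t) (u32 >> 8);
        out[4 * i + 3] = (uint8_t) u32;
    }

    /* Digits are 16-bit in both formats; copy them through unchanged. */
    memcpy(out + 16, in + 8, ndigits * 2);
    return 16 + ndigits * 2;
}
```

Real aggregate partials carry more than a single NumericVar (counts, NaN flags, and for some states a second sum), so a production rewrite would have to walk the full state layout of each aggregate, per the serialize functions in numeric.c.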

mfundul commented 1 year ago

We do not have version information on the actual stored partial. In theory, a materialized hypertable could easily have 3 different versions of stored partials concurrently.

mfundul commented 1 year ago

> If we do not have version information, a better and safer approach, if possible, would be dedicated functions that rewrite the partials from PG13 -> PG14 and from PG14 -> PG15. Then the user knows that the data was upgraded from a previous version, and there is a very low risk of making a mistake (basically, only if you forget that you already called the function, call it again, and are very unlucky).

Since we don't have versioning information and we don't have logic to detect the version, this proposal still relies on luck and doesn't solve the theoretical problem.

sb230132 commented 1 year ago

The following test is related to this issue:

make installcheck TESTS="partialize_finalize"

mkindahl commented 1 year ago

> > If we do not have version information, a better and safer approach, if possible, would be dedicated functions that rewrite the partials from PG13 -> PG14 and from PG14 -> PG15. Then the user knows that the data was upgraded from a previous version, and there is a very low risk of making a mistake (basically, only if you forget that you already called the function, call it again, and are very unlucky).
>
> Since we don't have versioning information and we don't have logic to detect the version, this proposal still relies on luck and doesn't solve the theoretical problem.

If we do not have versioning information, we have to solve it this way. There is little point in adding versioning at this point, since we are going to deprecate the partial format.