Open sb230132 opened 2 years ago
The commit below, which shipped in PG15, introduced incompatible changes with respect to numeric aggregates:
```
commit f025f2390e13d7da69da595086fb982bbaf7f329
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Date:   Mon Jul 5 10:16:42 2021 +0100

    Prevent numeric overflows in parallel numeric aggregates.

    Formerly various numeric aggregate functions supported parallel
    aggregation by having each worker convert partial aggregate values to
    Numeric and use numeric_send() as part of serializing their state.
    That's problematic, since the range of Numeric is smaller than that of
    NumericVar, so it's possible for it to overflow (on either side of the
    decimal point) in cases that would succeed in non-parallel mode.

    Fix by serializing NumericVars instead, to avoid the overflow risk and
    ensure that parallel and non-parallel modes work the same.

    A side benefit is that this improves the efficiency of the
    serialization/deserialization code, which can make a noticeable
    difference to performance with large numbers of parallel workers.

    No back-patch due to risk from changing the binary format of the
    aggregate serialization states, as well as lack of prior field
    complaints and low probability of such overflows in practice.

    Patch by me. Thanks to David Rowley for review and performance
    testing, and Ranier Vilela for an additional suggestion.

    Discussion: https://postgr.es/m/CAEZATCUmeFWCrq2dNzZpRj5+6LfN85jYiDoqm+ucSXhb9U2TbA@mail.gmail.com
```
One of the comments added in the above commit states this:
```c
/*
 * numericvar_serialize - serialize NumericVar to binary format
 *
 * At variable level, no checks are performed on the weight or dscale, allowing
 * us to pass around intermediate values with higher precision than supported
 * by the numeric type. Note: this is incompatible with numeric_send/recv(),
 * which use 16-bit integers for these fields.
 */
```
Because of the above change, when PG15 parses binary partial data written in the old format, it reads a 4-byte length where only a 2-byte length was written. For example, a 16-bit length of 1 (0x0001) followed by two zero bytes belonging to the next field is consumed as the 32-bit value 0x00010000 = 65536, so the length is read as 65536 instead of 1. The oversized length then causes the error "insufficient data left in message".
Code which reads the length:

PG14:
```c
len = (uint16) pq_getmsgint(buf, sizeof(uint16));
```
PG15:
```c
len = pq_getmsgint(buf, sizeof(int32));
```
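To make the mismatch concrete, here is a minimal, self-contained C illustration (independent of the PostgreSQL sources; the byte layout is an illustrative assumption, not the exact on-disk format) of reading the same big-endian stream with a 2-byte versus a 4-byte length field:

```c
#include <stdint.h>
#include <stdio.h>

/* Read a big-endian integer of `width` bytes, mimicking what
 * pq_getmsgint() does for the length field. */
static uint32_t
read_be(const uint8_t *buf, int width)
{
    uint32_t v = 0;
    for (int i = 0; i < width; i++)
        v = (v << 8) | buf[i];
    return v;
}

int
main(void)
{
    /* Illustrative stream: a 16-bit length of 1 followed by two
     * bytes that belong to the next field. */
    const uint8_t stream[] = {0x00, 0x01, 0x00, 0x00};

    printf("2-byte read (PG14-style): %u\n", read_be(stream, 2)); /* prints 1 */
    printf("4-byte read (PG15-style): %u\n", read_be(stream, 4)); /* prints 65536 */
    return 0;
}
```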
One proposal is the following: vendor some of the PG13 numeric serialization/deserialization functions into our own timescaledb repository in the `import/` folder, and copy the logic of https://github.com/timescale/timescaledb/pull/4451/files. Then do a try/catch loop in the following manner (a sketch of this fallback chain follows the list):
* first, try to use the default current PostgreSQL code for numeric serialization/deserialization (PG15 in this scenario)
* if there is an exception, try to use the corresponding vendored PG14 functions
* if there is an exception again, try to repair the numeric partial state on the fly, as done in https://github.com/timescale/timescaledb/pull/4451/files

cc @horzsolt @fabriziomello @mkindahl
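A minimal sketch of that fallback chain, assuming hypothetical helper names (`ts_numeric_deserialize_current`, `ts_numeric_deserialize_pg14`, and `ts_numeric_repair_partial` are placeholders for the three strategies, not existing TimescaleDB functions):

```c
#include "postgres.h"
#include "lib/stringinfo.h"

/* Hypothetical helpers for this sketch -- not existing functions. */
extern Datum ts_numeric_deserialize_current(StringInfo buf); /* PG15 code path */
extern Datum ts_numeric_deserialize_pg14(StringInfo buf);    /* vendored copy */
extern Datum ts_numeric_repair_partial(StringInfo buf);      /* PR #4451 logic */

static Datum
deserialize_with_fallback(StringInfo buf)
{
    MemoryContext oldcxt = CurrentMemoryContext;
    Datum result;

    PG_TRY();
    {
        /* 1. Try the running server's (PG15) deserialization first. */
        result = ts_numeric_deserialize_current(buf);
    }
    PG_CATCH();
    {
        /* Discard the error and rewind the message before retrying. */
        MemoryContextSwitchTo(oldcxt);
        FlushErrorState();
        buf->cursor = 0;

        PG_TRY(2);
        {
            /* 2. Fall back to the vendored PG14 functions. */
            result = ts_numeric_deserialize_pg14(buf);
        }
        PG_CATCH(2);
        {
            MemoryContextSwitchTo(oldcxt);
            FlushErrorState();
            buf->cursor = 0;

            /* 3. Last resort: repair the partial state on the fly. */
            result = ts_numeric_repair_partial(buf);
        }
        PG_END_TRY(2);
    }
    PG_END_TRY();

    return result;
}
```

Note that retrying after `PG_CATCH()` requires switching back to the caller's memory context and calling `FlushErrorState()` before re-reading the buffer; otherwise the aborted error state leaks into the retry.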
Even though this seems like a workable solution, I am slightly worried that we could accidentally generate incorrect data. In most cases we should catch issues during decoding, but the more alternative encodings we have, the higher the chance that we accidentally get a decodable result that is wrong. Even if unlikely, just the fact that there is a risk of getting back a bad result reduces trust in the system.

Didn't we add version information for the continuous aggregates in the metadata when we fixed this last time? Can't we leverage that to ensure that we are decoding PG14 partials? When we discussed this before, I proposed that we add this, but I am not sure what became of it. I would have expected something like this:
| Version | Description |
|---|---|
| NULL | pre-PG14 partials |
| 1 | PG14 partials |
| 1.1 | PG15 partials |
| 2 | Finals |
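If such per-partial version metadata existed, the decoder could dispatch on it instead of guessing. A minimal sketch, where the version codes follow the table above and every function name is a hypothetical placeholder:

```c
#include "postgres.h"
#include "lib/stringinfo.h"

/* Hypothetical per-format decoders -- placeholders, not existing functions. */
extern Datum ts_numeric_deserialize_pre_pg14(StringInfo buf);
extern Datum ts_numeric_deserialize_pg14(StringInfo buf);
extern Datum ts_numeric_deserialize_pg15(StringInfo buf);

static Datum
deserialize_versioned(StringInfo buf, const char *version)
{
    if (version == NULL)
        return ts_numeric_deserialize_pre_pg14(buf); /* legacy partials */
    if (strcmp(version, "1") == 0)
        return ts_numeric_deserialize_pg14(buf);
    if (strcmp(version, "1.1") == 0)
        return ts_numeric_deserialize_pg15(buf);

    /* Version "2" would mark stored finals, which need no partial decoding. */
    ereport(ERROR,
            (errmsg("unexpected partial format version \"%s\"", version)));
    return (Datum) 0; /* unreachable */
}
```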
If we do not have version information, a better and safer approach, if possible, would be dedicated functions that rewrite the result from PG13 -> PG14 and from PG14 -> PG15. Then the user knows that the data was upgraded from a previous version, and there is a very low risk of making a mistake (basically, only if you forget that you already called it, call it again, and are very unlucky).
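A sketch of what one such rewrite function could look like, reusing the hypothetical vendored helpers from the earlier sketch (`ts_numeric_deserialize_pg14` and `ts_numeric_serialize_pg15` are placeholders, not existing functions):

```c
#include "postgres.h"
#include "lib/stringinfo.h"

/* Hypothetical placeholders -- not existing functions. */
extern Datum ts_numeric_deserialize_pg14(StringInfo buf);
extern bytea *ts_numeric_serialize_pg15(Datum state);

/* One-way upgrade: decode a stored PG14-format partial with the vendored
 * reader, then re-encode it in the PG15 format. */
static bytea *
rewrite_partial_pg14_to_pg15(bytea *old_partial)
{
    StringInfoData buf;
    Datum          state;

    initStringInfo(&buf);
    appendBinaryStringInfo(&buf, VARDATA_ANY(old_partial),
                           VARSIZE_ANY_EXHDR(old_partial));
    buf.cursor = 0; /* read from the start of the copied message */

    state = ts_numeric_deserialize_pg14(&buf);
    return ts_numeric_serialize_pg15(state);
}
```

Exposing such functions as an explicit, user-invoked migration step is what would make this safer than silent fallbacks: the format conversion happens exactly once, at a known time.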
We do not have version information on the actual stored partial. In theory, a materialized hypertable could easily have 3 different versions of stored partials concurrently.
> If we do not have version information, a better and safer approach, if possible, would be dedicated functions that rewrite the result from PG13 -> PG14 and from PG14 -> PG15. Then the user knows that the data was upgraded from a previous version, and there is a very low risk of making a mistake.
Since we don't have versioning information, and we don't have logic to detect the version, this proposal still relies on luck and doesn't solve the theoretical problem.
The test below is related to this issue: `make installcheck TESTS="partialize_finalize"`
> If we do not have version information, a better and safer approach, if possible, would be dedicated functions that rewrite the result from PG13 -> PG14 and from PG14 -> PG15. Then the user knows that the data was upgraded from a previous version, and there is a very low risk of making a mistake.

> Since we don't have versioning information, and we don't have logic to detect the version, this proposal still relies on luck and doesn't solve the theoretical problem.
If we do not have versioning information, we have to solve it this way. There is little point in adding versioning at this point, since we are going to deprecate the partial format.
What type of bug is this?
Incorrect result
What subsystems and features are affected?
Query executor
What happened?
`make installcheck TESTS="partialize_finalize"`

The above test fails with the error "insufficient data left in message": after a COPY operation loads binary partial data into a table, a SELECT with an aggregate function over numeric types raises this error.
TimescaleDB version affected
2.9.0
PostgreSQL version used
15.0
What operating system did you use?
Mac OS X
What installation method did you use?
Source
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
No response
How can we reproduce the bug?