apache / cloudberry

One advanced and mature open-source MPP (Massively Parallel Processing) database. Open source alternative to Greenplum Database.
https://cloudberry.apache.org
Apache License 2.0
472 stars 108 forks source link

Add FIXME about redundant param from 'ivm_visible_in_prestate' call. #578

Closed reshke closed 3 months ago

reshke commented 3 months ago

At first glance, i did get the neediness of this parameter. The visibility of tuple on QE does not depend on segment id, only on exported snapshot. So, this is clearly unneeded. But... why IVM test are not failing on this? This got me some more digging in CBDB internals, I actually think all this code is simply dead code for Cloudberry.

Prestate visibility in IVM is something that matters when multiple relation are modified via SINGLE statement. I don't think we can do this in Cloudberry:

1) Multiple write CTE are not possible, because there could be only one writer gang. Ref: https://github.com/cloudberrydb/cloudberrydb/commit/bfcb78829d50e31a49ce18d36d9c3af9b8a2ffe3 2) Statement triggers are also prohibited, I get 'Triggers for statements are not yet supported.' error with them.

Are there any other way for multiple relation modification in PostgreSQL that works for GP/CBDB?

And, yes, even when this will (if ever) be supported, gpseg id is anyway redundant.

reshke commented 3 months ago
1. Multiple write CTE are not possible, because  there could be only one writer gang. Ref: [bfcb788](https://github.com/cloudberrydb/cloudberrydb/commit/bfcb78829d50e31a49ce18d36d9c3af9b8a2ffe3

Actuality, I'm looking forward to fix this one day. This may be helpful for our projections work. I will maybe tell more about this later in separate thread.

reshke commented 3 months ago

Also, we need to delete 4th arg from here https://github.com/cloudberrydb/cloudberrydb/blob/main/src/include/catalog/pg_proc.dat#L12509 But this requires catalog change, which means pg_upgrade. So, i omit it

yjhjstz commented 3 months ago

Also, we need to delete 4th arg from here https://github.com/cloudberrydb/cloudberrydb/blob/main/src/include/catalog/pg_proc.dat#L12509 But this requires catalog change, which means pg_upgrade. So, i omit it

+ERROR: function pg_catalog.ivm_visible_in_prestate(oid, tid, oid) does not exist 7264

reshke commented 3 months ago

Also, we need to delete 4th arg from here https://github.com/cloudberrydb/cloudberrydb/blob/main/src/include/catalog/pg_proc.dat#L12509 But this requires catalog change, which means pg_upgrade. So, i omit it

is there any problem for not delete 4th arg?

Nothing except confusion of its existence

reshke commented 3 months ago

Hmm, tests actually failing... need to investigate further...

reshke commented 3 months ago

So, okay. I was 100 % sure I tested UDF with multiple modification, but turns out I wasn't. This seem to be only case where ivm_visible_in_prestate used. However, all my objections remains true, so i added FIXME about this issue to fix it in the future. Also, debug print statement was added, because it will be really helpful for development purposes IMO.

yjhjstz commented 3 months ago

OK, you may need change commit comments.

reshke commented 3 months ago

OK, you may need change commit comments.

It is better to rewrite final commit message when Squash & merge PR. I always do it this way in my projects. This way we preserve dev history (we avoid force-pushed, which needed to alter commit msg), which may be useful.

Final commit message may look like this:

Add FIXME about redundant parameter in from function 'ivm_visible_in_prestate' call.

ivm_visible_in_prestate function is used to determine tuple visibility for IVM in case multiple relation were modified within single statement. 
Even though ivm_visible_in_prestate only uses 3 out of 4 its parameters to determine tuple visibility, we postpone changes to its definition until next major release because it requires catalog change. 

Also, small debug print message was added, for dev purposes. 
avamingli commented 3 months ago

Are there any other way for multiple relation modification in PostgreSQL that works for GP/CBDB?

Have a try with rules or triggers?

reshke commented 3 months ago

Are there any other way for multiple relation modification in PostgreSQL that works for GP/CBDB?

Have a try with rules or triggers?

AFTER STATEMENT triggers are not supported. But create rule ... as on insert to ttt1 do also insert into ... really works