sraoss / pgsql-ivm

IVM (Incremental View Maintenance) development for PostgreSQL
Other
129 stars 12 forks source link

Fix to not use generate_series when views are queried #92

Closed yugo-n closed 4 years ago

yugo-n commented 4 years ago

In the previous implementation, multiplicity of each tuple was stored in ivm_count column in views. When SELECT was issued for views with duplicate, the view was replaced with a subquery in which each tuple was joined with generate_series function in order to output tuples of the number of ivm_count.

This was problematic for following reasons:

To resolve this, the new implementation doen't store multiplicities for views with tuple duplicates, and doesn't use generate_series when SELECT query is issued for such views.

Note that we still have to use ivm_count for supporting DISTINCT and aggregates.

Design:

Alothough views doesn't have ivm_count, multiplicities for old delta and new delta are calculated and the count is contained in each table.

The old delta is applied using ctid and row_number function, like:

DELETE FROM matviewname WHERE ctid IN ( SELECT tid FROM ( SELECT row_number() over (partition by c1, c2, ...) AS ivm_row_number, mv.ctid AS tid, diff.ivm_count__ FROM matviewname AS mv, old_delta AS diff " WHERE mv.c1 = diff.c1 AND mv.c2 = diff.c2 AND ... ) v WHERE v.ivm_row_number <= v.ivm_count__

The new delta is applied using generate_seriese, like:

INSERT INTO matviewname (c1, c2, ...) SELECT c1,c2,... FROM ( SELECT diff.*, generate_series(1, diff.__ivm_count__) FROM new_delta AS diff) AS v

Github Issue #69

yugo-n commented 4 years ago

Regression tests all pass. I'll merge it.