Open mosabua opened 1 year ago
Thank you @mosabua for this ticket.
My answers are below.
> We might have to clarify this more .. e.g. does it work if the MV is in one catalog .. but the source table is in another catalog .. but both use Iceberg?
It won't work (the table will be treated as "non-Iceberg"). Yes, we need to clarify that.
> Also what happens really when iceberg tables are outdated but other catalogs are involved .. what query is actually run?
Currently, when the Iceberg source tables are outdated, "the view is known to be stale" and gets inlined (the materialized state isn't used). This is consistent with materialized views over Iceberg tables only.
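To make the staleness behavior concrete, here is a minimal sketch in Trino SQL. The catalog, schema, and table names are hypothetical, and this assumes an Iceberg catalog named `iceberg`:

```sql
-- Hypothetical names: iceberg.analytics.order_totals, iceberg.sales.orders.
CREATE MATERIALIZED VIEW iceberg.analytics.order_totals AS
SELECT orderkey, sum(totalprice) AS total
FROM iceberg.sales.orders
GROUP BY orderkey;

-- While iceberg.sales.orders has not changed since the last refresh,
-- this query reads the stored (materialized) data:
SELECT * FROM iceberg.analytics.order_totals;

-- After new snapshots land in the source table, the view is considered
-- stale and the same query is answered by inlining the view definition,
-- i.e. re-running the SELECT against the source table.
```

The key point is that staleness detection relies on Iceberg snapshot metadata, which is why source tables behind non-Iceberg catalogs are treated as "non-Iceberg" and the view is conservatively considered stale.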
> does manually running a refresh update all the data?
Yes.
> Is that going to be a very heavy operation since it queries all sources as on first refresh and overwrites all the data?
"very heavy" may mean different things to different people.
yes, it's equally expensive as the first refresh we don't have incremental refreshes at all, to the best of my knowledge, not even for m views on Iceberg tables solely, so no change here
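A refresh is therefore a full recomputation. A sketch of the operation, with the same hypothetical view name as above:

```sql
-- Re-runs the complete view definition against all source tables and
-- overwrites the stored data; there is no incremental variant today.
REFRESH MATERIALIZED VIEW iceberg.analytics.order_totals;
```

Its cost is roughly the cost of running the view's defining query, regardless of how little the source data changed since the last refresh.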
So, for incremental to work, from what I understand both the MV and the source tables have to be in the same catalog and it has to use the Iceberg connector..
Also @colebow and @bitsondatadev .. we potentially should update the docs for MVs in the SQL section to talk about the source query and how behavior may differ ..
> So, for incremental to work, from what I understand both the MV and the source tables have to be in the same catalog and it has to use the Iceberg connector..
No. For incremental to work, Trino would need an "incremental materialized view refresh" feature. It doesn't exist at all yet, to the best of my knowledge (https://github.com/trinodb/trino/issues/18673). Are you being confused by some Starburst proprietary solution?
Following up on #15108.
We might have to clarify this more .. e.g. does it work if the MV is in one catalog .. but the source table is in another catalog .. but both use Iceberg?
Also what happens really when iceberg tables are outdated but other catalogs are involved .. what query is actually run?
We should offer some guidance on what a user can do .. e.g. if you know the underlying tables are outdated ... does manually running a refresh update all the data? Is that going to be a very heavy operation since it queries all sources as on first refresh and overwrites all the data?
Ideally @colebow or @bitsondatadev can work with @findepi @raunaqmorarka and @claudiusli to get this clarified and updated