Open nelsonic opened 1 year ago
@LuchoTurtle my reading of the docs suggests that Materialized Views in Postgres
are only updated when requested i.e. the data in the view is not real-time. Can you confirm this with a bit of Gooooogling
...? 💭
Yes, materialized views (as the name implies) are just the results of a query saved/cached somewhere. Postgres will never automatically refresh the materialized view unless explicitly told.
The whole point of materialized views is to have fast queriable data that stem from expensive operations/queries from the database. So the main thing with materialized views is:
REFRESH MATERIALIZED VIEW
command truncates/removes all the data and re-inserts with the most up-to-date (I think this is even written on the link you've mentioned and the whole point of why they did data aggregation there).In fact, the link you sent mentions a few strategies of data aggregation so they don't waste the previously materialized data only to be materialized again. They range from sliding window verification of new inserts
/updates
to simple UNION ALL
where only new data is appended to what was previously materialized.
But yes, materialized views are only as accurate as the last time they ran the query they are caching.
Cool, that was my reading of the docs. Thanks for Summarising it succinctly, LuchoGPT. 😜
A side note: you have "VIEW", "MATERIALIZED VIEW" and CTE (the "WITH" statement).
"VIEW" and "MATERIALIZED VIEW" are like snapshots, creating virtual tables. With "CREATE VIEW", you do not store data physically, only cached. You can "UPDATE" it. With "MATERIALIZED VIEW", the data is physically stored. To update it, you "REFRESH".
You also have "CTE", the "WITH" statement. This is used for "inline" subqueries, breaking down the main query.
@ndrean for sure! Though VIEW
does not cache any data, it just abstracts complexity from the query itself. But yeah, I understood what you meant!
CTEs are really useful to organize complex views instead of having queries inside of queries, I second that 👏
@LuchoTurtle Thanks for your precision, it was essential in fact.
Indeed (and I was wrong..), VIEW queries and CTE are evaluated each time they are referenced, whilst MATERIALIZED VIEW queries are cached (because stored).
Between CTE and VIEW, I understand the main difference is that you can restrict the users' access to only the data from the views - they may not be able to access data in the underlying tables with "GRANT/DENY"- whilst they can access the underlying table when a CTE is used. You can deny modifier access to the underlying table with the "WTH CHECK OPTION"
A (partial) toy example I played with that I hope makes sense:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(20),
age INT,
active BOOLEAN
);
INSERT INTO users (name, age, active) VALUES ('me', 2, true);
...
A complex "inlined subquery"
WITH active_users AS (
SELECT name, age
FROM users
WHERE active = true
)
SELECT name FROM active_users;
A "complex" base data:
CREATE VIEW active_users_v AS
SELECT name, age
FROM users
WHERE active = true;
This first query is executed every time it is referenced, such as:
SELECT name, age FROM active_users_v;
Permissions restriction:
GRANT SELECT ON active_users_v TO lucho;
DENY SELECT ON users TO lucho;
They are used for performance reasons when some not very frequently changing but complex base data are used frequently in other queries:
Cache "complex" :) base data:
CREATE MATERIALIZED VIEW active_users_mv AS
SELECT age, name
FROM users
WHERE active = true;
If you want to use updated data, use "REFRESH":
REFRESH MATERIALIZED VIEW active_users_mv;
Otherwise, just use the MV:
SELECT name, age FROM active_users_mv;
Also to note that Materialized Views support indexing.
About the REFRESH
for materialized views coupled with triggers:
To avoid refreshing all these (potentially) costly tables, one may use the Postgres TRIGGERS to update some rows of the "MV" (at the expense of lengthening another query of course).
The previous post references the following post:
What is a Materialized view and how is it useful? The wikipedia page on this is a decent starting point: https://en.wikipedia.org/wiki/Materialized_view
Question
How does this work in
Postgres
Is the data duplicated?Read: https://www.timescale.com/blog/how-postgresql-views-and-materialized-views-work-and-how-they-influenced-timescaledb-continuous-aggregates/