arkhipov / temporal_tables

Temporal Tables PostgreSQL Extension
BSD 2-Clause "Simplified" License
937 stars 47 forks source link

Query at time across multiple tables #37

Open jamiewinder opened 6 years ago

jamiewinder commented 6 years ago

Hi. I'm new to this extension and PostgreSQL in general. I know a little about temporal tables from SQL Server and have been investigating how viable it would be to port my database over.

From this tutorial, I know the standard syntax isn't supported:

SELECT * FROM subscriptions AS OF SYSTEM TIME '2014-01-10' WHERE id = 1;

and instead we can use a view:

SELECT * FROM subscriptions_with_history
WHERE id = 1 AND sys_period @> '2015-01-10'::timestamptz

This is fine for a single table, but if I were joining to another temporal table I'd have to repeat the sys_period clause for every temporal view. I'm wondering if there is a more elegant way to do this? Perhaps a view like:

CREATE VIEW subscriptions_at_tqt AS
SELECT * FROM subscriptions_with_history WHERE sys_period @> get_temporal_query_time()

Where get_temporal_query_time would have an equivalent set_temporal_query_time. This would mean I could do something like:

set_temporal_query_time('2017-11-01 10:00:00');

SELECT * FROM subscriptions_at_tqt
INNER JOIN product_at_tqt ON (...) 
INNER JOIN publisher_at_tqt  ON (...)

As far as I can see, this is similar to the existing set_system_time utility, but for querying rather than versioning. Does this idea have legs?

arkhipov commented 6 years ago

Hi @jamiewinder,

That idea does have legs. It was one of the two ideas I considered implementing. The other one was specifying the system time as a comment near the table it relates to, as in

SELECT * FROM subscriptions /* AS OF SYSTEM TIME '2014-01-10' */ WHERE id = 1;

I know a little about how the system time concept is implemented in SQL Server. I used IBM DB2 as a reference implementation, but as far as I can remember, the AS OF SYSTEM TIME clause is applied only to the table after which it is placed.

The main problem with introducing a function like set_temporal_query_time is to properly define its scope. For example, you may have a trigger or function which is used inside your query that also contains queries to temporal tables, which might want to use different temporal query times.

In order not to repeat the sys_period clause for every temporal table, you might want to define a view like:

CREATE VIEW my_view
SELECT ... FROM subscriptions
  INNER JOIN products ON  products.sys_period && subscriptions.sys_period AND ...
  INNER JOIN publishers ON publishers.sys_period && subscriptions.sys_period AND ...

Then you could do something like:

SELECT * FROM my_view WHERE sys_period @> '2015-01-10'::timestamptz
jamiewinder commented 6 years ago

That's great, thanks! I think I assumed incorrectly about AS OF SYSTEM TIME (i.e. that it was for all tables in the query). Now you've pointed it out it makes sense that it isn't always something you'd want to do.

Would putting the AS OF clause in a comment like the example you gave actually apply the clause in the example you gave? Sounds like a great idea if so!

I'll investigate the view idea you gave - I am still getting used to these range operators!

Thanks again.

arkhipov commented 6 years ago

Would putting the AS OF clause in a comment like the example you gave actually apply the clause in the example you gave? Sounds like a great idea if so!

No, not at the moment.