Open WalkerWalker opened 1 year ago
Hope to get some discussions
Hey @WalkerWalker, thanks for reaching out.
There are an ongoing PR already: https://github.com/timescale/timescaledb/pull/4874
There are an ongoing PR already: https://github.com/timescale/timescaledb/pull/4874
Thanks for the reply. After reading this PR, my understanding is that this is allowing continuous aggregation on a join result. If my understanding is correct, then it is not what my feature request is about. This feature request is to compute the join, not to compute the continuous aggregation on top of join.
Here is a simple reason why computing this join is hard. The join result can be as big as the sensor data
and therefore it is impractical to compute i in one go if sensor data
is too large (for example over 1TB before compression). Hence I hope the join result can be continuously, or incrementally, computed.
I did a bit of research and notice that this might be what I want. https://github.com/sraoss/pg_ivm
would be nice to get some feedback, is this feature requested understood? meaningful? implementable?
There are an ongoing PR already: #4874
Hello @fabriziomello Now that this PR is merged, could you please shine a light on this issue? Do you think this new feature of 2.10.0 can solve my problem? Again, I only want to compute the join, but not aggregation on top of the join. Do you think I can use a super small bucket bucket, for example 10ms
so that no down-sampling is actually performed ?
There are an ongoing PR already: #4874
Hello @fabriziomello Now that this PR is merged, could you please shine a light on this issue? Do you think this new feature of 2.10.0 can solve my problem? Again, I only want to compute the join, but not aggregation on top of the join. Do you think I can use a super small bucket bucket, for example
10ms
so that no down-sampling is actually performed ?
Currently we don't support Continuous Aggregate without time dimension and aggregation. So there's no option other than add a small bucket. Also the current JOIN support is restricted to INNER JOIN with equality operator. Have a look at the documentation.
What problem does the new feature solve?
I have a time series table of sensor data and I have an assignment table defining at what time the device is worn by which person. Now I am interested in the data of one or many people.
What does the feature do?
say
sensor_data
table has columns(time, device_id, temperature, location_lang, location_lat)
andassignments
table has columns(assignment_id, person_id, device_id, time_range)
I want a materialized view of the following query.
Apparently the query result can be as big as the
senor_data
table, making it impractical to run once thesenor_data
table is too big (over 1TB before decompression). But the query result should be very stable. Essentially as the new records coming into thesensor_data
table, the materialized view table should also have new records, according to the assignement table. But if history never changes and assignments are never modified, the old data stays the same.Maybe the result can be materialized day by day (chunk by chunk) and having the lastest day result queried live? Should I call this continuous joining and filtering?
Implementation challenges
No response