Open. andness opened this issue 5 years ago.
This isn't possible, but you can create the continuous aggregate using the normal time_bucket and then apply time_bucket_gapfill when you query the view; this should pretty closely approximate what you're trying to do.
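To make the suggested workaround concrete, here is a minimal sketch. The table, view, and column names (`device_states`, `device_state_hourly`, `device_id`, `state`, `time`) are assumptions, not from the original thread:

```sql
-- Assumed schema: a hypertable device_states(time, device_id, state).
-- Continuous aggregate using the plain time_bucket:
CREATE MATERIALIZED VIEW device_state_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS bucket,
       device_id,
       last(state, time) AS state
FROM device_states
GROUP BY bucket, device_id;

-- Gapfill applied at query time on top of the cagg, carrying the
-- last known state forward into empty buckets with locf():
SELECT time_bucket_gapfill('1 hour', bucket) AS hour,
       device_id,
       locf(last(state, bucket)) AS state
FROM device_state_hourly
WHERE bucket >= '2024-01-01' AND bucket < '2024-01-02'
GROUP BY hour, device_id;
```

Note that time_bucket_gapfill requires an explicit time range in the WHERE clause so it knows which buckets to generate.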
@davidkohn88, I don't think that would solve my case because I need to join on the time column, and to make that efficient I need it indexed.
To give a little background on what I'm doing: I have a property of my data which is stored as intervals in the source data, i.e. "The device was in state X between time T1 and T2". So the way I solve this is brute-force: For each hour of the day I materialize the state that the device is in for that hour. So this means that the rows:
["Device1", "StateA", 12:20, 17:30]
["Device1", "StateB", 17:30, 21:10]

Expand into:

["Device1", "StateA", 12:00]
["Device1", "StateA", 13:00]
["Device1", "StateA", 14:00]
["Device1", "StateA", 15:00]
["Device1", "StateA", 16:00]
["Device1", "StateA", 17:00]
["Device1", "StateB", 18:00]
["Device1", "StateB", 19:00]
["Device1", "StateB", 20:00]
["Device1", "StateB", 21:00]
["Device1", "StateB", 22:00]
Using this approach I can get the state of the device for any given hour of the day by a simple join against this table.
Not the most intelligent approach, but it's simple and straightforward at the expense of using some disk space. I tried using a join query that would essentially look up the last device state value for each hour of the day and as you can imagine that query did not perform very well.
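The brute-force expansion described above can be done in plain SQL with generate_series. This is only a sketch of the idea; the table and column names (`device_state_intervals`, `start_time`, `end_time`) are assumptions:

```sql
-- Assumed source table: device_state_intervals(device_id, state, start_time, end_time).
-- For each interval, emit one row per hour the interval touches:
INSERT INTO device_state_hours (device_id, state, hour)
SELECT i.device_id,
       i.state,
       h.hour
FROM device_state_intervals i
CROSS JOIN LATERAL generate_series(
    time_bucket('1 hour', i.start_time),
    i.end_time,
    '1 hour'::interval
) AS h(hour);
```

An index on `(device_id, hour)` then makes the per-hour join cheap, at the cost of the extra disk space the thread mentions.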
That only works if you do not have state changes during a full hour; otherwise it will be inaccurate.
I didn't see the previous comment until now, but I will also say that the join, if structured correctly (as a LATERAL with ORDER BY time DESC LIMIT 1, and the correct indexes), should be reasonably performant. You could also build a GiST index on a range expression together with the device_id (using the btree_gist extension) and then run a containment query, which should also be reasonably performant.
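Both alternatives suggested here can be sketched as follows. This is a hypothetical illustration; the schema (`device_states`, `device_state_intervals`, the column names, and the date literals) is assumed:

```sql
-- Option 1: lateral "last state at or before this hour" lookup.
-- The (device_id, time DESC) index lets each lateral probe stop at one row.
CREATE INDEX ON device_states (device_id, time DESC);

SELECT h.hour, s.state
FROM generate_series('2024-01-01'::timestamptz,
                     '2024-01-02'::timestamptz,
                     '1 hour'::interval) AS h(hour)
CROSS JOIN LATERAL (
    SELECT state
    FROM device_states
    WHERE device_id = 'Device1'
      AND time <= h.hour
    ORDER BY time DESC
    LIMIT 1
) s;

-- Option 2: store the intervals as ranges and use a GiST containment query.
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE INDEX ON device_state_intervals
    USING gist (device_id, tstzrange(start_time, end_time));

SELECT state
FROM device_state_intervals
WHERE device_id = 'Device1'
  AND tstzrange(start_time, end_time) @> '2024-01-01 15:00'::timestamptz;
```

The btree_gist extension is what allows the scalar `device_id` column to participate in the GiST index alongside the range expression.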
I think there are some other issues around this that I remember hearing about, namely wanting to calculate the time spent in a given state over a period of time. That one is a little harder to accomplish nicely right now, and I've opened an issue for that sort of thing in Timescale Analytics: https://github.com/timescale/timescale-analytics/issues/117
This is definitely a pain point for us as well. We have to apply the gapfill outside, in the query on the cagg, which complicates our SQL statements since we have to double up on some aggregate functions (which don't end up doing anything but make the query more complicated).
Hi, I need this for my use case because computing it at query time is not performant enough. Are there any plans to release this functionality?
I'm also having this same issue, and same requirement as @Mykyta-Chernenko . We need to gapfill within the continuous aggregate for performance reasons.
+1, this would really be appreciated.
+1, we'd like to use state_agg()
and just LOCF the previous state into the next bucket where there is no state change. The workarounds people describe in https://github.com/timescale/timescaledb-toolkit/issues/479#issuecomment-1197493044 seem complicated.
+1 bummed to discover this has been an open issue for almost 5 years! The workaround isn't doing it for me.
+1 would be very useful to have this feature.
+1. This feature would be great and simplify some of our queries and improve overall performance.
It would be very handy to be able to configure the refresh policy to run at the start of every bucket, so the first sample would always already be there.
Would love this feature, I have the exact same problem that @andness mentioned at the start. Would love to know if there's a viable solution.
+1 on this feature.
This problem has been going on for 5 years, why hasn't it been resolved yet!!!
Gapfill is an extremely important part of time-series work for getting evenly spaced data. Too bad it can't be used in caggs. Why is that? Is it deemed too computationally expensive?
I tried to create a continuous aggregate using time_bucket_gapfill but was disappointed to find this was not possible. Are there plans to support this?