timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/
Other
16.84k stars 852 forks source link

[Enhancement]: <Reset option to Time_Bucket Origin for intraday contiguous data across date changes> #4143

Open SuperDaveOsbourne opened 2 years ago

SuperDaveOsbourne commented 2 years ago

What type of enhancement is this?

API improvement

What subsystems and features will be improved?

Continuous aggregate

What does the enhancement do?

If using time_bucket / time_bucket_ng for data like stock market feeds, equal division will result in a mod error that carries over the next day. Example with the US markets is that it trades between 9:30 and 4 PM Eastern Time. That is 390 minutes and if you break that into chartable segments that are machine learning friendly like 65 minute, 195 minute to remove artifacting and aliasing. Nyquist sampling problems are a large source of noise in market data. This problem exists where ever the buckets don't equally divide into 24 hours.

So the first bucket starts off where you would think if you use a 9:30 origin. But subsequent days it will start the buckets 10 20 30 minutes earlier because of the mod error over 24 hours. I suspect there are other domains where this would be a problem too.

The request which might be expanded is that there is an option the origin be reset on the date change part of the TimeStamp and TimeStampTZ so that any mod errors do not exists or compound each day.

I have written around this by only doing daily requests with a CTE in the SQL , but I think there is a significant performance hit and certainly not getting the full benefit of the TimescaleDb extension.

Implementation challenges

No response

ryanbooz commented 2 years ago

@SuperDaveOsbourne - I recently created an enhancement request for our hyperfunctions team to create an aggregate specifically for financial data and the components needed to generate things like OHLCV aggregates. Would be interested to get your input/feedback there.

https://github.com/timescale/timescaledb-toolkit/issues/445