timescale / timescaledb-backfill

Backfill hypertable data from one timescale instance to another
Apache License 2.0

fix: --from including extra chunks #140

Closed alejandrodnm closed 1 year ago

alejandrodnm commented 1 year ago

When the `from` and `until` parameters are used to find chunks in the source, we add a WHERE clause that filters chunks by comparing those parameters with each chunk's range. The second of those conditions was `from <= chunk.range_end`.

Consider the chunks as [start-end) ranges:

[0-2)[2-4)

With `from=2`:
chunk 1: from (2) = range_end (2)
chunk 2: from (2) < range_end (4)

The current condition `from <= chunk.range_end` incorrectly matches both chunks. The first chunk cannot contain a time value inside the backfill period: `range_end` is exclusive, so every row in a chunk satisfies `max(time) < range_end`, and with `from=2` nothing in `[0-2)` qualifies.

To fix this, we remove the equality from the second condition and match only when from is less than chunk.range_end.
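
As a rough illustration of the difference, here is a minimal Python sketch of the `from` side of the filter only. It is not the tool's actual query: the function names and the tuple representation of a chunk are hypothetical, and the `until` condition is left out.

```python
from typing import List, Tuple

# Hypothetical representation: (range_start, range_end), with range_end exclusive.
Chunk = Tuple[int, int]


def match_from_inclusive(chunks: List[Chunk], from_: int) -> List[Chunk]:
    """Previous behaviour: from <= chunk.range_end."""
    return [c for c in chunks if from_ <= c[1]]


def match_from_strict(chunks: List[Chunk], from_: int) -> List[Chunk]:
    """Fixed behaviour: from < chunk.range_end."""
    return [c for c in chunks if from_ < c[1]]


# The two chunks from the example above: [0-2) and [2-4).
chunks = [(0, 2), (2, 4)]

old_match = match_from_inclusive(chunks, 2)  # also picks up [0-2)
new_match = match_from_strict(chunks, 2)     # only [2-4)
```

With `from=2`, the inclusive comparison returns both chunks, while the strict comparison returns only `[2-4)`, the one chunk that can hold rows with `time >= 2`.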

alejandrodnm commented 1 year ago

I'll add a test to check that we are getting just one chunk

JamesGuthrie commented 1 year ago

Can you explain somewhere what it was that we got wrong in the previous implementation?

alejandrodnm commented 1 year ago

> Can you explain somewhere what it was that we got wrong in the previous implementation?

done