dgryski / go-tsz

Time series compression algorithm from Facebook's Gorilla paper
BSD 2-Clause "Simplified" License

Finish() requirement #1

Closed by Dieterbe 9 years ago

Dieterbe commented 9 years ago

one thing i noticed in this implementation is that we are expected to call series.Finish() before iterating on it (otherwise the tests fail).

the paper mentions

Concurrency is attained with a (...) 1-byte spin lock on each time series. Since each individual time series has a relatively low write throughput, the spin lock has very low contention between reads and writes.

any plans on supporting concurrent reading and writing of un-finished streams? a spinlock or mutex aside, i'm mostly curious what the requirements would be to support this. i don't understand the bitstream code yet, but could Next() not read all values and then somehow detect it's at the end of the stream?

thanks

dgryski commented 9 years ago

I meant to get around to that. I did this as part of a corporate hackathon (2 days), and didn't finish it up. The next one is on the 24th/25th, but I'll accept pull requests in the meantime ;)

The easiest way to support reading of partial streams is to also store the highest timestamp seen, so the iterator knows to terminate when it reaches that time.
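A minimal sketch of that idea, for illustration only. This is not go-tsz's code or API, and it is not the change that eventually closed this issue. The names `partialSeries`, `partialIter` and `newPartialSeries` are hypothetical. An uncompressed slice stands in for the bit stream: its unused slots are zero-valued, so a reader cannot tell from the data alone where the written part ends. Timestamps are assumed to be strictly increasing, and a plain `sync.Mutex` is used in place of the paper's spin lock.

```go
package main

import (
	"fmt"
	"sync"
)

// point is one (timestamp, value) sample.
type point struct {
	T uint32
	V float64
}

// partialSeries is a hypothetical stand-in for a compressed series.
// Unused slots in buf are zero-valued and look like data, much like the
// unwritten tail of a bit stream that has no end marker yet.
type partialSeries struct {
	mu    sync.Mutex
	buf   []point
	n     int    // number of slots written so far
	lastT uint32 // highest timestamp pushed so far
	any   bool   // true once at least one sample was pushed
}

func newPartialSeries(capacity int) *partialSeries {
	return &partialSeries{buf: make([]point, capacity)}
}

// Push appends a sample. Assumption: timestamps are strictly increasing.
func (s *partialSeries) Push(t uint32, v float64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.n == len(s.buf) {
		// Grow the buffer with more zero-valued slots.
		s.buf = append(s.buf, make([]point, len(s.buf)+1)...)
	}
	s.buf[s.n] = point{T: t, V: v}
	s.n++
	s.lastT = t
	s.any = true
}

// partialIter walks a snapshot of the series taken when it was created.
type partialIter struct {
	buf   []point
	pos   int
	lastT uint32
	done  bool
	cur   point
}

// Iter copies the buffer and the highest timestamp under the lock, so
// later writes do not affect this iterator.
func (s *partialSeries) Iter() *partialIter {
	s.mu.Lock()
	defer s.mu.Unlock()
	snap := make([]point, len(s.buf))
	copy(snap, s.buf)
	return &partialIter{buf: snap, lastT: s.lastT, done: !s.any}
}

// Next advances to the next sample. It stops after returning the sample
// that carries the recorded highest timestamp.
func (it *partialIter) Next() bool {
	if it.done || it.pos >= len(it.buf) {
		return false
	}
	it.cur = it.buf[it.pos]
	it.pos++
	if it.cur.T == it.lastT {
		it.done = true // this was the last written sample
	}
	return true
}

// Values returns the sample the iterator currently points at.
func (it *partialIter) Values() (uint32, float64) { return it.cur.T, it.cur.V }

func main() {
	s := newPartialSeries(8)
	s.Push(100, 1.5)
	s.Push(160, 2.5)

	it := s.Iter()   // no Finish() call in this sketch
	s.Push(220, 3.5) // written after the snapshot, so not seen by it

	for it.Next() {
		t, v := it.Values()
		fmt.Println(t, v)
	}
}
```

Here the loop yields only the two samples pushed before `Iter()` was called, and it never reads into the zero-valued tail of the buffer. Copying the whole buffer on every `Iter()` keeps the sketch short; a real implementation would more likely snapshot only the written bytes, or hold the lock while reading.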

Dieterbe commented 9 years ago

fixed via fbd173d292b7b5de4e454ab1032a61051cb500d8