Closed Dieterbe closed 8 years ago
I can take a look at this tonight.
I just pushed a bunch of space and speed optimizations to https://github.com/dgryski/go-tsz; could probably use a bit more testing
thanks. looks good. been testing it, seems solid. planning to get this into prod soon.
btw do you think it would make sense to at some point have one "scrap byte space" in which we could store multiple tsz series? right now metric-tank's design has a struct instance for every single metric, with a go-tsz series and protected with a lock for concurrent access. i'm wondering if it would make sense to instead have 1 structure that deals with many (maybe all) series at all once, which may make updates and other operations that deal with many series at once more efficient. (like our chunk persister and garbage collector, currently iterates all metrics and looks at their chunks). but then memory fragmentation and allocation would be our concern. I'm happy leaving that task to the go runtime though, but something worth thinking about.
I don't think so. You're going to have to lock the structure in order to make changes to it. I tried to come up with a way to do it with only atomic instructions but couldn't get it to always be consistent. There's slightly to much state that needs to be kept in sync. Even if there was a single entity to manage them all, you'd still need locking at the series-level.
@dgryski could you help analyze this and hopefully make go-tsz more memory efficient? I've emailed you the binary and the profile file.
note that
web makeSlice
shows that this all comes via go-tsz as well.this profile is taken on a metric-tank holding 400k series, each series having 20 chunks (go-tsz Series) of 600s long. most have between 10 and 60 points per series. this works out because our monitoring says we have 210M points, so 210M/400k/20=26 points per go-tsz Series object on average.