ryandotsmith / l2met

Convert a formatted log stream into metrics
http://r.32k.io/l2met-introduction
GNU General Public License v3.0

Improvement: Only read valid data from Redis. #97

ryandotsmith closed 11 years ago

ryandotsmith commented 11 years ago

Currently the l2met outlet reads a list of bucket ids from a partition set, selects the bucket ids whose time has come, then puts back the buckets that are not ripe. This filtering means that data comes across the wire, is filtered, and then must be put back one by one. The filtering is not that big of a deal; it is the putting back that makes this situation less than ideal.

This change modifies the partition data structure in Redis. When buckets are RPUSHed into Redis, the key (or id) of each bucket is added to a set. The key that names the set now includes a timestamp. E.g.

ts.partition.outlet.N

The dynamic parts of the key are ts and N. The UNIX timestamp (ts) indicates when the bucket will be ready for processing. N represents the partition to which the bucket belongs. The timestamp is computed by the following formula:

readyAt = truncate(bucketTime + bucketResolution)

E.g.

bucketTime = 1376285311
bucketResolution = 60
readyAt = 1376285340

Finally, it is not clear that this technique will work in all situations. We are depending on our outlets to never miss a tick. That is, the outlets may have to process ready buckets as often as every second, and a set whose timestamp is skipped would never be read. More thought and testing is required before this patch can be trusted.