Indexer-less measure processing in metricd

gnocchixyz / gnocchi

Timeseries database

Apache License 2.0


Indexer-less measure processing in metricd #548

Open jd opened 6 years ago

jd commented 6 years ago

Currently, the metricd MetricProcessor needs an indexer connection solely to access the archive policy information for each metric.

It seems that metricd could do without an indexer connection altogether if the archive policy content were duplicated to the metric in the aggregate storage.

This has other implications, especially for metrics using archive policies. However, in a world where archive policies are gone (see https://github.com/gnocchixyz/gnocchi/issues/547), that might be feasible.

chungg commented 6 years ago

this seems like a potentially useful performance optimisation or just fixing list_metrics in general.

when writing to local disk, list_metrics in process_new_measures can take up 1/3 of the runtime. this is probably less significant if storage is remote, but 1/3 of the time seems strange since it's just listing a single metric... granted, we're talking single-digit milliseconds.

attached is a profile of pushing 300 datapoints to a metric with the low policy: profile-process-new.zip
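
For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of how the share of runtime spent in one function can be read out of a cProfile run. The two stub functions are hypothetical stand-ins that only sleep; they are not gnocchi code, and the sleep durations are assumptions chosen to mimic the 1/3 ratio mentioned above.

```python
import cProfile
import pstats
import time


def list_metrics_stub():
    # Hypothetical stand-in for the indexer lookup.
    time.sleep(0.002)


def process_new_measures_stub():
    # Hypothetical stand-in for the whole processing step.
    list_metrics_stub()
    time.sleep(0.004)


def cumulative_share(func, target_name):
    """Return the fraction of profiled time spent in `target_name`.

    The fraction is the cumulative time of the target (including its
    callees) divided by the total time recorded by the profiler.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) -> (cc, nc, tt, ct, callers)
    for (_file, _line, name), (_cc, _nc, _tt, ct, _callers) in stats.stats.items():
        if name == target_name:
            return ct / stats.total_tt
    return 0.0
```

Calling `cumulative_share(process_new_measures_stub, "list_metrics_stub")` should give a value in the neighbourhood of one third with these stubs, though the exact number depends on timer resolution.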

jd commented 6 years ago

There's also an upside to this: metricd would still be able to work even if the indexer is down.

chungg commented 6 years ago

did you have a plan for this?

add the archive policy name to each measures obj and cache the definition in each worker?
add the definition to the measures obj itself?

i guess in both cases it gets kinda funky when you start updating timespan of policy
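
To make the two options concrete, here is a rough sketch. Every name in it (`AggregationDefinition`, `MeasuresByName`, `MeasuresWithDefinition`, `PolicyCache`) is hypothetical and not part of gnocchi; the TTL-based cache is just one assumed way a worker could cache definitions.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class AggregationDefinition:
    granularity: float  # seconds per aggregated point
    timespan: float     # seconds of history to keep


@dataclass
class MeasuresByName:
    """Option 1: the measures object carries only the policy name."""
    metric_id: str
    policy_name: str
    points: list


@dataclass
class MeasuresWithDefinition:
    """Option 2: the measures object carries the definition itself."""
    metric_id: str
    definition: tuple
    points: list


class PolicyCache:
    """Per-worker cache of policy definitions with a time-to-live.

    `fetch` is any callable mapping a policy name to its definition;
    in option 1 it would be the only remaining indexer access.
    """

    def __init__(self, fetch, ttl=60.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # name -> (fetched_at, definition)
        self.fetch_count = 0

    def get(self, name):
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # may be stale if the policy was updated
        definition = self._fetch(name)
        self.fetch_count += 1
        self._entries[name] = (now, definition)
        return definition
```

The funky part shows up in both shapes: with `PolicyCache`, a worker keeps using the old timespan until the TTL expires, and with `MeasuresWithDefinition`, any measures already queued keep the definition they were written with.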

jd commented 6 years ago

My thinking is rather to drop the archive policy requirement before doing this, i.e. #547.

Once that is done, each metric needs to store its own aggregation definition, and the question is where to store it. If you store it in the storage engine, then this issue is solved.

I don't have more details for now so there might be blockers (?).
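
As a rough illustration of storing each metric's aggregation definition in the storage engine, here is a minimal sketch. `InMemoryStorage`, the key layout, the JSON encoding and the mean aggregation are all assumptions made for the example; none of them reflect gnocchi's actual storage drivers or formats.

```python
import json


class InMemoryStorage:
    """Hypothetical stand-in for a storage driver: a flat key/value store."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


def definition_key(metric_id):
    # Assumed key layout: the definition lives next to the metric's data.
    return f"{metric_id}/aggregation-definition"


def store_definition(storage, metric_id, definition):
    """Write the metric's own aggregation definition into storage."""
    payload = json.dumps(definition, sort_keys=True).encode("utf-8")
    storage.put(definition_key(metric_id), payload)


def load_definition(storage, metric_id):
    return json.loads(storage.get(definition_key(metric_id)).decode("utf-8"))


def process_new_measures(storage, metric_id, measures):
    """Aggregate (timestamp, value) pairs without touching an indexer.

    Only the mean is computed, and the timespan is not enforced here;
    the point is that the definition comes from storage alone.
    """
    results = {}
    for rule in load_definition(storage, metric_id):
        granularity = rule["granularity"]
        buckets = {}
        for timestamp, value in measures:
            start = timestamp - (timestamp % granularity)
            buckets.setdefault(start, []).append(value)
        results[granularity] = {
            start: sum(values) / len(values)
            for start, values in sorted(buckets.items())
        }
    return results
```

With this shape, the processing path reads everything it needs from the storage engine, so it would keep running with the indexer down. Updating a definition then becomes a write to every affected metric, which is one of the trade-offs to weigh.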