tikv / rust-prometheus

Prometheus instrumentation library for Rust applications
Apache License 2.0
1.07k stars 182 forks source link

Switch from protobuf to prost #387

Open Folyd opened 3 years ago

Folyd commented 3 years ago

Related issue: #384

hdost commented 3 years ago

Looks good, but you may need to rebase.

hdost commented 3 years ago

@breeswish I think it's in your hands now 😅

Folyd commented 3 years ago

Hi @mxinden. Thanks for reviewing.

Introduce an intermediate data representation like suggested in the issue which represents all constraints in the data model itself, thus checked at compile time. This intermediary data representation would be created through Collector::collect and then be used in encoder/pb.rs and encoder/text.rs with the former going from intermediate data representation to the current representation generated by post.

I'm afraid I'm not 100% get what the intermediary data representation means and how it works. Could you please give more detail? Thanks a lot. 😸

mxinden commented 3 years ago

Hi @mxinden. Thanks for reviewing.

Introduce an intermediate data representation like suggested in the issue which represents all constraints in the data model itself, thus checked at compile time. This intermediary data representation would be created through Collector::collect and then be used in encoder/pb.rs and encoder/text.rs with the former going from intermediate data representation to the current representation generated by post.

I'm afraid I'm not 100% get what the intermediary data representation means and how it works. Could you please give more detail? Thanks a lot. smile_cat

The below summarizes the data flow used on master as well as this pull request:

  1. Each metric (e.g. GenericCounter) tracks its state internally in some non-consistent memory format.
  2. On call to Collector::collect each metric returns its data in the uniform structure Vec<proto::MetricFamily.
  3. The Vec<proto::MetricFamily> is passed to Encoder::encode.
    1. The TextEncoder iterates the Vec<proto::MetricFamily>, encodes each and writes the corresponding bytes out to the writer.
    2. The ProtobufEncoder can work with proto::MetricFamily directly, encoding it into the writer.

The proto::MetricFamily data layout differs between protobuf and prost. Namely that the latter makes extensive use of Option. As you noticed in text.rs handling the case of None for each property is cumbersome and error prone, we loose the guarantees that the Rust type system gives us.

My suggestion is to introduce our own data layout which each metric type would return in their implementation of Collector::collect (step 2). This layout would enforce all Prometheus format constraints at compile time (e.g. each MetricFamily having a name thus name: String and not name: Option<String>). TextEncoder could directly work with this format. ProtobufEncoder would need to transform it into the proto::MetricFamily format, which is easy as e.g. going from String to Option<String> requires no handling of the None case.

Does the above make sense to you @Folyd? If so, what do you think?