micrometer-metrics / micrometer

An application observability facade for the most popular observability tools. Think SLF4J, but for observability.
https://micrometer.io
Apache License 2.0
4.5k stars 993 forks source link

Replace the OpenTelemetry protobuf library #5658

Open GFriedrich opened 2 weeks ago

GFriedrich commented 2 weeks ago

Please describe the feature request. Get rid of the OpenTelemetry protobuf library as it is not intended for public use.

Rationale Recently https://github.com/micrometer-metrics/micrometer/issues/5493 was opened to express that some vulnerable protobuf version is used. The ticket was closed though, because the protobuf version is coming from the OpenTelemetry protobuf library. Now though, the OpenTelemetry maintainers decided that this library is not intended for public use (see https://github.com/open-telemetry/opentelemetry-proto-java/pull/20). Therefore I guess the only option here is to get rid of this library and generating the files for protobuf by yourselves.

jonatan-ivanov commented 2 weeks ago

Thank you for the issue!

It seems CVE-2024-7254 is about parsing (untrusted) data. I don't think Micrometer is affected since it does not parse any protobuf payload (deserializing), it does the opposite, it produces protobuf payload (serializing). From security scanners perspective, this seems like a false-positive to me. It also worth to mention that other than being affected or not, from Micrometer's perspective, this should not be a CVE since users should be able to upgrade the protobuf dependency (protobuf-java:3.25.5 should be fixed).

We need to discuss internally how to handle the new opentelemetry-proto guidance. According to it:

[...] please generate your own bindings, consulting grpc codegen and possibly build.gradle.kts

and as you mentioned, we should be able to simply generate the code from the proto specs using grpc codegen. Unfortunately, this is not as simple as it sounds. I asked around the OTel Slack, it seems that:

Please feel free to share if you have any idea/comment/question/etc.

patpatpat123 commented 2 weeks ago

Hello team,

Just wanted to highlight something the developers of the library are saying: From the developers of opentelemetry-proto-java, it seems that (I am just quoting):

unfortunately it looks like this artifact io.micrometer:micrometer-registry-otlp didn't follow [our guidance](https://github.com/open-telemetry/opentelemetry-proto-java#support):

We have no intention of eventually publishing stable artifacts. If you need guarantees, please generate your own bindings, consulting [grpc codegen](https://grpc.io/docs/languages/java/generated-code/#codegen) and possibly [build.gradle.kts](https://github.com/open-telemetry/opentelemetry-proto-java/blob/main/build.gradle.kts)

I am afraid there is a gap between two teams here.

On one hand (proto java team) say the library is not ready for production use

On the other hand (micrometer team) might not know about this (recent) information.

If this is the case, is there a way to replace this library with something more "production ready"?

Please let us know, as OTLP is a widely used library.

Thank you and good day!

GFriedrich commented 2 weeks ago

Hey @jonatan-ivanov,

thanks for getting back! First of all it is good to know that the vulnerability can be excluded here. Thank you for the info. Nevertheless though I want to say it is a rather bad developer experience that the first thing you see when using the library is a warning about a vulnerability. That simply gives a bad belly feeling when using it. I do understand that a solution comes with some complexities and I'm not asking for getting rid of it from one day to the other. But I'm asking for that some planning is started on how to solve this issue, because keeping it that way is probably the worst solution.

From an external point of view I see two options:

In any case thanks so much for looking into this 🙏

jonatan-ivanov commented 2 weeks ago

@patpatpat123 As I mentioned in https://github.com/micrometer-metrics/micrometer/issues/5659#issuecomment-2472174512, I think it worths to note that it seems the readme was clarified yesterday(7f3d19c) and a note about this was added earlier this year (e2efe5f). Before that, the readme only contained how to use the published artifact publicly available in Maven Central. We added OTLP support back in 2022 where there was no sign that this artifact should not be used. As of today, Maven Central does not have any note that users should not use these artifacts, neither the javadoc of the published sources. The artifact name does not indicate that it is not intended for eventual production use either and/or it should be only used by OTel test suites.

@GFriedrich My comment above about the CVE is only about the context (Micrometer should not be affected) and how to fix it if you need it to be fixed right now. I think we should fix this even if Micrometer is seemingly not affected by the CVE, the discussion has already started, we are looking into this. To improve the dev experience, could you please report this to your security scanner (it should not report this, it is false-positive)?

Either you decide together with OTel that it probably makes sense to have some Java artifact with the protobuf Java classes

This is mostly an OTel decision. I think a binary distribution (jar in case of Java) of the OTLP spec should be available for everyone. I already started a discussion with otel-java folks, let's see where this will go.

trask commented 2 weeks ago

@jonatan-ivanov I don't think this is a fair statement. The artifact version is (and has always been) tagged with -alpha. That is very standard way to indicate that an artifact is not indented for production use yet, including in semantic versioning "A pre-release version indicates that the version is unstable".

jonatan-ivanov commented 2 weeks ago

Hey @trask, there are a lot of things going on in the comments, could you please let me know which statement is not fair so that I can correct it?

If I need to guess you mean the one about production use ("not intended for production use")? I think I should have written "eventual production use"(?). I edited it and added a note about the OTel test usage to make it clearer, but that's what I meant (since that was changed in the commits I linked) not the -alpha suffix. That's what I was talking about in the previous sentence too ("this artifact should not be used").

Right now, the repo says there is no intention of eventually publishing stable artifacts. To me this means that it is not just not indented for production use yet but ever. I think this is a completely different topic than the -alpha suffix in the version number especially if I consider OTel artifacts that have ~"stable parts" and also have the -alpha suffix.

jsuereth commented 2 weeks ago

We https://github.com/micrometer-metrics/micrometer/pull/3129 back in 2022 where there was no sign that this artifact should not be used.

I believe this is where the -alpha version comment comes from.

Wanted to say something else though:

A big motivator in opentelemetry for its proto solution was actually increased performance directly serializing the intermediate storage classes to proto vs. having to memory-map/convert to the generated classes and then serialize. That is:

If micrometer is able to optimise using the same technique, it might be beneficial to everyone. I'm a bit overloaded, but could explain the basics on how this works if Micrometer wanted to create serialize-only templates from proto descriptions, a simplification of otel's code here: https://github.com/open-telemetry/opentelemetry-java/blob/main/buildSrc/src/main/kotlin/io/opentelemetry/gradle/ProtoFieldsWireHandler.kt

patpatpat123 commented 2 weeks ago

Hey @jonatan-ivanov ,

Thank you for following up on this. Would it make more sense to devise a strategy to decouple from its usage instead of trying to justify past, current, and future use of the library?

I think the micrometer OTLP community would benefit from that.

jack-berg commented 2 weeks ago

Any reason why micrometer can't just use io.opentelemetry:opentelemetry-exporter-otlp?

Micrometer could provide a mapping to the MetricData interface, and call OtlpHttpMetricExporter#export(Collection<MetricData>).

This would allow micrometer to leverage the high performance, hand-rolled serialization we use internally in opentelemetry-exporter-otlp, without having to worry about depending on io.opentelemetry.proto:opentelemetry-proto.

The only argument I can think of to use io.opentelemetry.proto:opentelemetry-proto instead of io.opentelemetry:opentelemetry-exporter-otlp, is that io.opentelemetry:opentelemetry-exporter-otlp has opinions about what HTTP client libraries can be used as the transport (OkHttp and JDK 11+ HttpClient). If micrometer really needs to support some other HTTP client library (I would challenge this but ultimately not my decision), then micrometer can generate their own proto bindings fairly simply as is done in opentelemetry-proto-java.

jonatan-ivanov commented 1 week ago

@jsuereth Ah, I see, by saying "there was no sign that this artifact should not be used", what I meant here was no sign that it should not be used at all (other than OTel test suits). To me this is a different topic than the -alpha suffix since right now, the repo says there is no intention of eventually publishing stable artifacts.

Thank you for the details about proto and the offer to help. As a short term solution, it seems there will be another release of proto-java that contains the CVE fix since it seems there will be some changes in the proto specs. As a long term solution, I'm planning to attend on the Spec SIG next week to see what can be done across the portfolio and if we can have a stable, easily consumable "binary" distribution of the OTLP types (does not need to be proto-generated but could be tested by proto-generated code, like what the SDK does with those types right now). Let's see where that discussion goes and if the optimizations you mentioned can be reused there, I'm also interested discussing them with you from Micrometer perspective.

jonatan-ivanov commented 1 week ago

@jack-berg I think that could be something we should look into. In Micrometer registries, users can inject their own http client (there is an interface for it) so the client aspect you mentioned is a valid concern. Also ThreadFactory but that might not be a concern. Dependencies could be also an interesting question, bringing in the whole OTel SDK in order to serialize OTLP is something I like to avoid.

jack-berg commented 1 week ago

Yup good points @jonatan-ivanov. There seems to be a tradeoff:

From a dependency size standpoint, we have: