GoogleCloudPlatform / opentelemetry-operations-java

Apache License 2.0
71 stars 38 forks source link

Contribution of the resource providers to opentelemetry-java-contrib repo #266

Closed SylvainJuge closed 7 months ago

SylvainJuge commented 11 months ago

Hi !

I just opened https://github.com/open-telemetry/opentelemetry-java-contrib/issues/1074 in the java contrib repo after discussing the topic in the Java SIG meeting.

Ideally we'd like to have the most common cloud resource provider implementations available, would you be interested in contributing your implementation currently in detectors/resources to the open-telemetry/opentelemetry-java-contrib repo ?

If yes, would you also be interested in being maintainers of this component within otel ? That does not sound like a very strong commitment and you'll benefit from frequent opentelemetry release schedule and automatic version upgrades. I can definitely be a second co-maintainer of this component within otel if needed.

I saw that this project currently relies on the now obsolete io.opentelemetry:opentelemetry-sdk-extension-resources:1.19.0 dependency, which has now been moved to opentelemetry-java-contrib repository, as a consequence replacing this will already create a dependency to the contrib repo. What I mean here is that ideally a single implementation in the contrib repo could easily be reused here without adding an extra repository as a dependency (just an extra artifact).

SylvainJuge commented 10 months ago

For now I see two potential aspects that could be problematic for re-use in OTel-based agents or SDKs:

dashpole commented 9 months ago

Hey @SylvainJuge, thanks for reaching out. Overall, we would be happy to contribute the resource detector here to the java contrib repo.

One of our original reasons for having it in this repository was to be able to run presubmit integration tests on GCP infrastructure to ensure the resource detectors work as expected. For other languages, like golang, we ended up splitting the detector into a core detection library (https://github.com/GoogleCloudPlatform/opentelemetry-operations-go/tree/main/detectors/gcp), and a contrib entrypoint (https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/detectors/gcp/detector.go). We can run integration tests on the core library to ensure the detection methods work properly, but the actual detector would live in the upstream contrib repository. Would something like that work for you?

From our side, @psx95 and @jsuereth would be code owners for the contrib detector, and @psx95 will work on the new detector in contrib, including working through the issues you mention.

SylvainJuge commented 9 months ago

I see, having integration test on your side definitely makes sense to ensure everything is working as expected in the target environment and to keep a short feedback loop.

If I understand correctly the plan you suggest involves:

From the perspective of a consumer of this GCP resource detector:

* : for the AWS equivalent it used to use the complete AWS SDK and is now rewritten with a few HTTP queries.

Another alternative could be to keep everything in one "library+resource detector" in the contrib repo:

But that second option would likely introduce more friction than necessary on your side if you need to change anything the library, thus the first is probably better here.

Ideally, without going too deep into the implementation details, the "library" could be just about parsing the metadata, without any dependencies. That would help for testing without HTTP communication and using it with another HTTP client (see https://github.com/open-telemetry/opentelemetry-java-contrib/pull/1061 for context).

dashpole commented 9 months ago

As a first step, @psx95 can work on a PR to extract the library into its own package here. Then, we can take a look at the dependencies, and see if it works, or if we need to explore other options.

The list of dependencies seems reasonable (at least no GCP SDKs, or anything), so hopefully it shouldn't be too hard:

https://github.com/GoogleCloudPlatform/opentelemetry-operations-java/blob/25e1b8868d3a9226ff432907aba25fc9ee90c4d9/detectors/resources/build.gradle#L18-L29

psx95 commented 8 months ago

I have a draft PR - https://github.com/GoogleCloudPlatform/opentelemetry-operations-java/pull/276 which refactors the platform detection logic from the resource providers and allows resource provider to rely on the library methods to retrieve the values for the attributes. (This is in-line with what our Go resource detector does)

However, which attributes to add to a particular resource after it has been detected would still be specified in the resource provider (proposed to go in contrib repo). Just wanted to call out the following effects of this approach -

  1. The support library does not need to take dependency on OTel APIs, which means the upstream library can be kept up-to-date with latest APIs without being hindered by the need to update support library dependencies.
  2. Our integration tests passing relies on correct detection logic (support lib) and correct attributes being present (upstream contrib repo).

Let me know if this approach works, and I can get toward finalizing the draft PR (notably, the unit tests are currently missing).

psx95 commented 8 months ago

@SylvainJuge would you mind taking a look at #276 - it splits our resource detection into 2 modules -

  1. support library: detectors/resources-support
  2. upstream detector: detectors/resources

Let us know if this split works for the upstream contribution.

SylvainJuge commented 8 months ago

Hi @psx95, thanks for making progress on this!

Once this is merged and released, we can already reuse the otel resource detector as-is in any opentelemetry distribution so this is great.

The next logical step could be to contribute the detector directly into the opentelemetry-contrib-java repo, this contribution would then have a dependency on the support library here.

Do you have ideas how to implement it ?

Maybe instead of a direct contribution of the detector to the upstream contrib repo, I think it could be possible to have a direct dependency on the otel detector in this repo, hence allowing reuse without any code duplication. With that approach, everything would remain in this repository and that would be simpler. In a sense the upstream contrib module would just define a dependency to a released version of this repository and the maintenance burden would be minimal.

Also, from the perspective of the opentelemetry users, having a split between the otel detector and the resource library could be useful but not strictly requires as we would always have a direct dependency on the detector.

psx95 commented 8 months ago

Hi @SylvainJuge, thank you for the review!

I was thinking of contributing the detector-resources module directly upstream. I was thinking this because -

  1. Contributing the detector-resources upstream would enable the module to be benefitted by the regular updates to the OTel dependencies in the project. Any breaking change(s) or use of obsolete OTel libraries would be caught & fixed promptly.
  2. It will be in-line with what we do with our go resource detector. (Not a major reason, but helps to keep things uniform IMO)

IIUC, the alternative you are suggesting is to add implementation in the upstream repo, which delegates all the logic/handling to the module in this dependency via a hard dependency (it would also be the only dependency). In this case, I feel that there could be potential discrepancies b/w the versions of OTel dependencies used in contrib, vs the module which would be maintained here.

cc: @dashpole @jsuereth

SylvainJuge commented 8 months ago

I agree with you @psx95 , the approach you are suggesting seems the most simple here.

However I wonder from your perspective how would the maintenance of the support library work, for example there is currently no direct consumer of this library without detector-resources, including tests.

What this means is that you will likely have to choose between:

psx95 commented 8 months ago

Hi @SylvainJuge, I was thinking something along the lines of your first suggestion - using a dependency on the upstream contrib-repo, while replacing the transitive dependency on support library with the local version - to ensure testing on the local changes made in the support library.

This would help us catch any issues that arise due to any change within the GCP metadata servers and we can promptly update the support lib.

Since the upstream repo now only takes care of the ResourceAttributes mapping, I feel that having proper unit tests with mocks to ensure correct mapping should provide us with good enough coverage.

psx95 commented 7 months ago

Created a PR in upstream open-telemetry java contrib repo adding the GCP Resource providers - https://github.com/open-telemetry/opentelemetry-java-contrib/pull/1162

cc- @dashpole, @SylvainJuge

psx95 commented 7 months ago

The linked PR is now merged upstream. Closing this issue as completed.