open-telemetry / opentelemetry-collector

OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
3.94k stars 1.32k forks source link

[chore][cmd/builder] Test for unreleased versions #10030

Closed evan-bradley closed 1 week ago

evan-bradley commented 3 weeks ago

Description

Add a test to help prevent our replace statements from going out of date and causing situations like https://github.com/open-telemetry/opentelemetry-collector/issues/10014. This is also probably just a decent test case to have since it's a syntactically valid but nonexistent version.

codecov[bot] commented 3 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

:exclamation: No coverage uploaded for pull request base (main@5c72c5d). Click here to learn what that means.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #10030 +/- ## ======================================= Coverage ? 91.68% ======================================= Files ? 363 Lines ? 16770 Branches ? 0 ======================================= Hits ? 15375 Misses ? 1056 Partials ? 339 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

evan-bradley commented 3 weeks ago

Would this test fail in a certain revision of this repo?

This test should only fail if our replace statements in the tests go out of date. It protects against failures in other tests that would occur at the first stage in the release process when the version is bumped when there are certain changes in the new release.

Here's the failure scenario:

  1. We are on version X, and make changes to the builder to include new modules that only goes into effect once version X+1 is released. This is necessary for when we make Collector API changes and want to maintain backwards compatibility with versions <X+1. We gate the changes by performing checks within the builder for the configured version of the Collector APIs to be used in the resulting binary.
  2. The release process kicks off, and everything is bumped to version X+1, so the builder now includes the new modules in the generated files. However, these modules are not publicly available, so attempts to grab them with go get in the builder tests will fail unless we have the necessary replace statements in place.

The goal of this test is to ensure that the replace statements are up-to-date by forcing this failure scenario. If any modules are missing, the test will fail because Go will attempt to grab the nonexistent versions. The statements can then be updated during the PR for the builder changes instead of in the middle of a release.

jpkrohling commented 3 weeks ago

The goal of this test is to ensure that the replace statements are up-to-date by forcing this failure scenario

Sounds good, but let me phrase the previous question differently: are we confident that this test will indeed fail on that scenario?

evan-bradley commented 3 weeks ago

Sorry, I had meant to rewrite the test yesterday but was not able to get around to it. The prior version of the test would catch most cases, but not all, so thanks for keeping us on track. I've updated the test to directly check the go.mod file for modules that are missing from our replace statements, so I feel confident now that we don't have any gaps so long as the whole module chain is represented within the go.mod file.

Unfortunately it is still possible to run into errors if the replace statements aren't used in a test, but this at least ensures the list is up to date.

evan-bradley commented 3 weeks ago

I was confused about this test, until I realized that this is testing the release process for Core/Contrib, while the builder is used beyond those two distributions.

This is specifically to aid in releasing the builder. It's only required because the builder is released at the same time all Core modules are released, if we released the builder after Core/Contrib we wouldn't need this.

We need to check that the go.mod files that other tests in main_test.go may generate don't use any modules that don't have associated replace statements, to prevent the tests from failing because they want a version that isn't yet available. We will usually only hit these failures during a release process, and that would be tedious to manually test for, so this test should automate that by checking we didn't miss anything.

I'll revise the comment on the test to be a little more concise. Unfortunately this is a strange edge case that we hopefully won't hit often, so it's difficult to quickly explain.