open-telemetry / weaver

OTel Weaver lets you easily develop, validate, document, and deploy semantic conventions
Apache License 2.0
59 stars 23 forks source link

Validate that Weaver is compatible with older versions of the registry #481

Open lquerel opened 5 days ago

lquerel commented 5 days ago

Currently, Weaver is compatible with registries from version 1.22 to the most recent one. However, there is no specific test dedicated to validating support for older versions. It would be beneficial to add a test modeled after the one that verifies the registry check command (see tests/registry_check.rs).

This test should execute the registry check command on each archive of the registries, starting from version 1.22 up to the latest. To avoid introducing external dependencies on content from GitHub in our tests, I propose copying these archives into a Weaver directory and performing the validation on these local archives (Weaver already supports this mode).

Of course, this means maintaining these archives within Weaver. If there’s a more dynamic way to achieve this without adding external dependencies, it should definitely be explored.

In any case, this test should fail if one or more registry checks fail. Ideally, the test should return the version numbers of all registries that could not be validated.

jerbly commented 4 days ago

What is the need for being backwards compatible?

My understanding is that the schema defined in https://github.com/open-telemetry/weaver/blob/main/schemas/semconv-syntax.md is versioned via git tags/releases. To avoid polluting the code with a lot of backwards compatibility logic and tests, perhaps the user should use the appropriate tag/release of weaver for the version of schema they need?

If we take this approach, there is already quite a bit of code and tests which could be removed rather than maintained.

lquerel commented 3 days ago

The main need for supporting multiple versions of the registry will become significantly more important when we support the multi-registry scenario (which is our next focus). In this context, it will be difficult to get everyone to agree on the schema versions of different registries. Let's say the beginning of multi-registry support corresponds to semconv registry 1.32. Subsequently, people start creating registries that extend 1.32 and others that extend 1.33. In this case, Weaver will need to be able to read these registries even if they follow different schemas.

jerbly commented 3 days ago

I see... Perhaps this task would be simpler and more accurate if the standard included a schema version declaration at the top of each yaml semconv file. Then weaver could use this to ensure compliance with that schema version. This would allow for clear branches in the parsing code.

Without an explicit schema version declaration the parser has to "guess" compatibility, the code becomes more tangled and the compliance is weaker. In effect you're checking compliance against a range of versions.