srikanthccv opened this issue 2 years ago
I think it would be better to separate our semantic conventions code into a different repo so that we have an `opentelemetry-semantic-conventions` package that follows the releases of the spec, because:

Users run `pip freeze` to find out what the running environment of an application was; they don't expect to also have to dig through application code to find out which semantic conventions version was being used.

Something else: imagine a user writes `from opentelemetry.semantic_conventions.v_1_2_3 import X` today, with `opentelemetry-semantic-conventions==5.2.3` in their requirements file. Someone accidentally commits the deletion of the symbol `X` from `opentelemetry.semantic_conventions.v_1_2_3`. That commit gets merged and included in release `5.2.4` of our packages (including `opentelemetry-semantic-conventions`). The user updates to `opentelemetry-semantic-conventions==5.2.4`. The user still wants to use semantic conventions v1.2.3 to import `X`, but now it is broken.

In summary, using actual released packages with their corresponding versions protects the integrity of those versions, because it is guaranteed that the code in a specific version of a package will never change. Using directories named after versions is a fragile approach.
I like this idea of releasing `opentelemetry-semantic-conventions` when a new version of the spec is released. I do see some benefits to @srikanthccv's proposal in the following scenario:
I'm not sure if this is a great situation: it may mean that with multiple semconvs in use, applications will produce telemetry differently. This should be OK so long as schema URLs are set correctly.
I do struggle with the idea of having a continuously increasing package size. If this is the route the community wants to take, I'd recommend placing some kind of limit for how many versions back would be supported.
There is precedent for including multiple versions within a single package; the kubernetes client package does this (https://github.com/kubernetes-client/python/tree/master/kubernetes/client/api).
> I think it would be better to separate our semantic conventions code in a different repo so that we have an opentelemetry-semantic-conventions package that follows the releases of the spec because
I worry this sets an incorrect expectation that other SDK packages also follow and keep in sync with the spec release. Does any other SIG do this today?
> If users want to use another version of X, they update the version number of X in their requirements file, they don't update application or library code.
This assumes that a new semconv version didn't delete anything, which is not the case now. Take a look at the recent release: some attributes were removed from the HTTP attributes, which means the user might have to update the application code if they want to use the new version. In a sense, we are also the users (contrib repo); if we want to use the 1.13.0 version we must update the instrumentation library code.
> Accidentally someone commits the deletion of the symbol X from opentelemetry.semantic_conventions.v_1_2_3. That commit gets merged and included in release 5.2.4 of our packages (including opentelemetry-semantic-conventions).
No handmade changes will be done to this package. It's all generated from sem conv yaml files.
> I do struggle with the idea of having a continuously increasing package size. If this is the route the community wants to take, I'd recommend placing some kind of limit for how many versions back would be supported.
This is a valid concern and I think the idea of limiting the number of versions is a good suggestion.
> This assumes that a new semconv version didn't delete anything, which is not the case now. Take a look at the recent release: some attributes were removed from the HTTP attributes, which means the user might have to update the application code if they want to use the new version. In a sense, we are also the users (contrib repo); if we want to use the 1.13.0 version we must update the instrumentation library code.
The application code may or may not need to be updated when another version of the semantic conventions is used, but that is not the point here. The point is that with version-named directories, a dependency version is being specified outside the requirements file, in the application code itself.
> No handmade changes will be done to this package. It's all generated from sem conv yaml files.
That does not matter: with the version-named-directories approach, that generated code can still be modified by any commit.
> This is a valid concern and I think the idea of limiting the number of versions is a good suggestion.
Limiting the number of versions is a solution that introduces another problem: a user will not be able to use an old version after it is deleted. Having a separate package for the semantic conventions does not have the problem of growing code, and it also does not have the problem of having to delete old versions.
Isn't the idea behind `schema_url` about the ability for applications to use multiple semantic convention versions? How would that be possible if there is a separate package for each version?
@ocelotl I'm not following your suggestion here. Are you proposing to release a new, differently named package per semantic convention version, i.e. `opentelemetry-semantic-conventions-1.12` for v1.12 of the semantic conventions, and also keep it in lockstep with the spec version? How would the instrumentation hard dependencies work in this case?
@srikanthccv, @ocelotl's concern of:

> Limiting the number of versions is a solution that introduces another problem, a user will not be able to use an old version after it is deleted.

makes sense. I can totally see users setting up their application once and never updating the semantic conventions again as long as it fits their business use case. If these conventions get outdated/obsolete, this could be troublesome. However, as to @codeboten's point,

> If this is the route the community wants to take, I'd recommend placing some kind of limit for how many versions back would be supported.

maybe the above problem isn't as bad as we think, realistically, for users? Maybe the desire to have a stamped "stable" on our product is more appealing to users than having to update their code once in a blue moon? As long as we are transparent with our users and give them ample time/warning to upgrade (before deleting older versions of semconv), is that so bad?
> @ocelotl I'm not following your suggestion here. Are you proposing to release a new differently named package per semantic convention version? i.e. `opentelemetry-semantic-conventions-1.12` for v1.12 of semantic conventions and also keep it in lockstep with the spec version? How would the instrumentation hard dependencies work in this case?
No, I am suggesting to have our `opentelemetry-semantic-conventions` package follow this versioning scheme: `X.Y.Z.a`, where `X.Y.Z` is the version number of the spec and `a` is an additional number for our bugfixes.
@ocelotl
> If users want to use another version of X, they update the version number of X in their requirements file, they don't update application or library code.
Isn't this what we already do today? How would users be able to use semantic version X without having to update their instrumentation version as well?
Maybe I'm not fully understanding. Feel free to add this topic to Thursday's agenda.
> Maybe I'm not fully understanding. Feel free to add this topic to Thursday's agenda.
Sure, I will :+1:
In the SIG we discussed that the problem with the approach I suggest is that we expect it will be frequent for one instrumentation to pin its dependency to one version of the semantic conventions and another instrumentation to another version, so that the user will have a problem when trying to install both in the same virtual environment (I specifically mention frequent here because the same can happen with other packages like `opentelemetry-api`, `opentelemetry-sdk`, etc., but we don't expect that to happen frequently).
There is another solution that has all the advantages of the approach I suggest without this problem, and that is to make a new package named `opentelemetry-semantic-conventions-...` with every spec release, where `...` is some versioning schema that indicates that the semantic conventions are unstable and that the package corresponds with the released version of the spec (something like `0.A.B.C.D`, where `A.B.C` is the spec version and `D` is a bug fix number, maybe?).

Another advantage of this approach is that if the semantic conventions in the spec someday become a stable thing that follows semantic versioning, we can just start releasing `opentelemetry-semantic-conventions` with normal semantic versioning.

The disadvantage of this approach is that it is horrible.

But it works. :shrug:
WDYT @srikanthccv @lzchen @aabmass ?
👋 New here, so feel free to ignore this... there is lots of context I don't have!
FWIW, I have used the "subpackage-versioning" strategy (`from foo.v1_2_3 import bar`) within private packages to cushion against version conflicts while providing backward compatibility. It isn't the strangest thing I've seen, although I agree it also isn't the most straightforward. All told, the Go team's approach linked in the issue description does seem to make sense given the forces at play.
I think creating an entire new package for every version has a few downsides:
- Need to add entirely new dependency for every update
- Still need to update library/app code if you want to move to newer version
- Clutters the package namespace
- Makes finding the right package more difficult/more typo-squatting options
- PyPI is not designed for sorting/comparing across packages in this way/more difficult to know what "latest" is
@ocelotl
> There is another solution that has all the advantages of the approach I suggest without this problem, and that is to make a new package named opentelemetry-semantic-conventions-... with every spec release

> Are you proposing to release a new differently named package per semantic convention version?
So is your new proposal aligned with my question from before?
> 👋 New here, so feel free to ignore this... there is lots of context I don't have!
>
> FWIW, I have used the "subpackage-versioning" strategy (`from foo.v1_2_3 import bar`) within private packages to cushion against version conflicts while providing backward compatibility. It isn't the strangest thing I've seen, although I agree it also isn't the most straightforward. All told, the Go team's approach linked in the issue description does seem to make sense given the forces at play. I think creating an entire new package for every version has a few downsides:
> Need to add entirely new dependency for every update

This is normal; every time a user wants to update a library, they will add a change in their `requirements.txt` file.
> Still need to update library/app code if you want to move to newer version

This is true.
> Clutters the package namespace
>
> Makes finding the right package more difficult/more typo-squatting options
This is true.
> PyPI is not designed for sorting/comparing across packages in this way/more difficult to know what "latest" is
This is true as well. I am proposing a horrible solution, indeed.
@ocelotl
> There is another solution that has all the advantages of the approach I suggest without this problem, and that is to make a new package named opentelemetry-semantic-conventions-... with every spec release

> Are you proposing to release a new differently named package per semantic convention version?
So is your new proposal aligned with my question from before?
That's right, it is.
> I do struggle with the idea of having a continuously increasing package size. If this is the route the community wants to take, I'd recommend placing some kind of limit for how many versions back would be supported.
@codeboten I agree that would be nice. I see two ways to do this without breaking semantic versioning, e.g. keeping `opentelemetry-semantic-conventions` as `<1.0`, doing minor version releases when we remove something and otherwise patch versions.

Also, re package size: I did a quick test by copying the semantic conventions 200 times in the src tree to get some rough estimates. `src/` went from 116K -> 16M, and the dist artifacts went from

```
28K dist/opentelemetry_semantic_conventions-0.34b0-py3-none-any.whl
24K dist/opentelemetry_semantic_conventions-0.34b0.tar.gz
```

to

```
3.8M dist/opentelemetry_semantic_conventions-0.34b0-py3-none-any.whl
3.3M dist/opentelemetry_semantic_conventions-0.34b0.tar.gz
```

This is similar to the `grpcio` wheel package size, for comparison. But not great.
This could be an alternate solution to avoid backward-compatibility issues when fields are removed: https://github.com/open-telemetry/semantic-conventions/issues/214
Do we still want to explore this now that we have agreed that we will be carrying out the migration plan for the OTel semantic conventions? https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/http.md
To refocus this conversation: the goal of this issue is to mark the `opentelemetry-semantic-conventions` PyPI package as stable, with no expectation of ever having a major version bump. The benefits are:

- Instrumentations can depend on `opentelemetry-api~=1.0 opentelemetry-semantic-conventions~=1.0`. The OpenTelemetry dependencies will never cause e.g. the Flask and requests instrumentations to conflict, as pip may choose the newest version of `opentelemetry-semantic-conventions`.
- Instrumentations can implement `OTEL_SEMCONV_STABILITY_OPT_IN`, as they can use the constants from any spec version.
The idea is to create a new folder for each released version of the semantic conventions, and instrumentation packages or end users can choose which one to use. The folder structure will look like the following:
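A hypothetical sketch of such a layout (module and version names invented here, modeled on the Go `semconv` package linked below):

```
opentelemetry/semconv/
├── __init__.py
├── v1_12_0/
│   ├── __init__.py
│   └── trace.py
└── v1_13_0/
    ├── __init__.py
    └── trace.py
```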
It is convenient to use this because we don't necessarily need to update the instrumentation every time there is a spec release. This will enable us to mark the package as stable, since there will always be a backward-compatible sub-package, but the downside is that it can grow big over time. The current size of the package seems to be around 26 KB: https://pypi.org/project/opentelemetry-semantic-conventions/#files. Original inspiration from the Go SDK: https://github.com/open-telemetry/opentelemetry-go/tree/main/semconv