spandex-project / spandex

A platform agnostic tracing library
MIT License
335 stars 53 forks source link

Remove Optimal #124

Open kamilkowalski opened 3 years ago

kamilkowalski commented 3 years ago

Overview

Fixes #100 by removing opts validation entirely, which is my preferred solution to the Optimal performance problem. Inspecting other tracing libraries for dynamically typed languages, I didn't see any performing validation of options - it always carries a performance penalty and that is something we should avoid in a tracing library.

That said, I'm opening this as a draft - it's quite a change and it is breaking. I'm open to discussing other options.

Why not use nimble_options?

An alternative to this approach would be to use something like dashbitco/nimble_options as suggested by @zachdaniel, but I still think that's a performance penalty added to every trace I'd like us to avoid.

If Spandex's interface was more conservative and didn't allow for modifying every option of the tracer when starting a trace/span, then we could allow passing most options only on tracer initialization, validate them on application startup, then skip validation of the limited set of options passed when creating a trace/span. That's an option I think could work.

Why not make optimal optional?

Another option mentioned by @GregMefford was to make optimal optional, and if it's detected - enable opts validation. The reason for this would be to help developers during test/dev to configure everything appropriately, then remove optimal to ensure maximum performance.

That's a practice I haven't seen previously, and I don't think it would be widely used - after all, one could as well simply test the deployment and its integration with Datadog. At our company we don't even enable Spandex in dev/test - we only do that in a test deployment.

zachdaniel commented 3 years ago

My only hesitation about fully removing option validation is how many people already have trouble knowing if the values they provide are valid, so removing more validation would likely exacerbate that issue.

However, I think that you're right and that we should just remove this. What should be added, I think, is a testing api adapter for Datadog that runs synchronously and validates traces, raising errors if it would send something invalid (e.g no service, invalid metadata) is provided. That has the added benefit of not having to disable the tracer in test, but instead just configuring your adapter. Seems like it would be the best of both worlds.

zachdaniel commented 3 years ago

Not suggesting you should add that as a part of this effort, just that we should do that eventually to solve the general problem.

kamilkowalski commented 3 years ago

Such a neat idea with the adapter! :clap:

kamilkowalski commented 3 years ago

If we're all for it, I'll test this change with one of our services and open for review once it's confirmed working.

kamilkowalski commented 3 years ago

@zachdaniel thanks for your patience - I didn't manage to test the change before going on holiday, but I've plugged it in today on a test environment at Fresha and everything seems to be working fine - no regressions, and I see traces being reported to Datadog successfully.

That said, I've encountered one problem - spandex_phoenix and spandex_datadog both depend on optimal without requiring it in deps - so Optimal was available only as a transitive dependency of spandex. Should we release this change to Spandex as a minor version bump, it will break spandex_phoenix and spandex_datadog - it's what happened to me before I switched both libraries to a version that didn't use Optimal.

Anyway, I'm opening this as a regular pull request since I've verified it to be working - what to do with the change is another matter that I'd appreciate your opinion on.

zachdaniel commented 3 years ago

@GregMefford should we wait for a major version bump for this then?

GregMefford commented 3 years ago

I suppose we could just release a major version whenever we want to. I need to find some time to spend thinking about and trying out the code in this PR to understand what the impact would be for an actual user... but if it works exactly the same except it throws less errors, that doesn't feel breaking to me. If you just have errors getting thrown and now they get thrown in a different way because we stop validating errors... their code already wouldn't have worked so how much effort do we need to invest to make it continue to break in exactly the same way. 🤔

zachdaniel commented 3 years ago

The main issue is that they would update spandex and nothing would tell them that they need to update spandex_datadog. But they do, because by updating to spandex they no longer have anything in their mix.exs file that depends on optimal, but spandex_datadog still calls to Optimal

GregMefford commented 3 years ago

Oh that's a good call-out I wasn't aware of. spandex_datadog should directly depend on optimal if it does in fact depend on it...

GregMefford commented 3 years ago

We could release a spandex_datadog version that either removes optimal or declares the dependency though, and then release a spandex version that depends on at least that version of spandex_datadog.

zachdaniel commented 3 years ago

Hmm....can you say "if you have this dependency, it must be more than x"? I guess thats just {:spandex_data, ">= x.x.x", optional: true} right?

zachdaniel commented 3 years ago

If so then I'm down with that plan 👍

GregMefford commented 3 years ago

Yes, that's how optional deps work. 👍 So that's one way to solve it. It's also possible to say {:spandex_datadog, "~> 1.2 or ~> 2.0"} if you want to allow either 2.x or 1.y where y must be at least 2, so 1.1.0 would not be allowed.

kamilkowalski commented 3 years ago

I'm wondering whether that's the correct approach because adding spandex_datadog to spandex's deps would look like a circular dependency - wouldn't dependency resolution complain about it? Also, spandex_datadog wouldn't be used anywhere in spandex so it's not really a dependency - the usual case for optional deps that I know of is when a library extends its functionality if a given dependency is present, which isn't the case here. Do you know of a precedent where optional deps are used in the way you're suggesting?

zachdaniel commented 3 years ago

If you try using it yourself you'd see the problem. spandex_datadog has a call to Optimal but doesn't include Optimal in its deps. So if you just upgrade spandex which removes optimal, your application would not compile. (You'd also probably have to mix deps.unlock) something

GregMefford commented 3 years ago

Oh ok I'm tracking then. That's a bug in spandex_datadog though, so I agree that we should avoid a circular dependency to solve that problem.

zachdaniel commented 3 years ago

The only issue is that, without the optional dependency on spandex_datadog, nothing will really tell anyone that they need to upgrade, it will just fail to compile and then they'll have to figure it out from there. We could potentially publish a new version of spandex_datadog that doesn't use optimal (or that does and adds it to its deps for now) and then retire the current version. https://hexdocs.pm/hex/Mix.Tasks.Hex.Retire.html

That should produce a warning that people can read and we can add a message so we could tell them what to do.

kamilkowalski commented 3 years ago

@zachdaniel retiring might be a good compromise. In any case, we first need to ship spandex_datadog and spandex_phoenix without Optimal - I'll get to those PRs first (I have the code ready too).

hkrutzer commented 3 years ago

How is this PR doing? 🙂

dynaum commented 2 years ago

@kamilkowalski would be amazing to add the service_version here too! 😍

koudelka commented 2 years ago

i'd also love to see this merged ;_;