Basic API Discussion - Githubissues

daschl commented 6 years ago

Since #1 went a bit off the rails towards API, let's continue here.

stefano-pogliani commented 6 years ago

Sorry, tested the links this time :-)

Generics: https://play.rust-lang.org/?gist=6b12fae8f43c613428e19e4d9e66dc00&version=stable
Associated items: https://play.rust-lang.org/?gist=4f11c196923fea26f0d117138ca3e21a&version=stable

daschl commented 6 years ago

Agreed, I think you are running into the classic rust issue "different trait impls in the same match". I've been bitten by this myself in the past on other projects - I asked in the rust channel for some advice and they came up with a couple of approaches:

box it all: https://play.rust-lang.org/?gist=9bb2cf1baba46b77eea22bd7a9e19807&version=stable
don't store it in a variable: https://play.rust-lang.org/?gist=029f3e5a61072314ced47e632b1028ce&version=stable
delegate into monomorphization: https://play.rust-lang.org/?gist=34dc97cb3d50611c955f97da5f5b424d&version=stable

I'm still not sure which would be the best approach, but I hope we can come up with an approach where we don't have to box it all the way down in the API itself, since then we are requiring all downstream implementations to perform heap allocations (especially since most of the time you only have one impl that you are going to use I think).

stefano-pogliani commented 6 years ago

I am still getting used to rust so I'm likely to be thinking of it in incorrect terms.

I think the "box it all the way down" was my solution. Would be nice to find a better one :smile:
Not storing it as a variable seems like a non-option: can we really build large systems without ever storing the tracer as a variable?
Is monomorphization not the same as never storing it as a variable?

I see your point about avoiding boxes to avoid heap allocations. My concern is on the other hand having to force every function/method/struct/trait that wants to interact with the tracer or pass a span/span context along to be generic.

I wonder if we can come up with an API based on associated types for applications that want to avid heap allocation and a boxing wrapper on it for applications that rather support runtime tracer selection. This seems to work but it is a bit convoluted: https://play.rust-lang.org/?gist=3262023bca24e6369d87a745a8f724bc&version=stable

Would something like this help us? Would we want to do something like this?

daschl commented 6 years ago

@stefano-pogliani yeah, one thing to keep in mind is: most of the time, I think people only want one tracer impl which is the one they use throughout their infrastructure. So I think we should optimize for the common case and then think about, maybe in the util package ? provide a boxed implementation of the tracer (or a similar solution) which helps with this case.

stefano-pogliani commented 6 years ago

I'm probably thinking of it from a too "opsy" point of view so I have a few questions to understand the goal better:

Who is the end user?
How many software form how many authors should traces span?

@stefano-pogliani yeah, one thing to keep in mind is: most of the time, I think people only want one tracer impl which is the one they use throughout their infrastructure.

How do (infrastructure?) software developers know what tracer will their user's stacks be using?

And my answers (which hopefully will help understand my point) are:

The end user is whoever looks at the tracing data of a group of systems spanning different languages.
A trace is an end-to-end view of an operation right? From user interface to database software. Forcing the selection of the tracer at compile time to leads to inter-operability nightmares: do I need to re-compile my web-server (say nginx), rust application and all dependencies (say https://rocket.rs/), and database software (say mysql) because each author prefers a different tracer by default?

I know I am being painfully fixated on this point (and I'm sorry for that) but the complexity of supporting runtime selection of the tracer in rustracing is the reason I have started my own project (which I would have avoided otherwise).

stefano-pogliani commented 6 years ago

I guess my point is: regardless of how we provide a runtime tracer selection, can this be a first-class citizen alongside the API itself (even in a opentracing-util crate) and not a second-hand feature forced on at the end of the API design process?

I'm happy to be responsible for keeping this wrapper layer (or whatever it is) up to date as the API evolves.

daschl commented 6 years ago

@stefano-pogliani totally agreed that we need to keep this as a first-class feature.

stefano-pogliani commented 6 years ago

Well with that said :smile: shall we look at the API?

I followed a bottom up approach following the opentracing specification:

SpanContext trait (only support for iterating over baggage items in the base trait).
Span trait (with logs and tags types).
Tracer trait (with carriers and formats).

@daschl do you think this approach would work? Shall I start a PR with a simple SpanContext trait definition?

daschl commented 6 years ago

yes let's start with the SpanContext. From the call last week in the opentracing group the word basically got out that java is currently the most up-to-date one and the other bindings will be updated. So let's take the java one as reference.

I created #6 to track the SpanContext, feel free to take it up. I also have some related code which should be easy to work through (the Tags enum and the Fields enum) #7

rbtcollins commented 6 years ago

One thing that doesn't make sense to me - the drafts you have seem to have everything on heap and consumed via trait objects, which is undesirable for high performance stacks. Rust's zero-cost idioms should be flexible enough to not require this. I'm going to guess that the underlying desire is to have purely runtime hooking-in of new drivers.

Currently most things in the rust ecosystem are designed to be built locally by the apps using them: the pattern with cargo is to download source and build an optimised-for-your-compiler-and-settings build, rather than 'here have a binary blob that has the right entry points'.

So I'd suggest we should be saying that that would be the use here (which is how e.g. rustracing_jaeger works). And - if we want to allow a binary distributed application to have binary-injected tracer implementations, we write a specific 'binary driver' backend that would sit alongside native 'jaeger', 'zipkin', 'lightbox' etc tracers, and take care of the indirection itself. A similar thunk could wrap any native tracer up for use via the binary driver.

That would then give the following scenarios:

open source app built for distribution by e.g. Ubuntu: build with the binary driver, distribution also ships all the open drivers it cares for in binary thunks.
open source app upstream: uses features to select the concrete driver to compile for
proprietary app without source accessibility: build with the binary driver
in house app (the common case IMO): use features to build for a specific concrete driver, or the binary driver if using a binary-only driver.

rbtcollins commented 6 years ago

@stefano-pogliani regarding the interop issue: for an entire ecosystem to switch over to a new implementation will require changes end to end - expecting otherwise is to ignore the realities. Consider, for instance, that even if everything is dlopened and in just one language - containers won't have the new implementation available to dlopen until that container is rebuilt. And in a large enterprise noone is going to ship a new thing willy-nilly.

In fact, wearing a project mgmt hat for a second I'd expect migration from tracer A to tracer B to look like this:

Bring up the new tracing system.
∀ languages in use write a Y adapter that will trace to two backends at once (perhaps halve the sampling rate to allow for overheads)
When all critical systems are reporting to the new one successfully notify teams they can contract
teams reconfigure to run just the new tracer
delete the old tracer backend components

Doing a coordinated atomic pivot to a new tracer would be hard with just one service that runs processes. Doing it for hundreds or thousands of microservices is going to be a nightmare.

So for the question of 'but everyone has to do something to use a new tracer' - my expectation is yes, yes they will. Should we make that as easy as possible? Sure. But lets not compromise on the common case - which is steady state, one tracer in use system wide, and that should be reliable, debuggable and fast.

stefano-pogliani commented 6 years ago

Hi @rbtcollins,

I believe we agreed with @daschl that the API will be based on generic traits and not trait objects. This will allow for the use case of programs picking a tracer at compile time and have no heap costs for tracing. A separate utility would make use of trait objects to allow opting in to runtime tracer selection, at the cost of heap allocations.

For in-house applications this would add no value but for "out of the box" applications (e.g. database software) forcing every user to re-build the entire system with their tracer of choice seems ... harsh.

My original use case (which rustracing could not easily accomodate IMHO) is an infrastructure tool. In that case the list of available tracers is compiled in the final binary and is fixed (e.g. zipkin and jaeger are supported but to add lightbox the software would need to be re-compiled). The runtime aspect would only be the selection tracer to "activate" (Alice's company uses zipkin while Bob's uses jaeger, neither wants to re-compile third-party software or run multiple tracers for them). It was never my suggestion that we would allow dynamic linking of tracers in any way.

As for the interop issue: changing the tracer was not the use case I had in mind. The issue is allowing tools and applications built with OpenTracing to integrate with end-users' ecosystems without having to recompile everything to their needs. While it may not be a big deal for enterprises to recompile their postgresqls and mongodbs (especially for those enterprises that build their database software) it is a big cost for small companies to do so.

I'm working on a PR for the SpanContext trait which I am hoping can show how both efficient, compile-time, tracer selection and heap-based, run-time, tracer selection live side-by-side.

rbtcollins commented 6 years ago

Hey, ok cool thats good, sounds like my concerns are addressed.

With respect to the infrastructure tool - you mean you want a precompiled binary that supports a fixed set of tracers at runtime? that sounds like either internally-boxed or an internally routing implementation of the traits would do : both of which should fit easily as layers on the API rustracing has. I'm likely missing some nuance though!

I didn't mean to suggest that it was easy for anyone to rebuild everything - far from it. What I meant to suggest was that whatever the usual integration method for a particular component was, that that would be sufficient because there are always additional layers and vetting and coordination that take place in production environments.

ddcprg commented 3 years ago

Is this project still alive?

stefano-pogliani commented 3 years ago

Hi @ddcprg.

I believe https://opentelemetry.io/ is a new iteration on OpenTracing and OpenMetrics as an all in one. They also have a rust sdk: https://opentelemetry.io/docs/rust/

daschl commented 3 years ago

Indeed, OpenTelemetry is the way to go these days and the rust implementation is a good working progress over there.

opentracing / opentracing-rust

Basic API Discussion #2