Closed carlosalberto closed 6 years ago
See #253
@carlosalberto I haven't had time to review the whole discussion yet, but does this fully incorporate @raphw's feedback from before?
@tylerbenson Some of it, but we are still not taking into account two things that were requested:
1) Not telling in advance the output size (Tracer.inject()
)
2) Not over-the-wire support, but instead, in-memory buffering is expected (just like HttpHeaders
and TextMap
, where you write everything to some in-memory buffer and after that you pass it to the transmission and let the protocol handle it).
I'd expect, in turn:
1) Some (or a few) proof of concept implementations.
2) Early/experimental Tracer
s support and their feedback on this.
I can only repeate my critique of this API:
Byte buffers are meant to interact with data either on or off heap. This makes byte buffers useful if a server can allow the Tracer to directly serialize or deserialize data from for example a TCP buffer without the JVM needing to also allocate these byte on the Java heap what is expensive in terms of GC. This idea is not reflected by this API proposal and only puts additional weight onto the tracer as most implementations will look like in this example where buffers are controlled by the tracer. A NIO-friendly API must allow for providing ByteBuffer
s from outside of the tracer.
Byte buffers are typically used in the context of asynchrous I/O. An important factor with NIO is that an external buffer such a native TCP storage might not be able to hold all the bytes that belong to the baggage information at once. If a server, from the outside, provides a buffer to the tracer, it cannot normally guarantee that all data is already available. It would rather ask the tracer to do a partial deserialization/serialization to clear the buffer and then suspend the current channel to allow the buffer to fill. In the meantime, it would move on to the next channel from the very same thread to avoid blocking the thread. This is the core idea of NIO and selectors. With OpenTracing being a synchronous API that does not allow for such partial deserialization, the server would need to copy the data into a continous heap array and provide this array once it completed and wrapped as a byte buffer. The current API is therefore equivalent to one that is operating on byte arrays.
There is no binary contract implied. Binary protocols that allow embedding random bytes typically add a marker byte to indicate the start and end of a custom segment. This way, a server can accept baggage data from a client without even having a tracer installed by just cropping the segment. By allowing to write to byte buffers explicitly, the user of a tracer has no chance to escape any payload bytes that equal the marker byte, thus breaking the protocol. The only way to work around this would be to ask the tracer to write to an intermediate buffer and escape the data from there. This is breaking the core assumption of NIO.
None of these problems occurs by offering a stream-based approach:
But even with streams: if the server uses a different tracer implementation than the client, none of this will work to begin with. There is no serialization format defined and if I write the data as BSON but the server expects protobuf, this would result in garbage data being reported. Even worse, this could maybe turned into an attack vector if I find a way to exploit the servers deserialization approach, maybe by forcing it to overconsume data from the buffer what corrupts the remainder of the stream what can have unwanted effects.
@raphw What would you then suggest? Maybe I'm missing something, but writing instrumentation to support propagation over a generic binary interface without breaking the protocol when only one side is instrumented seems to be a significant challenge. Perhaps it would help if we had some specific use cases. What protocols/frameworks are expected to use this binary format? (perhaps protobuf, thrift, grpc?) Maybe it would be better to define those propagation interfaces specifically like we do for http, instead of lumping everything into a binary format? Which formats/protocols are more common when working with NIO? What frameworks should be explored that would highlight these problems?
(Forgive me for my ignorance. I've written a fair bit of instrumentation, but relatively little for systems without generic headers/metadata/parameters that can be used for propagation details.)
@raphw @tylerbenson I'd love to see the requirements for binary format expressed as a set of tests/examples. This worked very well for designing context propagation, where the requirements were similarly nuanced. If we keep discussing in english, we'll probably never resolve this. :)
Binary Protocols typically include such optional sections either by:
The first option requires to know the binary size in advance, the second option requires a way to manipulate any written byte before it is written to a buffer. My original suggestion using streams would allow for that.
Hery everybody - sorry for the delay. Recently I took up on testing the current BINARY
proposal to do tracing for Cassandra on the server side (through a plugin), and the experimental code consuming this specific BINARY
support exists in https://github.com/carlosalberto/java-cassandra-server
Please feel free to test it out. Cassandra itself takes a ByteBuffer
as payload for outgoing query requests, and receives a ByteBuffer
from the server side as well (described in the mentioned repository containing the Cassandra plugin).
I have the impression most of the times, as the SpanContext
tends to be small, the payload will be buffered and transferred in a single step.
That being said, I do realize that @raphw usage may not be supported with the current approach, and that may need an additional format (maybe something like BINARY_STREAMING
).
Let me know ;)
@tedsuo @tylerbenson @yurishkuro
Trying to re-take the discussion: I took a little look at how Netty works, and ended up gathering some overall issues that have been floating during the design of this format. I'm listing them here to help things get moving.
InputStream
/OutputStream
API - however, synchronously injecting the SpanContext
would work just fine, specially taking into account the context tends to be small.ByteBuffer
, and in Thrift you have a binary
type. As you need to buffer the payload in memory before passing/setting it, a stream API wouldn't help much here.InputStream
/OutputStream
API could also let the user know how much space will be used, byte-wise (as a prior method call to do the actual write()
).InputStream
/OutputStream
API would expose too many methods that would be useless for injection/extraction - mostly in the InputStream
(close
, mark
, reset
, skip
, etc). This would go against a clean API to some degree.One thing I wonder is how much time will be spent writing/doing instrumentation for low-level stuff (such as the one exposed by Netty), and how other times it will be done for http or frameworks abstracting the transmission (for these ones we don't need the stream-based API).
Thoughts? Waiting for your feedback @raphw
the more I think of it, I believe that it would actually be the easiest to extract a TextMap
and push it into the stream depending on the underlying binary protocol. Until there is an actual use case, maybe it makes sense to table the binary extraction?
@raphw Thanks for the feedback. Well, I'd say using OT for instrumenting Cassandra on the server side is already an actual case (albeit a simple one ;) ).
@yurishkuro Any opinion on this?
I agree that this is not a pressing matter (strictly speaking "binary" can be just a text_map serialized in a way specific to the protocol being instrumented, like Cassandra), but would be nice to resolve.
Closing this PR in favor of #276 (which was just merged).