Simple layer on top of ByteBuffer for BINARY format.

carlosalberto commented 6 years ago

Provide a simple layout to let users get/set the used ByteBuffer, as well as hint them about the ByteBuffer length in the case of injection (as opposed as the current approach, where they need to pass a large-enough ByteBuffer when calling Tracer.inject()).

Observe this is merely a thin layer on top of ByteBuffer, but with the hint for letting the users know the required length of the buffer used by inject().

coveralls commented 6 years ago

Coverage increased (+1.3%) to 82.828% when pulling 31e458b2db817d34ba6f44df73d8c785243a6c4b on carlosalberto:binary_format_proposal_simple into 4c5dfb5010daf32901c57fbc74b3728d603148a8 on opentracing:v0.32.0.

carlosalberto commented 6 years ago

The code was updated to show the suggested improvements @yurishkuro Let me know ;)

carlosalberto commented 6 years ago

@yurishkuro Good point (on the injectionBuffer instead of injectBuffer) ;)

Updated.

carlosalberto commented 6 years ago

@yurishkuro My impression is that nobody has been really using this API so far - still, I think it would be nice to ask on Gitter (at least), in case anybody was (if that's the case, we should definitely deprecate what we have now, etc).

hypnoce commented 6 years ago

I believe this interface puts

more work on the instrumenters side, since it now has to handle the lifecycle of a ByteBuffer and an additional method to implement.
more work on the tracer implementers with an additional method setInjectionBufferLength to use. Also, it does not have the hand on the kind of buffer being created.
does not solve the concerns of @raphw

The previous interface where ByteBuffers were passed to the carrier did solve the first 2 items.

What do you think ?

carlosalberto commented 6 years ago

@felixbarny Oh, if Instana is using BINARY, then we definitely add a new format and deprecate the current one.

On @raphw and his feedback, I'd suggest reading, for historical context:

99 as the initial need for users to be able to not pass a large enough ByteBuffer instance for Tracer.inject().
223, as the original PR where we tried to have a simple streaming API, similar to what Go does/has at this moment (and which was later reverted).
253 which is the Issue referencing all the effort and iteration and considerations (including this PR and #252 . This can be important as @raphw mentioned that at some moment (in #252):

the more I think of it, I believe that it would actually be the easiest to extract a TextMap and push it into the stream depending on the underlying binary protocol. Until there is an actual use case, maybe it makes sense to table the binary extraction?

I had previously in the same thread mentioned the advantages and disadvantages of what the options we had at hand - which leads us to have this simple approach (which I mentioned later on as well):

definitely we need a simple, working approach for now (needed, for example, for instrumenting Cassandra on the server side). @raphw needs a more advanced, streaming-based approach that we could add later on, definitely, once we hit that need ;)

I'd urge you to read that to understand the full picture, so the situation is clear, and why we will go with this very PR for the time being. Else, let us know on your take (either for this PR still, or a future streaming based API ;) )

carlosalberto commented 6 years ago

@hypnoce Hey, setInjectionBufferLength() serves for users asking to have a variable-size ByteBuffer approach (see for example #99).

As the other questions (will write my answers here, but it sounds we should put them in the docs as well ;) )

more work on on instrumenters side, since it now has to handle the lifecycle of a ByteBuffer and an additionnal method to implement.

Related to the previous point - and users dont have to handle the lifecycle of a ByteBuffer - we simply provide a default adapter for users to want to have a new ByteBuffer created for simple calls (maybe for MockTracer). In a real world scenario, the user would have a ByteBuffer pool and have a Binary implementation like this (snippet):

class CustomBinary implements Binary
{
  public void setInjectionBufferLength(int length)
  {
    this.buffer = CustomByteBufferPool.getWithLength(length);
  }
  public ByteBuffer injectionBuffer()
  {
    return this.buffer;
  }
}

more work on the tracer implementers with an aditionnal method setInjectionBufferLength

Implementing an extra method is less work than implementing, for users, a full OutputStream or InputStream that has been described in #243, for example :) Also, it serves a specific purpose (and for Tracer.extract(), it should be straightforward to use our adapter (which means: no need to write a class for it):

Byte buffer = ...;
SpanContext ctx = tracer.extract(BinaryAdapters.extractionCarrier(buffer));

For the third point, see my answer to @felixbarny on why we decided to have that approach (streaming) as a separate, potential new FORMAT for the future.

The previous interface from @yurishkuro did solve the first 2 items.

Which one? #252? I think that's a fine API too, but this is simpler and would work just fine for simpler cases.

tedsuo commented 6 years ago

Glad to see interest in this picking up! I'm pinging more people to make them aware of this thread.

In the meantime, I would strongly suggest moving to tests and runnable code examples for illustrating various scenarios/points-of-concern. The ability for english/prose to move these discussions forwards becomes very limited once we get down into the weeds.

tedsuo commented 6 years ago

I would also like to suggest that there may be no "universal" binary solution that would satisfy every case effectively. In Java for example, we may want one solution for situations that require a ByteBuffer, and a separate solution for streaming, rather than a single solution. If a single underlying solution can be made efficient, with higher-level utilities being employed to make it ergonomic for various scenarios, that would be great. But if that seems too difficult, consider that we could instead have several separate solutions.

hypnoce commented 6 years ago

@carlosalberto Thanks for your answer !

In a real world scenario, the user would have a ByteBuffer pool and have a Binary implementation like this

Each middleware instrumenter would have an implementation of a ByteBuffer pool instead of the tracer implementation to handle it. Therefore, since a single JVM may host different middlewares but a single tracer implementation, you may endup using many ByteBuffer pools with variable degrees of stability. As a middleware instrumenter, I would rather have the tracer to handle the pooling instead of me and the others since we will endup in many ways of implementing the same behavior. Hope you see the point here :).

Implementing an extra method is less work than implementing, for users, a full OutputStream or InputStream

Totally agree

Hey, setInjectionBufferLength() serves for users asking to have a variable-size ByteBuffer approach

Indeed, when the ByteBuffer is supplied by the instrumenter, then we need such method. When the BB is supplied by the tracer, then I believe this method disappears.

Which one? #252? I think that's a fine API too, but this is simpler and would work just fine for simpler cases.

Exactly. In what aspect do you think the new proposed one is simpler ?

In the end, I would emphasise that java.nio package is not asynchronous IO. It's New-IO and therefore does not enforce asynchronicity. Therefore, exposing ByteBuffer makes total sense in case of blocking IO.

Thanks for your great work !

carlosalberto commented 6 years ago

Hey @hypnoce, thanks for the message.

In what aspect do you think the new proposed one is simpler ?

I think this one is barely a thin layer on top of the current approach (a bare ByeBuffer object). Also, the other implementation you mentioned could become the future BINARY_STREAM one if/when needed (or else @raphw approach of working with actual Java IO streams). This would keep them separated.

As a middleware instrumenter, I would rather have the tracer to handle the pooling instead of me and the others since we will endup in many ways of implementing the same behavior.

Not sure Tracer implementation is the best place to put this, and not sure they would love it ;) To me it sounds we should provide some ByteBuffer pool in some opentracing-contrib repo in the near future, so other instrumentations could use it. This way you could import such pool and do:

import io.opentracing.contrib.BinaryPool;

// returns an object providing a ByteBuffer in the pool, based on a given length.
Binary binary = binaryPool.getInstance();

// Do the actual injection.
tracer.inject(ctx, Format.BINARY, binary);

// Use the result (provided by the pool)
ByteBuffer buffer = binary.injectionBuffer();
buffer.rewind();
request.setPayload(buffer);

This way, we provide a safe place to start - and later people can use their own implementation if they need to. Thing this would work for you? Let us know ;)

carlosalberto commented 6 years ago

Hey @CodingFabian I've got a question for you ;)

Is Instana is using the current BINARY format? Let us know, as we are moving towards a new API (more or less soon ;) ). Depending on your needs, we could keep the current one (as Deprecated) and create this mentioned new format under a new field (BINARY_LAYER or similar ;) )

Thanks in advance!

CodingFabian commented 6 years ago

well some of our implementations support it, not sure if any user uses it. there are actually many open issues with the binary format, so I am not sure.

hypnoce commented 6 years ago

@carlosalberto thanks for your answer I understand the rationales and it's a good idea to create a default Binary pool implementation in the contrib section.

I'm still not really confortable with the setInjectionLength(int length) method. It is very coupled with injectionBuffer().

Maye I would rather replace it by an overloading of injectionBuffer() :

public interface Binary {

    ByteBuffer injectionBuffer();

    ByteBuffer injectionBuffer(int length);

    ByteBuffer extractionBuffer();
}

I beleive it makes the Binary interface easier to understand just by reading the methods. Also, It does not impose some ordering constraints on the calls.

What do you think ?

carlosalberto commented 6 years ago

@hypnoce

Hey! So I had previously played with such approach (injectionBuffer(int length)), but abandoned it as the user couldn't easily/clearly retrieve the underlying buffer (what should happen if the user calls it again? clear the existing buffer? create a new one? what if another length is passed? etc). Having a simple getter for the underlying buffer was specially helpful for BinaryAdapters.injectionBuffer(), as it simply created a buffer on demand ;)

ByteBuffer injectionBuffer(); ByteBuffer injectionBuffer(int length);

IHMO this is confusing - one of them is used for creation, and another one as a getter, but with the same name... I'd rather stick with (as mentioned):

ByteBuffer injectionBuffer(int length); ByteBuffer extractionBuffer();

And the user could access the underlying buffer depending on the actual implementation they have (CustomBinary.buffer() or similar).

(In this case we could remove the BinaryAdapters.injectionBuffer() overload in favor of a simple one wrapping a buffer, injectionBuffer(ByteBuffer), throwing an exception in case the remaining bytes are not enough for the requested length (of course, the users could either provide their own implementation of Binary, or they could use the pool we will be creating soon under contrib ;) ) )

I'm fine with this slight change - let me know what you think about this one.

cc @yurishkuro ;)

carlosalberto commented 6 years ago

@CodingFabian Hey, thanks for the answer :)

So I'd personally prefer to break the BINARY format here - but, if you think it's better (for you and your users) to add this new iteration under a new field, and simply deprecate the current one, let us know ;)

CodingFabian commented 6 years ago

as long its documented, I would guess the few users using it would find the doc.

hypnoce commented 6 years ago

@carlosalberto This change looks nice to me :)

Do you think it's a good thing to document thrown exceptions or it's implementation dependant ?

Thanks !

carlosalberto commented 6 years ago

@hypnoce Oh, definitely we should document the exceptions (in the BinaryAdapters methods in this case).

Will update the PR in the next few days. Thanks for the feedback!

tedsuo commented 6 years ago

There's still room to change this once it's merged into the release candidate branch. But I would like to get feedback from @felixbarny and @raphw before doing so, to make sure their concerns have been addressed. Maybe a call in CET time would be quicker?

raphw commented 6 years ago

I am currently on vacation and cannot follow this up but my concerns are the same as I initially mentioned. The current API would require to manage byte buffer pools both by the tracer and the user of a tracer. Also, the buffers must be sized by the required bytes for the trace information what makes reuse difficult.

carlosalberto commented 6 years ago

Hey @raphw

The current API would require to manage byte buffer pools both by the tracer and the user of a tracer. Also, the buffers must be sized by the required bytes for the trace information what makes reuse difficult.

The current API (which simply uses a ByteBuffer) is not much better than PR. Also, few applications simply require this (like Cassandra on the server side, which receives/gets a ByteBuffer as a binary payload).

But as Ted mentioned, I think it would be nice to have an actual call to talk about a binary, stream based, more complex format (separated, such as BINARY_STREAM) - whenever you are back from holidays, that is ;)

carlosalberto commented 6 years ago

@hypnoce Hey, I have updated the PR to remove setInjectionBufferLength(). Let me know ;)

hypnoce commented 6 years ago

Hey @carlosalberto the changes look good to me. Thanks !

carlosalberto commented 6 years ago

Hey @raphw @felixbarny before going ahead and merge this PR, we would like do cover this issue/PR up in the next Cross-Language group call on July 13th (next week).

Observe the initial plan is to have this as the simple way to do binary propagation, but we would like to get some feedback (part of that is to work on some more advanced BINARY_STREAMING format). Let me know if you can make it to the call ;)

carlosalberto commented 6 years ago

So I think we will be merging this onto the v0.32.0 branch next week.

@raphw if you are still around, would be nice to schedule a Cross-Language call to talk about a more advanced BINARY_STREAM format (which most of us can't properly design/test as we are not experts in the area 😞 ) If not, I will create an Issue, to have it a reminder of this potential format.

@CodingFabian Final take on whether we could replace the existing BINARY format with this one, or should we add it as a new one, to not break code? ;)

CodingFabian commented 6 years ago

@carlosalberto i am fine with breaking as we did not implement this anyway. Thats my vendor perspective. My user perspective is that this is still probably not usable.

carlosalberto commented 6 years ago

Hey @CodingFabian thanks for the answer.

i am fine with breaking as we did not implement this anyway My user perspective is that this is still probably not usable

You mean the current or this proposed one? ;)

carlosalberto commented 5 years ago

Hey all,

Trying to get revive this PR - so essentially this approach is simply enough, yet handles an important case (telling the user in advance the required size of the buffer).

In addition, I'd like to address that while prototyping Cassandra server side support, I saw the payload is received as a straight ByteBuffer (https://github.com/carlosalberto/java-cassandra-server/), and while prototyping Hadoop integration (https://github.com/carlosalberto/hadoop/tree/ot_initial_integration), I realized that in such case it may be helpful (on top of Protocol Buffers).

The arguments in favor we had before were: keeping it as a simply, yet more realistic approach to the one we have now; and against it, probably the need of a more advanced format (which could be addressed by a 'streaming' one in the future).

Thoughts?

carlosalberto commented 5 years ago

Hey @CodingFabian

In case we decide to merge this PR for doing a RC to test out a few new features, I'm wondering if Instana needs this new format to be exposed with a different name, in order to not break things for your Java client? Let me know ;)

carlosalberto commented 5 years ago

Hey all,

I'd like to merge this PR by tomorrow, in order to include it in an incoming RC - so feel free to provide any feedback in case you think it needs tuning or shouldn't be merged even ;)

(And my understanding, upon re-reading the conversation, is that it's fine to break Instana' BINARY tracer support ;) )

carlosalberto commented 5 years ago

Merged this PR - feel free to open an issue if you feel something needs tuning (I will be putting an eye on that, and will update accordingly)

Thanks all for your feedback!

opentracing / opentracing-java

Simple layer on top of ByteBuffer for BINARY format. #276

99 as the initial need for users to be able to not pass a large enough `ByteBuffer` instance for `Tracer.inject()`.

223, as the original PR where we tried to have a simple streaming API, similar to what Go does/has at this moment (and which was later reverted).

253 which is the Issue referencing all the effort and iteration and considerations (including this PR and #252 . This can be important as @raphw mentioned that at some moment (in #252):

opentracing / opentracing-java

Simple layer on top of ByteBuffer for BINARY format. #276

99 as the initial need for users to be able to not pass a large enough ByteBuffer instance for Tracer.inject().

223, as the original PR where we tried to have a simple streaming API, similar to what Go does/has at this moment (and which was later reverted).

253 which is the Issue referencing all the effort and iteration and considerations (including this PR and #252 . This can be important as @raphw mentioned that at some moment (in #252):

99 as the initial need for users to be able to not pass a large enough `ByteBuffer` instance for `Tracer.inject()`.