Closed yogiraj07 closed 2 years ago
One suggestion: `Service` could be `Reporter`, because it does indeed report data to the backend.
One question:
- Will the agent apply any transformation to the received data, or will it deliver the data to the `Service` interface exactly as it receives it? The current code does nothing with the data other than send it directly to the backend; I think this is something that could be mentioned here.
Otherwise, great initiative @yogiraj07.
Hi @jcchavezs , Thank you for the feedback.
I think we should deliver data to the `Service` interface (name not yet finalized) in the same way we receive it. Let the implementer of the interface handle the desired transformations.
However, this decision also depends on what kind of transformations users expect and at what point in the pipeline. For example, should the transformation happen while segments are being batched, or on the batch once it is ready to be sent?
Please let us know your motivation for the transformation on received data.
Best, Yogi
I was asking more from the `aws` side: if we deliver data the same way we receive it, it will be easier to start using the agent with different formats :+1:. No transformation also opens the possibility of other encodings, like `msgpack` for example.
The only transformation I can think of is joining JSON arrays to report a batch over HTTP. Let's say you receive `[{"key":"value1", ...}, {"key":"value2", ...}]` and, a moment later, `[{"key":"value3", ...}, {"key":"value4", ...}]`; you will most likely join them together and send them to the server as `[{"key":"value1", ...}, {"key":"value2", ...}, {"key":"value3", ...}, {"key":"value4", ...}]`, but that sort of combination could be left to the `Service` implementation.
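As a rough illustration of the joining step described above, here is a minimal Go sketch. The function name `mergeBatches` is hypothetical and not part of the daemon; the sketch only assumes that every payload is a JSON array. Elements are held as `json.RawMessage`, so they are never decoded into Go structs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergeBatches concatenates several JSON arrays into one JSON array.
// Each element is kept as json.RawMessage, so it is not decoded into a
// Go struct; the encoder only compacts it when writing the result.
// (Hypothetical helper for illustration, not daemon code.)
func mergeBatches(payloads ...[]byte) ([]byte, error) {
	merged := []json.RawMessage{}
	for i, p := range payloads {
		var items []json.RawMessage
		if err := json.Unmarshal(p, &items); err != nil {
			return nil, fmt.Errorf("payload %d is not a JSON array: %w", i, err)
		}
		merged = append(merged, items...)
	}
	return json.Marshal(merged)
}

func main() {
	first := []byte(`[{"key":"value1"},{"key":"value2"}]`)
	second := []byte(`[{"key":"value3"},{"key":"value4"}]`)

	out, err := mergeBatches(first, second)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A `Service` implementation could call something like this right before the HTTP request, which keeps the agent itself free of any format knowledge.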
Another sort of transformation could be dropping segments or traces based on different criteria (for example in `firehose` mode): let's say you only want to send traces with an error or with durations longer than a certain threshold. Again, that should be done in the `Service` implementation.
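A filter of that kind might look roughly like the Go sketch below. It assumes segments are JSON documents carrying the `error`, `start_time` and `end_time` fields of the X-Ray segment format; the names `segmentSummary`, `keepSegment` and `filterSegments` are hypothetical. Documents that cannot be parsed are forwarded rather than dropped, which is a design choice of this sketch and not something decided in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// segmentSummary holds only the fields the filter inspects.
// Field names are assumed to follow the X-Ray segment document format.
type segmentSummary struct {
	Error     bool    `json:"error"`
	StartTime float64 `json:"start_time"`
	EndTime   float64 `json:"end_time"`
}

// keepSegment reports whether a raw segment should be forwarded: it is
// kept if it is flagged as an error or lasted at least minSeconds.
func keepSegment(raw string, minSeconds float64) bool {
	var s segmentSummary
	if err := json.Unmarshal([]byte(raw), &s); err != nil {
		return true // cannot judge it, so forward it untouched
	}
	return s.Error || s.EndTime-s.StartTime >= minSeconds
}

// filterSegments returns the segments that pass keepSegment, in order.
func filterSegments(segments []string, minSeconds float64) []string {
	kept := make([]string, 0, len(segments))
	for _, raw := range segments {
		if keepSegment(raw, minSeconds) {
			kept = append(kept, raw)
		}
	}
	return kept
}

func main() {
	batch := []string{
		`{"id":"a","error":true,"start_time":100,"end_time":100.5}`,
		`{"id":"b","start_time":100,"end_time":103}`,
		`{"id":"c","start_time":100,"end_time":100.5}`,
	}
	// Keep errors and anything that took two seconds or longer.
	for _, s := range filterSegments(batch, 2) {
		fmt.Println(s)
	}
}
```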
Hi @jcchavezs ,
Can you please help me understand the following statements:
1) "Using other encoding formats like msgpack": Do you mean a possible transformation of received segments to msgpack format by the implementer of the Service interface? For example, the following flow:
Received segment -> Service interface -> transform to msgpack format -> send to desired backend (other than X-Ray service)
2) "Using agent with different formats": Is the input to the X-Ray daemon in a different format, or the output? Or is this the same concept covered in point 1, or did you intend to say something else?
Please let me know if I am missing anything.
And for other mentioned transformations, we can let the implementer of Service interface decide.
Thanks, Yogi
> 1) "Using other encoding formats like msgpack": Do you mean a possible transformation of received segments to msgpack format by the implementer of the Service interface? For example, the following flow: Received segment -> Service interface -> transform to msgpack format -> send to desired backend (other than X-Ray service)

Exactly that.

> 2) "Using agent with different formats"

If the agent performed some sort of validation instead of sending what it receives, it would not allow users to send data as msgpack or any other format. This is not so important.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs in next 7 days. Thank you for your contributions.
Since the OpenTelemetry Collector is better suited for this goal of sending traces to different backends, and it already has the `awsxrayreceiver` and `awsxrayexporter`, closing this issue. The OpenTelemetry Collector is a good solution for users wanting to export the data to other backends.
Goal
Currently, the X-Ray daemon sends data only to the AWS X-Ray service. This issue discusses the changes to the existing design needed to support multiple backends besides the X-Ray service.
Current design
The X-Ray daemon receives segments on the X-Ray daemon address. Each received segment has a daemon header. The current design uses a global memory pool, known as the buffer pool (preallocated on initialization, by default 1% of total memory), to receive the UDP payload. A ring buffer (RB) is a structure implemented using a channel; it stores received segments using a goroutine. The RB holds 250 segments, and each segment in the RB maintains a pointer to a piece of buffer allocated in the buffer pool. By default the buffer size is 64KB, and we do not split a large payload across multiple buffers. A `Processor` is on the receiving end of this RB channel and batches segments using a goroutine. A batch is ready to be sent by the processor to a `Batch Processor` if it is large enough (default: 50 segments) or the processor goroutine has hit an idle timeout (default: 1 second), upon which the raw payload for the batch is serialized to strings and the buffer is returned to the buffer pool for reuse. The batch processor uses the X-Ray client to send batches to the X-Ray service via the PutTraceSegments API.
Modularization
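Any backend plugged into the daemon would sit behind the batching rule from the current design (flush at 50 segments or after 1 second of idleness). The Go sketch below illustrates that rule with a pluggable `flush` callback standing in for the backend. It is not the daemon's actual code: `batchSegments` is a hypothetical name, and reading "idle timeout" as "no segment received for that long" is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// batchSegments reads segments from in and calls flush with each batch.
// A batch is flushed when it holds maxSize segments or when no segment
// has arrived for the idle duration. It returns once in is closed and
// the remaining segments have been flushed.
func batchSegments(in <-chan string, maxSize int, idle time.Duration, flush func([]string)) {
	batch := make([]string, 0, maxSize)
	send := func() {
		if len(batch) > 0 {
			flush(batch)
			batch = make([]string, 0, maxSize)
		}
	}

	timer := time.NewTimer(idle)
	defer timer.Stop()
	for {
		select {
		case seg, ok := <-in:
			if !ok {
				send()
				return
			}
			batch = append(batch, seg)
			if len(batch) >= maxSize {
				send()
			}
			// Restart the idle countdown after every received segment.
			if !timer.Stop() {
				select {
				case <-timer.C:
				default:
				}
			}
			timer.Reset(idle)
		case <-timer.C:
			send()
			timer.Reset(idle)
		}
	}
}

func main() {
	in := make(chan string)
	done := make(chan struct{})
	go func() {
		defer close(done)
		batchSegments(in, 50, time.Second, func(b []string) {
			fmt.Printf("flushing %d segments\n", len(b))
		})
	}()

	for i := 0; i < 120; i++ {
		in <- fmt.Sprintf("segment-%d", i)
	}
	close(in)
	<-done
}
```

Because the backend only appears as the `flush` callback, the batching logic does not need to know whether the batch goes to X-Ray or somewhere else.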
We intend to decouple components of the X-Ray daemon, so the segments batched by the X-Ray daemon can be routed to the desired backend service. The changes to the design are backward compatible and support the X-Ray service by default.
Client
We create an X-Ray client instance to use the PutTraceSegments API, which sends data to the X-Ray service. The X-Ray client implements the `XRay` interface, which contains the X-Ray service API methods. We will have another interface, `Service` (name yet to be finalized), which contains a `PutSegments()` method. A `Client` structure will implement the `Service` interface for the desired backend service. The `Client` will be a bridge between the X-Ray daemon and the backend service.
Registering Client
In the current design, during initialization of the `Processor` instance, the X-Ray client is created and set on the `Batch Processor` instance. When a batch of segments is ready to be sent, the `Batch Processor` instance uses the X-Ray client to send data to the X-Ray service. This part needs to be restructured: the `Client` / X-Ray client will be created as part of daemon initialization and passed to the `Processor` instance. Once the `Batch Processor` instance is configured with the `Client` / X-Ray client, the existing architecture will send the batch of segments to the configured backend service.
Note: These are initial thoughts on modularizing the X-Ray daemon. Your suggestions are welcome.
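To make the Client and Registering Client sections more concrete, here is one way the wiring could look in Go. This is only a sketch under assumptions: the signature of `PutSegments`, the `recordingClient` type, the `NewProcessor` constructor and the struct fields are all hypothetical and do not reflect the daemon's real types.

```go
package main

import "fmt"

// Service is the proposed backend abstraction (name not final). The
// exact signature of PutSegments is an assumption of this sketch.
type Service interface {
	PutSegments(segments []string) error
}

// recordingClient is a hypothetical Client that stores batches in
// memory instead of calling a real backend.
type recordingClient struct {
	batches [][]string
}

func (c *recordingClient) PutSegments(segments []string) error {
	// Copy the slice so later changes by the caller do not affect the record.
	c.batches = append(c.batches, append([]string(nil), segments...))
	return nil
}

// BatchProcessor forwards ready batches to whichever Service it was given.
type BatchProcessor struct {
	client Service
}

// Processor collects segments and hands full batches to its BatchProcessor.
type Processor struct {
	batchSize int
	pending   []string
	sender    *BatchProcessor
}

// NewProcessor receives the client from the caller instead of creating it.
func NewProcessor(client Service, batchSize int) *Processor {
	return &Processor{batchSize: batchSize, sender: &BatchProcessor{client: client}}
}

// Receive adds one segment and sends the batch once it is full.
func (p *Processor) Receive(segment string) error {
	p.pending = append(p.pending, segment)
	if len(p.pending) < p.batchSize {
		return nil
	}
	batch := p.pending
	p.pending = nil
	return p.sender.client.PutSegments(batch)
}

func main() {
	// Daemon initialization: create the client first, then inject it.
	client := &recordingClient{}
	processor := NewProcessor(client, 2)

	for _, seg := range []string{"seg-1", "seg-2", "seg-3"} {
		if err := processor.Receive(seg); err != nil {
			panic(err)
		}
	}
	fmt.Println("batches sent:", len(client.batches))
}
```

With this shape, keeping the X-Ray service as the default would amount to passing an X-Ray-backed `Service` implementation to the constructor when no other backend is configured.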