Feature request: Flexibility in time calculation/Unit/Define timestamp of start/end

Dant3s commented 1 year ago

Use case:

Log for a system based in ancient language, imposible to instrument in any kind of tracing system. Each entry of the log represent the start of a transtacion/span and later the end.

Today the duration is based in seconds, but for this kind of situation of quick based tx (ms as max value) or "true to the source" requirment, the timestamp of the original span/transaction in format "%Y-%m-%d %H:%M:%S.%f" could be used for start/end without the need of calculation.

Checking the span configuration the units defined are in unix_nano. Could be usefull to just pass the timestamp and do the convertion only to unix_time, without a need to pass a duration and leaving the start/end as an input in case it exists.

Possible solutions:

Add chance to handle other kind of unit than seconds.
Add chance to define start/end timestamp of the span and do the unix_calculation based on that.

agardnerIT commented 1 year ago

Hi, thanks of the feature request. So you need the ability to be more granular than just "seconds" for the duration? Are the "transactions" in your system "one-off" events that could be described as a single log line entry or are they "multi-step" where a "parent" starts at time 0 then 1 or more child things happen within (or because of) the parent and then the whole things ends?

Have you looked at logpusher?

An example of this "one shot" would be "a file was created and it took 2500ms
An example of this parent / child would be a web transaction hitting multiple microservices

Do you really need sub-second accuracy or would "to the nearest second" be enough?

Dant3s commented 1 year ago

Hey! Each line could be a transaction or span. But is defined as If the line is a new transaction then all the other lines are spans until the close of the current transaction, and each span could have it's own start and maybe an end. The logic for this was already solved it with a python script to "understand" the start of the transaction and spans. So there is not need to do anything there with tracepusher. Only invoke it when i want to send the data (at the end of the span/transaction.

Problem for the second accurancy is mostly since all / or most spans are fast (ms differations) most of them end with the same second roundup and also, that the transaction start time is actually created not in the "real" start time, but with the calculated time based in the current time of the tracepusher execution plus the delta of the seconds duration, that could be round up to 0 after my testings.

The idea of this is actually process transactions in wich the only information resides in logs lines and conver it to traces/spans for a more "in the line of view" of the information that represents and move it from the concepto of a simple log. That is why the logpusher wasn't considerer as an option.

Now honestly, since this was my request I decided to peak a little into tracepusher and during the night i hacked a little bit to allow such use cases:

Changing the duration in sec as optional
Adding the possibility to define the start/end in unix_time as an argument. Is not fair to pass a time data for wich a lot of format exists, so might be in the best interest to just pass the unix_time data and remove that calculation from tracepusher.
Adding the possibility of define the SPAN kind. (This is since my testing backend is one where we show transactions in services only in SERVER or CONSUMER kind :) )
I haven't done it, but would need to add a validation that Duration or Start/End exists as an argument.

The code is a little hacky, since it's been a while since I did python, need to clean a little. But if is up to your standar i could request a merge and add those capabilites for anyone else that might be interested in them.. (not right away).

Also a quick question... how you see something like tracepusher in a production enviroment?

agardnerIT commented 1 year ago

Thanks for the background. It seems like the simplest option is to add a new parameter called (something like) duration_type which will default to seconds if not set. This would maintain backwards compatibility. But if you could specify:

./tracepusher -dur 3500 -duration-type ms

They tracepusher would know you're sending milliseconds.

Would this solve things?

wrt to tracepusher in a production environment, I don't see any issue.

https://github.com/agardnerIT/tracepusher/blob/cfec80193bd0d4a2770989a32f941c53670af406/tracepusher.py#L271

Adding a new optional input parameter for span type should also be no problem. Again, if not specified this should default to the current behaviour of SPAN_KIND_INTERNAL.

https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/trace/v1/trace.proto#L121-L147

It should be expected that valid values are:

SPAN_KIND_INTERNAL
SPAN_KIND_SERVER
SPAN_KIND_CLIENT
SPAN_KIND_PRODUCER
SPAN_KIND_CONSUMER

If these two changes:

Adding optional time duration type
Adding span type

Would solve your use case, LMK and I'll raise them as enhancements.

Dant3s commented 1 year ago

Yes, that will do the trick!

agardnerIT commented 1 year ago

Can you try running tracepusher with the binaries built here: https://github.com/agardnerIT/tracepusher/tree/add-duration-type/dist

The new parameters are:

-dt or --duration-type which accepts ms or s and if not provided, it is backwards compatible and defaults to s
-sk or --span-kind which accept INTERNAL, CLIENT, SERVER, CONSUMER or PRODUCER. If not provided, it is backwards compatible and defaults to INTERNAL.

For example, to push a CONSUMER span with a duration of 1234 milliseconds:

./tracepusher -ep http://localhost:4318 -sen serviceA -spn span1 -dur 1234 -dt ms

LMK how you go. If these binaries work for you, I'll include in main.

agardnerIT commented 1 year ago

@Dant3s did you have a chance to try this out?

Dant3s commented 1 year ago

Hey @agardnerIT Yes, sorry for the late reply, I tested without my changes and this does align with the idea! The output is in the MS is so sweet! Binary is also a nice solution!

agardnerIT commented 1 year ago

Perfect. I'm closing this as this is available in main now and will be "official" as of 0.8.0.

Docs:

agardnerIT / tracepusher

Feature request: Flexibility in time calculation/Unit/Define timestamp of start/end #51