Open Veetaha opened 2 years ago
Yea, this is sort of by design right now. I basically accepted the overriding and hoped that tracing users would not use duplicate field names. :P As you observed, LogQL has poor (no?) support for arrays.
Your solution would work to also ship the conflicting fields to Loki, but it would create unpredictable field names, making the field names depend on whether there are conflicting field names and also their call depth.
I assume you stumbled upon this problem in some real use case? Can you give an example so I can see what real problem we're solving? :)
I didn't face this in a real use case, I just wondered how this is implemented in tracing_loki
, and tested this, because I initially wanted to make my app run inside of a span with version = "x.y.z", git_commit = "..."
info, but then dropped this idea, because I always know I am running my app on my server, so there is no need in storing that info with each log event.
Though while writing this comment, I wonder if it makes sense to put that info into labels 🤔, WDYT?
Anyhow, an example of a collision would be creating a span with version = "x.y.z"
for all logs in the application, that contains the version of the binary that is running. Then, suppose we have an optimistic concurrency version from a database record logged as version
field in a log event, or there could be some other semantics applied to another version
field in a log event (e.g. S3 object version ID, or a flavor of ice cream a user requests, or ... anything else).
I understand that this is a rare case, but I do think, that it should be at least explicitly called out in the docs, and there should be an escape hatch in form of mapping the log event as the user of this library wants.
As for the application I am working with. It is my pet project https://github.com/Veetaha/veebot-telegram
that it should be at least explicitly called out in the docs, and there should be an escape hatch in form of mapping the log event as the user of this library wants.
I agree that it should be called out in the docs.
As for your specific example, that is easily worked around by calling these labels veebot-version
and veebot-git-commit
instead, once it's actually documented.
Reproduction steps:
Run the code with loki and grafana for checking how the logs are delivered.
The log event in Grafana will be seen as:
This happens because spans are flattened into a single object. I am pretty sure it is done due to
LogQL
limitation that it doesn't support arrays.Potential Solutions
Unconditionally name all fields that come from spans with the following approach:
span_name
andspan_index
can be omitted if the resulting log event'sfield_name
is entirely unique.span_index
is the number index in the list of spans. It can be omitted if there is no other span with the exact same name in the event. I suppose we can have several spans of the same name if the code that creates spans names them the same, or there were recursive function calls.With this approach, we will optimize for the regular case where it is very unlikely that there are collisions in field names. The downside is that the information from the same span can now be available under different field names dynamically if there are collisions, which makes LogQL queries for the same field name not workable.
Maybe
tracing_loki
could expose ways to format the resulting log event from spans and log event data in a configurable way so that users could decide how they want to resolve conflicts and how to lay out the log events in Loki at all? It could be exposed in terms of a callback that formats a JSON value as the most flexible way to provide this capability. This will also allow renaming the standard fields_file/_line/_module_path
or dropping them if not needed just as users decide.