tokio-rs / tracing

Application level tracing for Rust.
https://tracing.rs
MIT License
5.19k stars 675 forks source link

Naming issues for span structured data. #1423

Open najamelan opened 3 years ago

najamelan commented 3 years ago

Bug Report

Version

test_tracing v0.1.0 (/data/doc/code/test_tracing) ├── tracing v0.1.26 │ ├── tracing-attributes v0.1.15 (proc-macro) │ └── tracing-core v0.1.18 └── tracing-subscriber v0.2.18 ├── tracing v0.1.26 () ├── tracing-core v0.1.18 () ├── tracing-log v0.1.2 │ └── tracing-core v0.1.18 () └── tracing-serde v0.1.2 └── tracing-core v0.1.18 ()

Platform

Linux desktop.home 5.12.8-arch1-1 #1 SMP PREEMPT Fri, 28 May 2021 15:10:20 +0000 x86_64 GNU/Linux

Description

Some of the first field names for structured data where type and name. I ran into some quirky issues with both, so I wanted to report those here and raise the question as to whether this is desired behavior.

Since type is a reserved keyword, my first attempt was r#type.

use tracing::* ;

fn main()
{
    let _sub = tracing_subscriber::fmt::Subscriber::builder()

        .json()
        .init()
   ;

   let _span = error_span!( "my_span", r#type = "i32", r#name = "Bobby" ).entered();

   info!( "shak yaving?" );
}

Produces:

{"timestamp":"Jun 06 21:39:20.690","level":"INFO","fields":{"message":"shak yaving?"},"target":"test_tracing","span":{"name":"Bobby","r#type":"i32","name":"my_span"},"spans":[{"name":"Bobby","r#type":"i32","name":"my_span"}]}

without json:

Jun 06 21:53:26.825  INFO my_span{type="i32" name="Bobby"}: test_tracing: shak yaving?

NOTES:

Using quotes

   let _span = error_span!( "my_span", "type" = "i32", "name" = "Bobby" ).entered();

Json:

{"timestamp":"Jun 06 22:03:06.993","level":"INFO","fields":{"message":"shak yaving?"},"target":"test_tracing","span":{"name":"Bobby","type":"i32","name":"my_span"},"spans":[{"name":"Bobby","type":"i32","name":"my_span"}]}

default fmt:

Jun 06 22:02:52.920  INFO my_span{type="i32" name="Bobby"}: test_tracing: shak yaving?

NOTES:

I would like to suggest:

najamelan commented 3 years ago

Just looking into this a bit more, serde_json will only consider the last of duplicate keys. This means that in order to correctly extract information here we have to either find an alternative json deserialize which supports duplicate keys, or roll a manual one. ouch!

Would you take a pull request for __span_name? Would this be considered a breaking change? Are there other attributes but span name that get included in the json data with a very generic name?

hawkw commented 3 years ago

Would you take a pull request for __span_name? Would this be considered a breaking change?

Changing the JSON format is considered a breaking change. We could make a change like this now by making it an opt-in configuration option.

My two cents would be to use a naming scheme where names are prefixed with tracing_, like tracing_span, rather than __xxx --- tracing_xxx makes it clear where the fields are coming from, But, maybe what we should do is add a builder parameter to add an arbitrary user-defined prefix to metadata field names in the JSON output? That way, we can avoid bikeshedding over what naming scheme to use :)

najamelan commented 3 years ago

But, maybe what we should do is add a builder parameter to add an arbitrary user-defined prefix to metadata field names in the JSON output? That way, we can avoid bikeshedding over what naming scheme to use :)

Personally I would say less is more... People using structured data are using tracing, so can be told certain keys are reserved. I think right now what's missing is clear recommendation. For example I'm using tracing in a lib. What can I count on? Will subscribers print span data? I can't count on it. They have the freedom not to, and several do, which means right now I have found no way of having sensical logs on Wasm. And as a lib author I don't control what subscriber get's used even if there was one that worked on Wasm.

Freedom is great, but I think for something like this convention/recommendation would be greater. And I think the tracing API is already very complicated.

Changing the JSON format is considered a breaking change. We could make a change like this now by making it an opt-in configuration option.

If a conclusion would be reached, I'd be up for filing a PR. Would it be a config option on the subscriber, or a crate feature? Is there a milestone/date for breaking changes on tracing-subscriber?

jyn514 commented 3 years ago

I would find it useful to be able to make arbitrary changes to the field names instead of having them hard-coded. In particular, an internal tool expects the level to be called log_level instead of level; adding a tracing prefix doesn't help and adding a custom log_ prefix isn't what I want because message should stay the same.

jyn514 commented 3 years ago

That API could look like this:

impl SubscriberBuilder<JsonFields, Format<Json>> {
    fn with_renamed_field(original_name: &'static str, new_name: &'static str) -> Self;
}
najamelan commented 3 years ago

@jyn514 Thanks for sharing. I do think it is a bit orthogonal to this issue. They will still need to have a default name, and that default name should still be unique enough so it doesn't overlap with common choices by users.

Also the need to pre-process the json before it is serialized could be generalized better with an API that gives you &mut serde_json::Value or something like that. So you don't just get to change the names.

jyn514 commented 3 years ago

Also the need to pre-process the json before it is serialized could be generalized better with an API that gives you &mut serde_json::Value or something like that. So you don't just get to change the names.

This is a great idea! That makes it much more flexible, something like this :)

impl SubscriberBuilder<JsonFields, Format<Json>> {
    fn inspect(record: &mut serde_json::Value) -> Self;
}