the PII filtering in the merged version treats span data set in `on_new_span()` as safe but span data set in `on_record()` as sensitive. in actuality, they're two different ways to set the same thing, so they should be treated the same.
below i lay out my reasoning for removing this PII filtering. there is a tiny bit more work to do either way:
if we instead do want all-or-nothing PII filtering on by default to be conservative, i will update this PR to apply the filtering to both `on_new_span()` and `on_record()`, and then update my docs branch to document the `send_sensitive_data` argument and filtering behavior
some detail on `tracing` behavior
the `#[tracing::instrument(fields(extra1=5, extra2))]` attribute at the top of a function adds all of the function's arguments + extra fields listed in the attribute arguments as span data. when `on_new_span()` is called it will include all of the function arguments and fields, with one exception: fields without default values are not "recorded", so `extra2` in this example will be omitted.
during the span, `Span::current().record(field, val)` can be called to set a value for `extra2` after the function starts. it can also overwrite a value for an already-assigned field. `on_record()` will be called with the recorded fields/values.
to summarize, `on_new_span()` and `on_record()` are setting the same data, but `on_new_span()` sets the data to its default value at the beginning of the function and `on_record()` can set the data to new values after the function has begun.
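a rough python sketch of that summary, not the integration's actual code: `ToySpanStore` and its method signatures are made up for illustration, and the real layer callbacks take different arguments. it only shows that both callbacks end up writing into the same per-span field map.

```python
class ToySpanStore:
    """Hypothetical stand-in for a tracing layer; not the real API."""

    def __init__(self):
        # span id -> dict of field name -> value
        self.spans = {}

    def on_new_span(self, span_id, fields):
        # Called once at span creation with the fields that already have values.
        # Fields declared without a value (like extra2) are simply absent here.
        self.spans[span_id] = dict(fields)

    def on_record(self, span_id, fields):
        # Called later, e.g. after Span::current().record(field, val).
        # It fills in missing fields and overwrites existing ones.
        self.spans[span_id].update(fields)


store = ToySpanStore()
store.on_new_span("span-1", {"arg": "hello", "extra1": 5})
store.on_record("span-1", {"extra2": "late value"})  # fills the empty field
store.on_record("span-1", {"extra1": 6})  # overwrites an existing field
span_data = store.spans["span-1"]
```

after these calls there is no way to tell from `span_data` which callback supplied which field, which is why filtering only one of the two paths is inconsistent.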
the data we get from `tracing` amounts to stack-local variables and log statements, which the Python "Scrubbing Sensitive Data" page lists in the section for `before_send` / `before_send_transaction`. the python SDK also has `event_scrubber` for this.
it seems our SDKs typically don't use all-or-nothing filtering for data like this. instead they allow the user to provide targeted filtering at the SDK level with `before_send` / `before_send_transaction` / `event_scrubber`. this PR takes that approach for this integration
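for illustration, targeted filtering with a `before_send` hook could look roughly like this. it's a sketch under assumptions: the key names in `SENSITIVE_KEYS` and the layout of `example_event` are made up, and where span fields actually land inside an event depends on the integration.

```python
SENSITIVE_KEYS = {"password", "api_token"}  # hypothetical field names


def scrub(value):
    """Recursively replace values stored under sensitive keys."""
    if isinstance(value, dict):
        return {
            k: "[Filtered]" if k in SENSITIVE_KEYS else scrub(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [scrub(v) for v in value]
    return value


def before_send(event, hint):
    # The SDK calls this with the event dict before sending it; returning the
    # (modified) event sends it, returning None would drop it.
    return scrub(event)


# would be wired up as: sentry_sdk.init(dsn=..., before_send=before_send)
# the event below is a made-up layout, only here to exercise the scrubber
example_event = {
    "message": "login failed",
    "extra": {"user": "alice", "password": "hunter2"},
    "breadcrumbs": [{"data": {"api_token": "abc123", "attempt": 2}}],
}
scrubbed = before_send(example_event, {})
```

the developer decides which fields are sensitive, and it doesn't matter whether a field was set at span creation or recorded later.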
pyo3-python-tracing-subscriber to call out the `tracing` docs for excluding specific fields that the developer knows are sensitive
(the aforementioned docs branch)
the full scoop
additionally, this integration is basically a port of the Rust SDK's `tracing` integration, which does not use the Rust SDK's version of `should_send_default_pii()` in this way (`on_record()` link). the Rust "Scrubbing Sensitive Data" page also describes tracing data in the section for `before_send` / `before_send_transaction`.