getsentry / sentry-python

The official Python SDK for Sentry.io
https://sentry.io/for/python/
MIT License
1.93k stars 509 forks source link

fix: remove inconsistent PII filtering from rust_tracing integration #3773

Closed matt-codecov closed 6 days ago

matt-codecov commented 1 week ago

the PII filtering in the merged version treats span data set in on_new_span() as safe but span data set in on_record() as sensitive. in actuality, they're two different ways to set the same thing, so they should be treated the same

below i lay out my reasoning for removing this PII filtering. there is a tiny bit more work to do either way:

(the aforementioned docs branch)

the full scoop

some detail on `tracing` behavior the `#[tracing::instrument(fields(extra1=5, extra2)]` attribute at the top of a function adds all of the function's arguments + extra fields listed in the attribute arguments as span data. when `on_new_span()` is called it will include all of the function arguments and fields, with one exception: fields without default values are not "recorded", so `extra2` in this example will be omitted. during the span, `Span::current().record(field, val)` can be called to set a value for `extra2` after the function starts. it can also overwrite a value for an already-assigned field. `on_record()` will be called with the recorded fields/values. to summarize, `on_new_span()` and `on_record()` are setting the same data, but `on_new_span()` sets the data to its default value at the beginning of the function and `on_record()` can set the data to new values after the function has begun.

the data we get from tracing amounts to stack-local variables and log statements which the Python "Scrubbing Sensitive Data" page list in the section for before_send / before_send_transaction. the python SDK also has event_scrubber for this.

additionally, this integration is basically a port of the Rust SDK's tracing integration which does not use the Rust SDK's version of should_send_default_pii() in this way (on_record() link). the Rust "Scrubbing Sensitive Data" page also describes tracing data in the section for before_send / before_send_transaction

it seems our SDKs typically don't use all-or-nothing filtering for data like this. instead they allow the user to provide targeted filtering at the SDK level with before_send/before_send_transaction/event_scrubber. this PR takes that approach for this integration

matt-codecov commented 1 week ago

the web framework CI failures do not appear related to my change but maybe they will go away with a re-run or a merge later

matt-codecov commented 6 days ago

replaced by https://github.com/getsentry/sentry-python/pull/3780