Open timmc-edx opened 1 week ago
Other notes:
6f17017-5957
Additional thoughts, questions, ideas:
Currently, we're investigating whether an NR Free Tier account for edxapp is enough to get DD traces working.
Another possibility is to get tracing (or APM) disabled everywhere in Edge. That would cover the following services, where spans were found in the last day:
prod-edge-edxapp-cms 9.28 M
prod-edge-analytics-api 2.35 M
prod-edge-notes 2.3 M
prod-edge-edxapp-workers-lms 755 k
prod-edge-forum 675 k
prod-edge-edxapp-workers-cms 188 k
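For a rough sense of scale, the counts above can be totaled. This is a minimal sketch: the figures are copied from the list, and reading "k" and "M" as rounded thousands and millions is an assumption.

```python
# Span counts per Edge service over the last day, copied from the list above.
# Assumption: "M" means millions and "k" means thousands (rounded by the UI).
span_counts = {
    "prod-edge-edxapp-cms": 9_280_000,
    "prod-edge-analytics-api": 2_350_000,
    "prod-edge-notes": 2_300_000,
    "prod-edge-edxapp-workers-lms": 755_000,
    "prod-edge-forum": 675_000,
    "prod-edge-edxapp-workers-cms": 188_000,
}

total = sum(span_counts.values())

# Print each service with its share of the total, largest first.
for service, count in sorted(span_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{service:<30} {count:>10,d} {count / total:6.1%}")
print(f"{'total':<30} {total:>10,d}")
```

By this reading, prod-edge-edxapp-cms accounts for more than half of the spans, so it is the service most affected by any Edge-wide change.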
[idea] We might want 3 modes for our hacked NR agent:
Ultimately, this ticket is for disabling New Relic APM across edxapp. We ran into trace-related issues in DD when first attempting to disable NR APM. We later caused the same issue in Edge simply by disabling NR Tracing.
Acceptance criteria
DD_TRACE_HEADER_TAGS and DD_DJANGO_INSTRUMENT_MIDDLEWARE were added in https://github.com/edx/configuration/pull/41
operation_name:django.request on All Spans since service entry spans were unreliable. Maybe we want to change that back, or maybe not.
Details
When we disabled NR APM in edxapp on June 6, we observed two anomalies with traces:
service:edx-edxapp-lms env:prod dropped precipitously by 2-3x.
However, we believe the actual traffic was unchanged. This is corroborated by the Django hit metrics remaining steady, as seen in the Service Catalog. We cannot find any relevant code or config changes that would have been deployed around that time.
Our current understanding is that the majority of Django web requests that are traced are not recorded as service entry spans, but are instead parented to a different trace. This causes several problems:
We can also reproduce this issue by setting "Tracing type: None" in the application settings in NR (usually set to Distributed Tracing).
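To illustrate why counting only service entry spans can undercount requests under this hypothesis, here is a deliberately simplified model. The `Span` class and both counting helpers are hypothetical and are not Datadog's actual data model or query logic. The sketch assumes a request span counts as an entry span only when it has no parent, so a request that inherits a parent ID from elsewhere is missed, while a query on the operation name across all spans still sees it.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span record (not the ddtrace Span class)."""
    name: str
    service: str
    parent_id: Optional[int]  # None means the span has no parent


def count_entry_requests(spans: Iterable[Span], service: str) -> int:
    """Count request spans that have no parent (our stand-in for entry spans)."""
    return sum(
        1 for s in spans
        if s.service == service and s.name == "django.request" and s.parent_id is None
    )


def count_all_requests(spans: Iterable[Span], service: str) -> int:
    """Count request spans by operation name, regardless of parentage."""
    return sum(
        1 for s in spans
        if s.service == service and s.name == "django.request"
    )


# Three web requests: one true root, two that carry a parent ID from another trace.
# The parent IDs are made-up values for illustration.
spans = [
    Span("django.request", "edx-edxapp-lms", None),
    Span("django.request", "edx-edxapp-lms", 111),
    Span("django.request", "edx-edxapp-lms", 222),
    Span("django.view", "edx-edxapp-lms", 1),  # child span, not a request
]

entry_count = count_entry_requests(spans, "edx-edxapp-lms")
all_count = count_all_requests(spans, "edx-edxapp-lms")
print(f"entry spans: {entry_count}, all django.request spans: {all_count}")
```

In this toy data the entry-span count is a third of the real request count, which is the same shape as the 2-3x drop described above while Django hit metrics stay steady.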
Links