edx / edx-arch-experiments

A plugin to include applications under development by the architecture team at edx
GNU Affero General Public License v3.0

Disable django db and cache instrumentation for edxapp #761

Open robrap opened 3 months ago

robrap commented 3 months ago

We should consider disabling the following, in order to drop the duplicate spans created for the db and cache:

Notes:

robrap commented 3 months ago

@timmc-edx: Let me know if you end up implementing this on another ticket and leaving it in place. Thanks.
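For reference, here is a minimal sketch of how the Django db and cache instrumentation named in the issue title might be turned off. It assumes the tracer is ddtrace and that its documented Django integration settings `DD_DJANGO_INSTRUMENT_DATABASES` and `DD_DJANGO_INSTRUMENT_CACHES` apply to the deployed version. The helper name `apply_overrides` is hypothetical and is not part of this repository.

```python
import os

# Assumption: ddtrace's Django integration reads these environment variables.
# They are not taken from this thread; verify them against the ddtrace
# version actually deployed before relying on them.
DJANGO_INSTRUMENTATION_OVERRIDES = {
    "DD_DJANGO_INSTRUMENT_DATABASES": "false",
    "DD_DJANGO_INSTRUMENT_CACHES": "false",
}


def apply_overrides(environ=os.environ):
    """Set each override unless the operator has already set that variable."""
    for name, value in DJANGO_INSTRUMENTATION_OVERRIDES.items():
        environ.setdefault(name, value)
    return environ
```

For this to have any effect it would need to run before the tracer is imported and patches Django, for example very early in process startup. Using `setdefault` keeps any value already set in the deployment environment.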

timmc-edx commented 3 months ago

I was curious about what span tags are different between the Django spans and the library spans. Here's what I found in a recent pair of spans:

The only potentially valuable thing I see on the defaultdb span is db.instance, but I think each of our webapps only talks to at most one instance of each kind of database.

timmc-edx commented 3 months ago

Here's a pair of memcache spans, with common values removed. These are much more different from each other than the MySQL spans were:

service:django
operation_name:django.cache
resource_name:"django.core.cache.backends.memcached.get default"

{
  "component": "django",
  "db": {
    "row_count": 1
  },
  "django": {
    "cache": {
      "backend": "django.core.cache.backends.memcached.PyMemcacheCache",
      "key": "[REDACTED]"
    }
  },
  "language": "python"
}

service:memcached
operation_name:memcached.command
resource_name:get

{
  "component": "pymemcache",
  "db": {
    "row_count": 0,
    "system": "memcached"
  },
  "duration": 2148268,
  "language": "python",
  "network": {
    "destination": {
      "ip": "[REDACTED]",
      "port": [REDACTED]
    }
  },
  "span": {
    "kind": "client"
  }
}

In particular the presence of the cache key in the Django-level span makes it useful enough that I'd be reluctant to get rid of it.
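A comparison like this can be scripted rather than done by eye. The sketch below assumes the span tags are available as nested dicts (for example, copied from the trace UI as JSON). The helper names `flatten` and `diff_tags` are made up for illustration, and the sample data repeats the memcache pair above with the redacted `network` block left out.

```python
def flatten(tags, prefix=""):
    """Turn nested tag dicts into a flat {"a.b.c": value} mapping."""
    flat = {}
    for key, value in tags.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_tags(a, b):
    """Return (tags only in a, tags only in b, tags present in both but differing)."""
    fa, fb = flatten(a), flatten(b)
    only_a = {k: v for k, v in fa.items() if k not in fb}
    only_b = {k: v for k, v in fb.items() if k not in fa}
    changed = {k: (fa[k], fb[k]) for k in fa.keys() & fb.keys() if fa[k] != fb[k]}
    return only_a, only_b, changed


django_span = {
    "component": "django",
    "db": {"row_count": 1},
    "django": {
        "cache": {
            "backend": "django.core.cache.backends.memcached.PyMemcacheCache",
            "key": "[REDACTED]",
        }
    },
    "language": "python",
}
memcached_span = {
    "component": "pymemcache",
    "db": {"row_count": 0, "system": "memcached"},
    "duration": 2148268,
    "language": "python",
    "span": {"kind": "client"},
}

only_django, only_memcached, changed = diff_tags(django_span, memcached_span)
```

Here `only_django` holds the `django.cache.backend` and `django.cache.key` tags, which is the information that would be lost if the Django-level cache span were dropped.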

robrap commented 3 months ago

> The only potentially valuable thing I see on the defaultdb span is db.instance, but I think each of our webapps only talks to at most one instance of each kind of database.

Additionally, if we lose the service name by remapping these to edx-edxapp-lms, and we also drop the Django DB (ORM) spans, I think we'd lose the ability to tell whether we connected to defaultdb or the read-replica, but I don't see that as a big deal. It is also not critical that we drop these extra spans.

UPDATE: Unfortunately, if we remap the DB spans, the two share the same operation_name, so they may look very similar in a trace once the old service name (e.g. mysql vs. defaultdb) is no longer visible.
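To make the concern concrete, here is a small sketch of how remapping the service name collapses the two DB spans into one (service, operation_name) pair. The span dicts are simplified stand-ins, the operation name `example.db.query` is a placeholder rather than the real value, and `remap` and `indistinguishable` are hypothetical helpers, not tracer APIs.

```python
REMAPPED_SERVICE = "edx-edxapp-lms"
OPERATION = "example.db.query"  # placeholder; the real shared operation_name is not given here


def remap(span, target=REMAPPED_SERVICE):
    """Return a copy of the span with its service name replaced."""
    return {**span, "service": target}


def indistinguishable(spans):
    """Group spans by (service, operation_name); keep groups with more than one span."""
    groups = {}
    for span in spans:
        key = (span["service"], span["operation_name"])
        groups.setdefault(key, []).append(span)
    return {key: members for key, members in groups.items() if len(members) > 1}


spans = [
    {"service": "mysql", "operation_name": OPERATION},
    {"service": "defaultdb", "operation_name": OPERATION},
]

before = indistinguishable(spans)
after = indistinguishable([remap(span) for span in spans])
```

Before remapping, the service name alone separates the two spans, so `before` is empty. After remapping, both fall under the same key, and some other tag would be needed to tell them apart.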