open-telemetry / opentelemetry-js-contrib

OpenTelemetry instrumentation for JavaScript modules
https://opentelemetry.io
Apache License 2.0
662 stars 489 forks source link

Express instrumentation does not properly handle router usage #1993

Open JamieDanielson opened 5 months ago

JamieDanielson commented 5 months ago

I don't think the current express instrumentation handles usage of a express.Router(). Perhaps change use-express.mjs for now to not use a Router and we add a separate issue about handling Router in instr-express?

_Originally posted by @trentm in https://github.com/open-telemetry/opentelemetry-js-contrib/pull/1966#discussion_r1505871458_

I expected it to show up as router - /post/:id but instead was just router - /

JohannesHuster commented 5 months ago

@JamieDanielson This is related to https://github.com/open-telemetry/opentelemetry-js-contrib/issues/1947

At the moment,http.route attributes of router and middleware spans are set to the "mount path" (in the cases I could see).

// Tests PR example (root mount path)
const routerA = express.Router();
routerA.get('/post/:id', (req, res) => {
  res.send(`Post id: ${req.params.id}`);
});
app.use(routerA);

// Adapted example (non-root mount path)
const routerB = express.Router();
routerB.get('/:id', (req, res) => {
  res.send(`Post id: ${req.params.id}`);
});
app.use('/post', routerB);
Tests PR example (root mount path) Adapted example (non-root mount path)
router span http.route: "/"
name: "router - /"
http.route: "/post"
name: "router - /post"
request span http.route: "/post/:id"
name: "GET /post/:id"
http.route: "/post/:id"
name: "GET /post/:id"
Spans for the example from tests PR (using a minimal express app) ``` { traceId: 'a1c7faafb2135a3d970d31093ee63437', parentId: 'fb4451520d0fa2b5', traceState: undefined, name: 'middleware - query', id: '788673a2a133abce', kind: 0, timestamp: 1710156373205000, duration: 375.252, attributes: { 'http.route': '/', 'express.name': 'query', 'express.type': 'middleware' }, status: { code: 0 }, events: [], links: [] } { traceId: 'a1c7faafb2135a3d970d31093ee63437', parentId: 'fb4451520d0fa2b5', traceState: undefined, name: 'middleware - expressInit', id: '7ece8b4ea1cf76fc', kind: 0, timestamp: 1710156373205000, duration: 372.568, attributes: { 'http.route': '/', 'express.name': 'expressInit', 'express.type': 'middleware' }, status: { code: 0 }, events: [], links: [] } { traceId: 'a1c7faafb2135a3d970d31093ee63437', parentId: 'fb4451520d0fa2b5', traceState: undefined, name: 'router - /', id: 'de567c024714ac71', kind: 0, timestamp: 1710156373206000, duration: 111.275, attributes: { 'http.route': '/', 'express.name': '/', 'express.type': 'router' }, status: { code: 0 }, events: [], links: [] } { traceId: 'a1c7faafb2135a3d970d31093ee63437', parentId: 'fb4451520d0fa2b5', traceState: undefined, name: 'request handler - /post/:id', id: 'ee140eee706875b2', kind: 0, timestamp: 1710156373208000, duration: 32.743, attributes: { 'http.route': '/post/:id', 'express.name': '/post/:id', 'express.type': 'request_handler' }, status: { code: 0 }, events: [], links: [] } { traceId: 'a1c7faafb2135a3d970d31093ee63437', parentId: undefined, traceState: undefined, name: 'GET /post/:id', id: 'fb4451520d0fa2b5', kind: 1, timestamp: 1710156373201000, duration: 12980.123, attributes: { 'http.url': 'http://localhost:3000/post/5', 'http.host': 'localhost:3000', 'net.host.name': 'localhost', 'http.method': 'GET', 'http.scheme': 'http', 'http.target': '/post/5', 'http.user_agent': 'insomnia/8.6.1', 'http.flavor': '1.1', 'net.transport': 'ip_tcp', 'net.host.ip': '::ffff:127.0.0.1', 'net.host.port': 3000, 'net.peer.ip': '::ffff:127.0.0.1', 'net.peer.port': 58665, 'http.status_code': 200, 'http.status_text': 'OK', 'http.route': '/post/:id' }, status: { code: 0 }, events: [], links: [] } ```

I expected it to show up as router - /post/:id but instead was just router - /

At first I naively expected something like "/*" for http.route (implying a name like router - /*), since the router is matching every route.

Now I'm thinking that doesn't make sense either, because http.route, as defined in the semantic conventions, only applies to a request and probably shouldn't describe how some framework internal routing is matching (even if it plays a role in the request route matching).

So I think your expectation fits better than mine. However, now I'm wondering if http.route should be included at all for router and middleware spans. They don't represent a request and don't have any other http span attributes. At least for people without Open Telemetry experience (like me) http.route could always be a bit confusing on these spans, because a router or middleware also has a route (path template) leading to it.

An alternative could be to use an attribute like express.route (or express.mount_path (router span) and similar), hopefully making it clearer what to expect.

Thinking about it this way, I even like the current naming of router spans being "router - \<Mount Path (Template)>". In the examples, "router - /" and "router - /post" are actually helpful. But maybe this is confusing for the people with more Open Telemetry experience (expecting http.route to be part of names, as it is for request spans).

I'm happy to provide a PR, which would address both issues (once I know which keys and values make sense).

neilfordyce commented 2 months ago

Thanks for looking into this. I've been facing similar issues fully populating http.route in Express applications with more complicated nested routers. I came across aspecto-io Express Instrumentation. I replaced @opentelemetry/instrumentation-express js-contrib package with Aspecto's opentelemetry-instrumentation-express locally and it seemed to set http.route with a more complete path template I was hoping for. Sadly it's marked as deprecated as of last week, but seems like a decent workaround or might be interesting to compare approaches.

JamieDanielson commented 2 months ago

However, now I'm wondering if http.route should be included at all for router and middleware spans.

@JohannesHuster Thank you for detailing your thoughts here! I can see how there are a few ways we could go about this. One thing I am checking out is how other languages may handle these types of middleware and router spans because as you mention, they're not the actual request spans so perhaps they shouldn't have any http attributes like this. There may still be good reasoning to use something like express.route, as you also point out and as it seems may be included in the aspecto instrumentation linked by @neilfordyce . We already have things like express.type and express.name in the instrumentation. I'll look further into this!

onurtemizkan commented 1 month ago

@JamieDanielson, I have been debugging this usage as it's also breaking the SDK in Sentry v8. I can try to wrap up a PR if you don't have an active work in progress.