Open patrick91 opened 4 years ago
If anyone has the same issue, here's the workaround I'm using:
import tracer from 'dd-trace';
// from https://github.com/DataDog/dd-trace-js/blob/86c57e0eaba9c83f6067752f83af3d25ec578e3b/packages/dd-trace/src/plugins/util/web.js#L408
// we changed `req.headers['host']` with `req.host`
function extractURL(req) {
const headers = req.headers;
if (req.stream) {
return `${headers[HTTP2_HEADER_SCHEME]}://${headers[HTTP2_HEADER_AUTHORITY]}${headers[HTTP2_HEADER_PATH]}`;
} else {
const protocol = req.connection.encrypted ? 'https' : 'http';
return `${protocol}://${req.host}${req.originalUrl || req.url}`;
}
}
tracer.init();
tracer.use('express', {
hooks: {
request: (span, req, res) => {
const url = extractURL(req);
span.setTag('http.url', url.split('?')[0]);
},
},
});
export default tracer;
The reasoning was that web.js
is used by all web frameworks including Express. Since req.host
is set by Express, using it would only work for Express. I think we should instead replicate the Express logic to properly handle the different headers in every framework, assuming this is the expected behaviour.
The reasoning was that
web.js
is used by all web frameworks including Express. Sincereq.host
is set by Express, using it would only work for Express. I think we should instead replicate the Express logic to properly handle the different headers in every framework, assuming this is the expected behaviour.
That makes total sense. But I wonder about replicating what express does, as it is something that depends on a configuration. What about using req.host if it exists and falling back on the headers if not?
Not sure what other framework do 🤔
Do you think it would make sense to add a configuration on the tracer as well? It definitely makes sense to use req.host
in any case though. I didn't realize Express had a configuration for it which the tracer should definitely pick up.
Do you think it would make sense to add a configuration on the tracer as well? I'd try to find a good way to use what the framework provides us, so dd-trace-js doesn't do anything too complex
It definitely makes sense to use
req.host
in any case though. I didn't realize Express had a configuration for it which the tracer should definitely pick up.
I'm checking also other frameworks:
uses hostname
: https://www.fastify.io/docs/latest/Request/#request
also configurable to trust proxies: https://www.fastify.io/docs/latest/Server/#trustproxy
from the docs:
request.host
Get host (hostname:port) when present. Supports X-Forwarded-Host when app.proxy is true, otherwise Host is used.
request.hostname
Get hostname when present. Supports X-Forwarded-Host when app.proxy is true, otherwise Host is used.
If host is IPv6, Koa delegates parsing to WHATWG URL API, Note This may impact performance.
I wasn't really able to find much, but it seems they do things a bit differently: https://stackoverflow.com/questions/31840286/how-to-get-the-full-url-for-a-request-in-hapi
so maybe we should pass a function to fetch the host/url to the web util, so we can customise the behaviour per framework?
so maybe we should pass a function to fetch the host/url to the web util, so we can customise the behaviour per framework?
Right, it really looks like each one has their own config and their own way to support the header. Then I agree that the best approach would be by framework since it would mean we can support the configuration out of the box.
I can attempt a PR for express :)
Not sure where I could do this though, maybe I can pass a function in normalize config, here: https://github.com/DataDog/dd-trace-js/blob/8c5d352bcac169f4bda03830566ffbbd29456791/packages/datadog-plugin-express/src/index.js#L7
this function would be used to get the correct host. The other place I see is instrument, but doesn't look like it is the best place for something like that.
I would say the easiest is probably to add a setHost
on web.js
that plugins can use to set the actual host and that would default to the Host header (current behaviour).
@rochdev it is probably faster to ask you, where would I put the host? I think I need to call setHost before instrumenting the request, so I can't do req._datadog.host = xxx
, do you have any suggestion?
@patrick91 Can you clarify why you can't use req._datadog
? Where are you currently calling setHost
? Maybe the config can be stored somewhere so that setHost
can be called later?
if I understood correctly req._datadog
is set in .patch, which is called in instrument function, here's a very WIP PR I've done:
https://github.com/DataDog/dd-trace-js/pull/1082/files#diff-0b84b4276780e2d2c5b9e1c34dc86b78R57
I guess I could create _datadog when needed, but doesn't feel right
It's possible to call patch
directly before calling instrument
, even multiple times if needed. This was added explicitly for this use case of adding _datadog
before instrumenting.
@rochdev thanks! I'm having some troubles with req
not being the modified request from express, not sure what's happening, I'll need to spend a bit of time on it, as I'm unfamiliar with node and express, maybe I'll ask around for some help 😊
Hi, I was trying to improve our tracing to show the correct hostname, we run lambdas under API Gateway with a Cloudfront proxy in front of it, so the host in the header is not the right one, since the right one is in the
X-Forwarded-Host
header.You can allow express to trust the
X-Forwarded-Host
header, see: https://expressjs.com/en/guide/behind-proxies.htmlThis will set the correct host on
req.host
, but looks like this is not being used by dd-trace-js[1]. Is there any reason for that?[1] https://github.com/DataDog/dd-trace-js/blob/master/packages/dd-trace/src/plugins/util/web.js#L415