Open AE777F63 opened 5 years ago
Moved the apm repo as this seems like something agents should align on.
It is a common approach to pass things like "key" or "access_token" via query.
I would strongly advise not to put in sensitive information in query parameters. See also this OWASP article: https://www.owasp.org/index.php/Information_exposure_through_query_strings_in_url.
Given that there's a workaround and that it's bad practice to put sensitive information in the URL, I'm not sure we should invest time in each of our agents to support query parameter sanitization. Before we can sanitize them, we also have to parse them first which causes additional overhead in some languages/frameworks.
Another route we could go (@SylvainJuge mentioned that in a conversation I had with him) is to put some of the more expensive sanitization logic in the APM server. It could also search for patterns like social security numbers and credit card numbers which might be a bit too heavy to do on the agent itself. It could then also check for sensitive information in query parameters.
I would strongly advise not to put in sensitive information in query parameters
While I totally agree with that, in my case that's not something I have control over, and in my experience this approach is very common despite being sub-optimal in terms of security. But I see why you may not want to include that functionality as a part of the agent.
Before we can sanitize them, we also have to parse them first which causes additional overhead in some languages/frameworks.
This kind of filtering doesn't really have to be enabled by default, though.
Can this not be done via apm.addSpanFilter()
added in https://github.com/elastic/apm-agent-nodejs/pull/579 ?
What I do think would make sense though is having the ability to completely stop collecting the query parameter. But sanitizing specific query parameters would come with a performance hit.
Using custom filters as Trent suggested is a good option.
Another option is using an Elasticsearch ingest node processor. But that means that the sensitive information has already traveled over the network.
To avoid that, you could install an APM Server on the same host where your application is deployed and use processors to strip sensitive information. This will both avoid overhead in the application and avoid the sensitive data to travel over the network while providing a solution that is language agnostic and doesn't require a new deployment of your application.
+1 on adding ability to stop collecting query parameter, that should probably be disabled by default and when enabling it it would be clearly written in doc (or even in Kibana config) that user should ensure that no sensitive parameter is being passed through URL. If they aren't sure, they should not set it.
@trentm not all agents have the concept of agent-level filtering (for example Java does not have it).
This would be very useful. We have a need to filter GDPR related information away from query parameters and as such either completely disabling collection of query parameters or filtering them would be really good.
https://www.elastic.co/guide/en/apm/agent/java/current/config-core.html#config-sanitize-field-names
Is this not "it" ?
I've tried setting that parameter to the following, url.full
, url.query
url*
*url*
. It did not have the desired outcome. I could see that the config was accepted because the previously redacted metadata was now visible. My current config is below to keep the defaults in place but still no reaction of url fields. Any thoughts or suggestions on that?
ELASTIC_APM_SANITIZE_FIELD_NAMES
password, passwd, pwd, secret, *key, *token*, *session*, *credit*, *card*, authorization, set-cookie, *url*
+1 @SylvainJuge what is the status of this issue ?? it is aleady open for more than 2 years and half!! Should be solved or closed no ??
Is your feature request related to a problem? Please describe. The agent has an option to mask sensitive data from headers (filterHttpHeaders); there is no such option for query parameters. It is a common approach to pass things like "key" or "access_token" via query.
Describe the solution you'd like The agent should provide an option for masking specified query parameters.
Describe alternatives you've considered Writing a custom filter:
This, however, needs to be duplicated in every service, requires the URL to be parsed again, and is not very robust. It would be nice to have this functionality provided by the agent directly.