Open bartoszmajsak opened 6 years ago
My idea to fix this would be to keep a properties file with id=value pairs, where each id must start with # (to identify it as an element that must be replaced) and the value is the real value. For example #123=56543fd
As an example:
#1234=2123123adaczxcasdaq2231231223
"headers" : {
"Authorization" : [ "token #1234" ]
}
And then run hoverctl middleware --binary python --script secrets.py, with the key to decrypt the properties file set in an environment variable (if we think this file should be encrypted).
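A minimal sketch of what such a secrets.py could look like, not a tested PoC. It assumes that the middleware receives the request/response pair as JSON on stdin and writes the modified pair to stdout, and that request headers are stored as lists of strings, as in the fragment above. The file name secrets.properties and the function names are hypothetical, and decryption of the properties file is left out.

```python
import json
import sys


def load_secrets(text):
    """Parse lines such as '#1234=2123...' into {'#1234': '2123...'}."""
    secrets = {}
    for line in text.splitlines():
        line = line.strip()
        # Only lines whose id starts with '#' are treated as secrets.
        if line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            secrets[key.strip()] = value.strip()
    return secrets


def unmask_headers(pair, secrets):
    """Replace placeholder ids in request header values with the real values."""
    headers = pair.get("request", {}).get("headers") or {}
    for name, values in list(headers.items()):
        replaced = []
        for value in values:
            # Longest ids first, so '#12' cannot clobber part of '#1234'.
            for key in sorted(secrets, key=len, reverse=True):
                value = value.replace(key, secrets[key])
            replaced.append(value)
        headers[name] = replaced
    return pair


def main(path="secrets.properties"):
    # Hypothetical file name; the pair layout on stdin is an assumption.
    with open(path, encoding="utf-8") as handle:
        secrets = load_secrets(handle.read())
    pair = json.load(sys.stdin)
    json.dump(unmask_headers(pair, secrets), sys.stdout)
```

Calling main() at the bottom of the script would turn it into the entry point for the hoverctl command above. The reverse direction (replacing real values with placeholders during capture) would be the same loop with keys and values swapped.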
Thanks for your input @lordofthejars. Does it mean that for each captured interaction I would need to go and replace the corresponding value with the proper placeholder by hand? Maybe we can make it a bit easier?
Do you have a PoC middleware which we can play around with?
No PoC yet.
If we know beforehand which fields we want to encrypt, we can automate it. For our purposes, I think I'll restrict it to the GitHub token header.
At this point, when we run in capture mode, we can autogenerate everything.
In the next version, we can let the user choose which fields to encrypt via a configuration file.
I was thinking about a slightly simpler approach. We could use JSONPath to define which elements we would like to encrypt, and a secret to be used for this encryption. For example:
File defining what to mask (a simple list):
headers.authorization
After applying encryption (with the key loaded from an environment variable or, if possible, passed as a flag through the middleware CLI):
"headers" : {
"Authorization" : [ "{öÆÀêáDwæt÷hºï"vsvçÿfóº3ÓEyѯæ;·³÷}Â2JŽGi/VAý" ]
}
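A rough sketch of the path-list idea, under a few stated assumptions: paths are simple dotted keys rather than full JSONPath, keys are matched case-insensitively (so headers.authorization finds "Authorization"), and the encryption is swapped for a keyed one-way HMAC-SHA256 mask because the Python standard library has no symmetric cipher. A reversible mask would need a real encryption library. All function names are hypothetical.

```python
import hashlib
import hmac


def find_key(mapping, name):
    """Return the actual key in mapping that matches name case-insensitively."""
    for key in mapping:
        if key.lower() == name.lower():
            return key
    return None


def mask_value(value, secret):
    """Keyed one-way mask: the HMAC-SHA256 hex digest of the value."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_paths(document, paths, secret):
    """Mask the values found at each dotted path, modifying document in place."""
    for path in paths:
        parts = path.split(".")
        node = document
        # Walk down to the parent of the final path segment.
        for part in parts[:-1]:
            key = find_key(node, part) if isinstance(node, dict) else None
            if key is None:
                node = None
                break
            node = node[key]
        if not isinstance(node, dict):
            continue  # path does not exist in this document
        leaf = find_key(node, parts[-1])
        if leaf is None:
            continue
        current = node[leaf]
        if isinstance(current, list):
            node[leaf] = [mask_value(str(item), secret) for item in current]
        else:
            node[leaf] = mask_value(str(current), secret)
    return document
```

The secret would come from the environment, for example os.environ["MASK_KEY"].encode("utf-8"), where MASK_KEY is a made-up variable name. Paths that do not exist in a given request are skipped silently, so one definition file can be applied to every captured pair.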
Working on this approach then.
@lordofthejars can you update on our latest conclusion?
The problem is that Hoverfly middleware does not store simulations in modify mode, and we need modify mode to be able to mask/unmask requests and responses. We also need to store simulations so that we don't have to be online all the time. The solution proposed by the Hoverfly guys is to run two Hoverfly proxies, one in modify mode and another in simulate mode, and redirect all traffic between them. I don't like this solution much, since it complicates the test a lot for a simple use case. The Hoverfly guys then told me that they are going to work on this, but it is not yet known when.
I'm asking myself: why do we need the token in the simulation config? Correct me if I'm wrong - the request in the simulation config is just for matching the request-response pair, to know which response should be returned. If I'm correct, then why can't we have something like:
"headers" : {
"Authorization" : ".+"
}
which means that the value can contain some regex (or maybe just "*" or "*****") saying that something should be set for authorization purposes, but not saying what exactly. This should work in simulation mode. For capture mode, I agree with Bartosz that it could be configured (using JSONPath) which parameters should be hidden and not stored.
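One way to get there without touching Hoverfly itself would be to post-process the captured simulation file before committing it. The sketch below is only an illustration: it walks any JSON structure and, wherever it finds a "headers" object, replaces the Authorization values with a placeholder. It makes no assumption about the rest of the simulation layout. Whether Hoverfly then treats ".+" as a regex (rather than a literal string) depends on the matcher support of the version in use, which is not verified here. The function name is hypothetical.

```python
def scrub_authorization(node, placeholder=".+"):
    """Recursively replace Authorization header values with a placeholder."""
    if isinstance(node, dict):
        headers = node.get("headers")
        if isinstance(headers, dict):
            for name in headers:
                if name.lower() == "authorization":
                    values = headers[name]
                    # Keep the original shape: list of values or single value.
                    headers[name] = (
                        [placeholder for _ in values]
                        if isinstance(values, list)
                        else placeholder
                    )
        for child in node.values():
            scrub_authorization(child, placeholder)
    elif isinstance(node, list):
        for child in node:
            scrub_authorization(child, placeholder)
    return node
```

Typical use would be json.load on the simulation file, scrub_authorization on the result, then json.dump back to disk, so the real token never reaches the repository.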
But maybe I'm missing something here...
Thanks for the research in this area, let's park it for now as we found a way of using globMatch to hide tokens (but not to re-use at this point)
@MatousJobanek I think this is not that much of a killer feature to have. If you use the original request to call the live service anyway, and the stored request is only used to resolve the stored response (e.g. to do the delta on it), I don't think we gain much from using it. It would only be a precaution against accidentally leaking recorded things.
I would opt for parking it.
Hi, Hoverfly peeps here. We are looking at enabling lifecycle hooks which might enable what you want without us having to implement data masking in Hoverfly. However, it seems like you don't have this as a high priority and we don't have other use-cases at the moment.
@JohnFDavenport yes, it is parked for now, but we might need it in the future. Thanks for the update.
Some of the information stored in the service virtualization files should not be exposed in plain text. Investigate how one can mask it in Hoverfly for tests, yet still share those simulation files, e.g. on GitHub.
Hint: might be possible by using middleware.
For example, a request in the simulation can contain an actual GitHub token if we want to virtualize their API. This should not be shared on GitHub (even though, when you do, GitHub is smart enough to detect the token and revoke it, so it is no longer valid).