Gzip Content-Encoding workaround

hroncok commented 7 years ago

Hi, @MarekSuchanek was recently hit by the fact that some content is recorded gzipped and base64-encoded. It is hard to parse the response in a hook to hide sensitive data when that data is buried in a binary blob.
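
To illustrate the problem, here is a minimal sketch using only the standard library. It assumes the recorded body is the gzip-compressed payload encoded as base64, as described above. The secret value, the JSON body, and the helper name `decode_recorded` are hypothetical and not part of Betamax.

```python
import base64
import gzip

# Hypothetical sensitive value and response body, for illustration only.
secret = "token-12345"
body = '{"access_token": "%s"}' % secret

# Assumed shape of what gets recorded: gzip bytes, then base64 text.
recorded = base64.b64encode(gzip.compress(body.encode("utf-8"))).decode("ascii")


def decode_recorded(recorded_body):
    """Reverse the assumed encoding: base64-decode, then gunzip."""
    return gzip.decompress(base64.b64decode(recorded_body)).decode("utf-8")


# A plain substring search over the recorded text cannot see the secret...
found_in_blob = secret in recorded
# ...but it is still there once the blob is decoded.
found_after_decoding = secret in decode_recorded(recorded)
```

So a hook that simply searches the recorded text for the secret would miss it, while anyone who decodes the cassette can still read it.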

In the docs, it says:

There is, at the present moment, no way to configure this so that this does not happen and because of the way that Betamax works, you can not remove the Content-Encoding header to prevent this from happening.

However, with the pytest fixture, simply doing:

betamax_session.headers.update({'accept-encoding': 'identity'})

bypasses the problem.
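
The same idea can be sketched as a small reusable helper. The function name `force_identity_encoding` is hypothetical. It only assumes that the session object exposes a dict-like `headers` attribute, as `requests.Session` does.

```python
def force_identity_encoding(session):
    """Ask servers not to compress response bodies for this session.

    `session` is assumed to expose a dict-like `headers` attribute,
    as requests.Session does.
    """
    session.headers.update({'accept-encoding': 'identity'})
    return session


# Possible use inside a pytest test that receives the betamax_session fixture:
#
# def test_api(betamax_session):
#     session = force_identity_encoding(betamax_session)
#     ...
```

Setting the header on the session before recording means every request made through it asks for an uncompressed body, so the cassette should contain plain text whenever the server honors the request.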

I was about to propose a PR to the docs, but I'd better discuss it with you before making a patch. The text at the end of the section might go like this:

You may, however, modify the request headers to indicate that you don't accept gzipped content:

import betamax
import requests

session = requests.Session()
# Ask the server not to compress the response body
session.headers.update({'accept-encoding': 'identity'})
recorder = betamax.Betamax(session, ...)
...

Comments?


smallnamespace commented 7 years ago

Can confirm, this is a useful trick for getting readable cassettes.

I suppose the risk is that the endpoint ignores or rejects the Accept-Encoding: identity header?

hroncok commented 7 years ago

Well, I can mention that risk as well.

Be aware that some endpoints may ignore or reject the Accept-Encoding: identity header.
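
One way to notice that risk early is to check the response headers before trusting the cassette. This is only a sketch: the helper name `is_uncompressed` is hypothetical, and it takes any mapping of header names to values.

```python
def is_uncompressed(headers):
    """Return True when the headers carry no compressing Content-Encoding."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    encoding = normalized.get("content-encoding", "identity")
    return encoding.strip().lower() in ("", "identity")


honored = is_uncompressed({"Content-Type": "application/json"})
ignored = is_uncompressed({"Content-Encoding": "gzip"})
```

With requests, this could be called as `is_uncompressed(response.headers)` right after a request. A False result would suggest the endpoint ignored the header and the body was still recorded compressed.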

betamaxpy / betamax

Gzip Content-Encoding workaround #124