ropensci / vcr

Record HTTP calls and replay them
https://docs.ropensci.org/vcr
Other
77 stars 12 forks source link

filter_sensitive_data and secret in URL when URI is used for matching #221

Closed maelle closed 3 years ago

maelle commented 3 years ago

cc @bretsw

maelle commented 3 years ago

Problem: an API where the API key is passed as URL parameter. It seems to me that vcr doesn't transform the URI before trying to match it URI in cassettes.

So

maelle commented 3 years ago

this doesn't explain why the tests pass on my machine though?!

maelle commented 3 years ago

Example of a CI build whose artefacts contain a key (sharing now that the key has been made invalid) https://github.com/bretsw/tidytags/actions/runs/518185227

maelle commented 3 years ago

In the artefact testthat.Rout.fail

An HTTP request has been made that vcr does not know how to handle:
GET http://api.opencagedata.com/geocode/v1/json/?q=New%20York%2C%20NY&limit=1&no_annotations=1&no_dedupe=0&no_record=0&abbrv=0&add_request=1&key=%2208c2339f937d4c8e9e390853e7fb50e1%22
vcr is currently using the following cassette:
maelle commented 3 years ago

@sckott note that the problem does not happen when there is no environment variable set by GitHub Action https://github.com/bretsw/tidytags/actions/runs/521323642 (this is a fork)

sckott commented 3 years ago

The first thing that looks odd to me is that they exposed key has quotes around it, Why would it be in quotes?

maelle commented 3 years ago

yeah weird

sckott commented 3 years ago

@bretsw To deal with the quotes issue, did you store your opencage key with quotes around it in the Github secrets settings?

maelle commented 3 years ago

in the log there was no quote

image

bretsw commented 3 years ago

@bretsw To deal with the quotes issue, did you store your opencage key with quotes around it in the Github secrets settings?

I think this was likely the issue with the OpenCage key. I cleared out my GH secrets and the build seemed to work (at least for OpenCage). Still perhaps not great that a failed build exposed a secret, but helpful to narrow in on the problem

sckott commented 3 years ago

Sorry for delay on this, have this on the to do list for tomorrow

bretsw commented 3 years ago

No problem! Thanks for tackling

sckott commented 3 years ago

It seems to me that vcr doesn't transform the URI before trying to match it URI in cassettes

I don't think this is a bug in vcr's request matching given that this only happened in one test - i'll keep looking of course, but I don't think it's a problem in vcr's matching

sckott commented 3 years ago

I've been trying to replicate this problem with this dummy R package https://github.com/sckott/fizball - but have not been able to make a key be hidden in the logs but not hidden in the artifacts when there's a test failure

any ideas for what to try?

maelle commented 3 years ago

@sckott the tests/.Rout.fail contains the key https://github.com/maelle/fizball/actions/runs/554705151 :tada:

I saved it with quotes see below

image

maelle commented 3 years ago

The whole recipe for failure is

sckott commented 3 years ago

Thanks. I guess one approach is to strip any single or double quotes around a secret before using it - BUT it's possible that a user could have quotes in a secret; seems unlikely a secret would have quotes at beginning and end of the string though.

maelle commented 3 years ago

one thing I don't understand is why vcr can't recognize the URL i.e. why it can't match when there are quotes? something with the string manipulation doesn't work in that case?

Your approach sounds good (removing when there are quotes at the beginning and at the end), but it'd be good to have a warning about the quotes being stripped, with some option to turn this warning off (to be used in the very unlikely case that such quotes are needed).

I was also thinking that folks might fix their secret on their own once they try doing real requests (turn vcr off) in a workflow, because their real requests will fail.

Also, I need to add screenshots to the book so oprened an issue there: https://github.com/ropensci-books/http-testing/issues/82

sckott commented 3 years ago

why it can't match when there are quotes?

It's identical() internally I think, so if if url doesn't match exactly, one has quotes and one does not, then there's no match.

Agree if we strip leading and trailing single/double quotes, then throw a warning about it

maelle commented 3 years ago

yes sounds good

sckott commented 3 years ago

okay, @bretsw @maelle just pushed a fix. Can you try again if you get a chance?

maelle commented 3 years ago

It seems to have worked https://github.com/maelle/fizball/actions/runs/631706778

sckott commented 3 years ago

thanks for testing