kevin1024 / vcrpy

Automatically mock your HTTP interactions to simplify and speed up testing
MIT License

CannotOverwriteExistingCassetteException with no matchers failed #533

Open BenMatase opened 4 years ago

BenMatase commented 4 years ago

I'm trying to remove access tokens from the response bodies of some recorded HTTP transactions so that I can check the sanitized cassettes into source control. To do that, I've added a before_record_response hook to the VCR.

import json

import pytest

def scrub_access_token(response):
    body = response.get("body", {}).get("string")
    if body is None:
        return response
    try:
        body = json.loads(body)
    except Exception:
        return response

    if "access_token" not in body:
        return response

    body["access_token"] = "REDACTED"
    new_body = json.dumps(body)
    response["body"]["string"] = new_body
    response["headers"]["Content-Length"] = len(new_body)
    return response

@pytest.mark.vcr(before_record_response=scrub_access_token)
def test_foo():
    ...

The first run succeeds and correctly removes the field from the body, but on subsequent runs I get an error when that request is made.

E               vcr.errors.CannotOverwriteExistingCassetteException: Can't overwrite existing cassette ('/home/bmatase/cassettes/test_become_active.yaml') in your current record mode ('none').
E               No match for the request (<Request (GET) http://169.254.169.254/metadata/identity/oauth2/token?resource=https%3A%2F%2Fmanagement.core.windows.net%2F&api-version=2018-02-01>) was found.
E               Found 1 similar requests with 0 different matcher(s) :
E               
E               1 - (<Request (GET) http://169.254.169.254/metadata/identity/oauth2/token?resource=https%3A%2F%2Fmanagement.core.windows.net%2F&api-version=2018-02-01>).
E               Matchers succeeded : ['method', 'scheme', 'host', 'port', 'path', 'query']
E               Matchers failed : 

I don't understand why the two requests don't match when every matcher succeeded. It seems odd that modifying the response would stop the requests from matching. I originally thought the mismatched content length was the cause, but the error still occurs after adding the Content-Length update.

I don't think this is a duplicate of #516 since there aren't two identical requests in the error message.

tylergannon commented 3 years ago

I have the same issue except that it is intermittent. From one invocation to the next, without overwriting the same cassette file, I can't predict whether I will get the error or not.

E               vcr.errors.CannotOverwriteExistingCassetteException: Can't overwrite existing cassette ('fixtures/vcr_cassettes/import_fixtures.yaml') in your current record mode ('once').
E               No match for the request (<Request (GET) https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/smiles/Cc1ccccc1/json>) was found.
E               Found 1 similar requests with 0 different matcher(s) :
E
E               1 - (<Request (GET) https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/smiles/Cc1ccccc1/json>).
E               Matchers succeeded : ['method', 'scheme', 'host', 'port', 'path', 'query']
E               Matchers failed :

What it looks like is that it's precisely the same request.

tylergannon commented 3 years ago

Is it possible that this is a case where there is incomplete feedback in the case of reuse of a request inside a cassette? I circumvented my problem by enabling allow_playback_repeats. Given that, my guess is that perhaps I'm inadvertently requesting the same URL twice even though it was only requested once during recording. It's possible that some additional feedback might help users to better understand this functionality.
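The guess above can be illustrated with a toy model of one-shot playback. This is only a sketch: ToyCassette and the example URL are hypothetical and are not vcrpy's implementation. The play-count check mirrors the condition visible in the patched play_response later in this thread.

```python
from collections import Counter


class ToyCassette:
    """Toy stand-in for a cassette; NOT vcrpy's real class."""

    def __init__(self, recorded, allow_playback_repeats=False):
        self.recorded = list(recorded)  # list of (request, response) pairs
        self.allow_playback_repeats = allow_playback_repeats
        self.play_counts = Counter()  # recording index -> times played

    def play_response(self, request):
        for index, (stored, response) in enumerate(self.recorded):
            if stored != request:
                continue
            # A recording is served once unless repeats are allowed.
            if self.play_counts[index] == 0 or self.allow_playback_repeats:
                self.play_counts[index] += 1
                return response
        raise LookupError(f"no unplayed recording for {request!r}")


url = "GET https://example.invalid/resource"  # hypothetical request key

strict = ToyCassette([(url, "recorded body")])
first = strict.play_response(url)
try:
    strict.play_response(url)  # same request again, recorded only once
    second_failed = False
except LookupError:
    second_failed = True

lenient = ToyCassette([(url, "recorded body")], allow_playback_repeats=True)
replies = [lenient.play_response(url) for _ in range(3)]
```

In this model the second identical request fails even though it compares equal to the stored one, which would be consistent with an error report that lists no failed matchers.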

vladdyk commented 3 years ago

I have the exact same issue as tylergannon

rdemetrescu commented 3 years ago

I'm having this issue as well.

It seems to be a problem here:

https://github.com/kevin1024/vcrpy/blob/c79a06f639dd628536c9868044e78df1012985f9/vcr/stubs/__init__.py#L230-L233

filter_request will call our own before_record_response, where we could avoid modifying the response if we knew the cassette was write-protected. But we don't get any information about the cassette at all.

vxfield commented 3 years ago

+1, but in my scenario I do need to make multiple identical requests. As tylergannon mentioned, I was able to mitigate the issue with `cassette.allow_playback_repeats = True`.

will-misslin commented 3 years ago

+1, I am having this issue as well, but I am using the pytest-vcr, I can't pass allow_playback_repeats through to vcrpy. Will watch this thread.

uncreative commented 3 years ago

I'm having the same issue. I see it successfully find the response the first time; unfortunately, in python3.9/site-packages/urllib3/connectionpool.py I see it try again:

                # Python 2.7, use buffering of HTTP responses
                httplib_response = conn.getresponse(buffering=True) # <--- we successfully find the request, but with buffering we fail
            except TypeError:
                # Python 3
                try:
                    httplib_response = conn.getresponse() ### <--- Here we retry and can't find a matching request
                except BaseException as e:
                    # Remove the TypeError from the exception chain in
                    # Python 3 (including for exceptions like SystemExit).
                    # Otherwise it looks like a bug in the code.
                    six.raise_from(e, None)

So because buffering doesn't work, it tries again, but by then the recorded response for that request has already been marked as played...
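A toy reproduction of that hypothesis, under the assumption that the first call consumes the recording before the TypeError is raised. FakeConnection and fetch are hypothetical names for illustration only; this is not vcrpy's stub or urllib3's code.

```python
class FakeConnection:
    """Hypothetical one-shot stub; NOT vcrpy's connection class."""

    def __init__(self, recorded):
        self._unplayed = list(recorded)
        self.calls = 0

    def getresponse(self, **kwargs):
        self.calls += 1
        if not self._unplayed:
            raise LookupError("recording already played")
        response = self._unplayed.pop(0)  # the side effect happens first
        if kwargs:
            # Mimic a downstream call that rejects unknown keyword arguments.
            raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
        return response


def fetch(conn):
    # Same shape as the quoted fallback: try with buffering, then without.
    try:
        return conn.getresponse(buffering=True)
    except TypeError:
        return conn.getresponse()


conn = FakeConnection(["recorded body"])
try:
    fetch(conn)
    outcome = "ok"
except LookupError as exc:
    outcome = str(exc)
```

If the real stub behaved like this toy, a single logical request would use up its recording on the first attempt and fail on the fallback.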

MauriceBenink commented 3 years ago

I also ran into this issue today. It seems that allow_playback_repeats=True can fix it, as long as you don't have identical requests that need different responses.

However, I use pytest-vcr, which means I cannot set allow_playback_repeats. We also have situations where identical requests return different responses, and others where the same request is made multiple times and returns the same response.

For this I wrote a hacky patch that fixes it for me.

from vcr.cassette import Cassette, CassetteContextDecorator
from vcr.errors import UnhandledHTTPRequestError
from vcr.matchers import requests_match

class VCRRepeatPlayback:
    # This class is a hacky way to turn a dict into a class
    def __init__(self, **kwargs):
        for k,v in kwargs.items():
            setattr(self, k, v)

class PatchedCassette(Cassette):
    """
    Array which contains all the matchers which got configured, if these match then
    allow this specific request to be replayed
    """
    allow_playback_repeats_matches = [
        VCRRepeatPlayback(method='GET', path='/foo/bar', query=[], any_other_matcher_you_configured='foo'),
    ]

    def __init__(self, *args, playback_repeats_on_match=None, **kwargs):
        self.playback_repeats_on_match = playback_repeats_on_match or dict()
        super().__init__(*args, **kwargs)

    def _load(self):
        super()._load()
        self._populate_playback_repeats_on_match()

    def _populate_playback_repeats_on_match(self):
        """
        Initial population for the cassette that gets loaded.
        Will check every cassette entry request, if it matches with any self.allow_playback_repeats_matches
        Then it will be treated as if self.allow_playback_repeats=True for this request only
        :return:
        """

        # Prevent executing this multiple times
        if len(self.playback_repeats_on_match) > 0:
            return

        for allow_playback_match in self.allow_playback_repeats_matches:
            for index, (stored_request, response) in enumerate(self.data):
                # Prevent overwriting entry which was already matched before
                if self.playback_repeats_on_match.get(index, False):
                    continue
                # This check is very hacky, requests_match technically expects a request object
                # However we just feed it a class which copies its attributes from a dict
                self.playback_repeats_on_match[index] = requests_match(
                    allow_playback_match, stored_request, self._match_on
                )

    def play_response(self, request):
        """
        Get the response corresponding to a request, but only if it
        hasn't been played back before, and mark it as played
        """
        for index, response in self._responses(request):
            # Added allow_playback will replay if it is marked as allowed to replay
            if self.play_counts[index] == 0 or self.allow_playback_repeats or self.playback_repeats_on_match[index]:
                self.play_counts[index] += 1
                return response
        # The cassette doesn't contain the request asked for.
        raise UnhandledHTTPRequestError(
            "The cassette (%r) doesn't contain the request (%r) asked for" % (self._path, request)
        )

    def __contains__(self, request):
        """Return whether or not a request has been stored"""
        for index, response in self._responses(request):
            # Added allow_playback will replay if it is marked as allowed to replay
            if self.play_counts[index] == 0 or self.allow_playback_repeats or self.playback_repeats_on_match[index]:
                return True
        return False

# This changes which class gets used as the Cassette class. 
vcr_allow_playback_repeats_class = PatchedCassette

def apply_patch():
    """
    patch methods/functions where Cassette class gets applied by vcr.
    together with the new methods which replace the methods which get patched
    :return: 
    """
    @classmethod
    def patch_replace_class_use(cls, **kwargs):
        return CassetteContextDecorator.from_args(vcr_allow_playback_repeats_class or cls, **kwargs)

    @classmethod
    def patch_replace_class_use_arg_getter(cls, arg_getter):
        return CassetteContextDecorator(vcr_allow_playback_repeats_class or cls, arg_getter)

    Cassette.use = patch_replace_class_use
    Cassette.use_arg_getter = patch_replace_class_use_arg_getter

I can't guarantee this fixes it, and it might be version-specific (we use 4.1.1), but it did fix it for me.

nachocho commented 3 years ago

I was having the exact same problem, but I am using vcrpy-unittest, which is the recommended library for integration with unittest: https://vcrpy.readthedocs.io/en/latest/usage.html#unittest-integration.

I was able to fix it by setting allow_playback_repeats = True as suggested by @tylergannon (thanks for the workaround!). You only need to override the setUp method in order to access the cassette.

def setUp(self):
    super(MyTestClass, self).setUp()
    # Workaround for vcrpy issue #533
    self.cassette.allow_playback_repeats = True

Dantemss commented 3 years ago

The trick for me to avoid this error was to check the response body and not write it if it had already been modified, even if there shouldn't be any change (django/unittest with a JSON response):

import json

from vcr_unittest import VCRMixin
from django.test import TestCase

def before_record_response(response):
    try:
        response_json = json.loads(response['body']['string'])
    except json.decoder.JSONDecodeError:
        pass
    else:
        try:
            if 'access_token' in response_json and response_json['access_token'] != 'REDACTED':
                response_json['access_token'] = 'REDACTED'
                response['body']['string'] = json.dumps(response_json)
        except TypeError:
            pass

    return response

class MyVCRTestCase(VCRMixin, TestCase):

    def _get_vcr(self, **kwargs):
        kw = {'before_record_response': before_record_response, 'decode_compressed_response': True}
        kw.update(kwargs)
        return super()._get_vcr(**kw)

    def _get_vcr_kwargs(self, **kwargs):
        kw = {'filter_headers': ['Authorization'], 'filter_query_parameters': ['access_token']}
        kw.update(kwargs)
        return super()._get_vcr_kwargs(**kw)

EDIT: Only works when editing responses, not requests
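The same idea can be tried without Django or vcrpy. The sketch below works on a plain dict shaped like the responses shown in this thread; scrub_once and the sample payload are hypothetical, and writing the body back as bytes is an assumption rather than a documented requirement.

```python
import json


def scrub_once(response):
    """Redact access_token; leave the response untouched if nothing changes."""
    raw = response.get("body", {}).get("string")
    if raw is None:
        return response
    try:
        payload = json.loads(raw)
    except (TypeError, ValueError):
        return response  # not JSON, nothing to scrub
    if not isinstance(payload, dict):
        return response
    if payload.get("access_token") in (None, "REDACTED"):
        return response  # absent or already scrubbed: do not rewrite
    payload["access_token"] = "REDACTED"
    # Assumption: keep the body as bytes, the same type it arrived in.
    response["body"]["string"] = json.dumps(payload).encode("utf-8")
    return response


sample = {
    "body": {"string": b'{"access_token": "secret", "expires_in": 3600}'},
    "headers": {},
}
once = scrub_once(sample)
body_after_first = once["body"]["string"]
twice = scrub_once(once)  # second pass should be a no-op
```

Running the hook a second time leaves the body object unchanged, which is the property the comment above relies on.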

Mossaka commented 3 years ago

I tried to reproduce this error. Here are the steps I performed.

1. I ran the following test:

    import vcr
    import requests

    @vcr.use_cassette()
    def test_iana():
        requests.get('http://www.iana.org/domains/reserved')
        requests.get('http://www.iana.org/domains/reserved')

2. In the cassette file `test_iana`, I deleted one request.

3. I then ran the test again; here is the error message I got:

    E   vcr.errors.CannotOverwriteExistingCassetteException: Can't overwrite existing cassette ('/home/mossaka/developer/azureml-v2/sdk-cli-v2/jiazho_playground/test_iana') in your current record mode (<RecordMode.ONCE: 'once'>).
    E   No match for the request (<Request (GET) http://www.iana.org/domains/reserved>) was found.
    E   Found 1 similar requests with 0 different matcher(s) :
    E
    E   1 - (<Request (GET) http://www.iana.org/domains/reserved>).
    E   Matchers succeeded : ['method', 'scheme', 'host', 'port', 'path', 'query']
    E   Matchers failed :

iloveitaly commented 3 years ago

This PR combined with the following test/conftest.py file resolved the issue for me:

from vcr.persisters.deduplicated_filesystem import DeduplicatedFilesystemPersister

def pytest_recording_configure(config, vcr):
  vcr.register_persister(DeduplicatedFilesystemPersister)

Alexander-Serov commented 2 years ago

Indeed, allow_playback_repeats=True solves the issue. I guess we could just modify the error message to say that there is a matching request, but it has already been used and that this option should be provided to allow reusing it.
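A rough sketch of what such a message could distinguish. explain_miss and its arguments are hypothetical and do not exist in vcrpy; plain equality stands in for the configured matchers.

```python
def explain_miss(recorded_requests, play_counts, request):
    """Return a hint that separates 'never recorded' from 'already played'."""
    matches = [
        index
        for index, stored in enumerate(recorded_requests)
        if stored == request
    ]
    if not matches:
        return "no recorded request matches"
    if all(play_counts.get(index, 0) > 0 for index in matches):
        return (
            "a matching request was recorded but has already been played; "
            "consider allow_playback_repeats=True"
        )
    return "an unplayed match is available"


recorded = ["GET http://www.iana.org/domains/reserved"]
hint_used = explain_miss(recorded, {0: 1}, recorded[0])
hint_fresh = explain_miss(recorded, {}, recorded[0])
hint_unknown = explain_miss(recorded, {}, "GET http://example.invalid/")
```

The point is only that the "already played" case can be told apart from a genuine mismatch using the play counts.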

jriggins commented 2 years ago

👋🏾 I'm coming back to Pythonland after being out for a number of years and still trying to get my bearings. I also came across this with code very similar to the original poster's. I found this to actually work for me after modifying the response string. I have not looked into this enough to be able to tell you why though 😅 .

response['body']['string'] = bytes(json.dumps(json_body), 'utf8')

salomvary commented 1 year ago

Hey folks!

Found this issue after having spent half an hour trying to figure out the mysterious case with "no matchers failed". Turns out that indeed there was an unwanted repeat-request in my code.

Having a more helpful failure message would have saved me a bit of head-scratching. Would it make sense to change the message that prints out the fact that the request was repeated and even suggests turning on allow_playback_repeats?

dazza-codes commented 1 year ago

For pytest-vcr users, a conftest.py can contain some setup fixtures

The new_episodes record-mode solved this issue for me.

For example:

@pytest.fixture(scope="module")
def vcr_config():
    # For any live API requests, do not record the API-token
    # See https://vcrpy.readthedocs.io/en/latest/advanced.html
    return {
        "record_mode": "new_episodes",
        "filter_query_parameters": [("key", "APIKeyXXX")],
        "filter_headers": [("x-api-key", "X-API-KEY-XXX")]
    }

@pytest.fixture(scope='module')
def vcr(vcr):
    vcr.register_matcher('my_matcher', my_matcher)
    vcr.match_on = ['my_matcher']  # This can also go into vcr_config or marker kwargs
    return vcr

klarich commented 1 year ago

I am having the same issue and it seems to be intermittent. I have also tried setting allow_playback_repeats to true, which does not work for me, and setting RecordMode to NEW_EPISODES (which results in the test attempting to make the API call again, rather than using the cassette).

My workaround is to mock the API call that needs to be called twice in the test and was creating the issue.
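A minimal sketch of that kind of workaround using the standard library's unittest.mock. Client, fetch, and load_twice are hypothetical stand-ins for the real code under test.

```python
from unittest import mock


class Client:
    """Hypothetical API client used only for this illustration."""

    def fetch(self, url):
        raise RuntimeError("would hit the network")


def load_twice(client, url):
    # Code under test that legitimately requests the same URL twice.
    return [client.fetch(url), client.fetch(url)]


# Replace the network call for the duration of the block, so no cassette
# playback is involved for this particular request.
with mock.patch.object(Client, "fetch", return_value={"ok": True}) as fake_fetch:
    results = load_twice(Client(), "https://example.invalid/item")
```

The trade-off is that the mocked call is no longer checked against a recorded response, so this fits best where only the repeated call is problematic.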

prettyirrelevant commented 1 year ago

+1, I am having this issue as well, but I am using the pytest-vcr, I can't pass allow_playback_repeats through to vcrpy. Will watch this thread.

Might not be helpful to you anymore but using pytest-vcr, you can pass allow_playback_repeats like this

@pytest.mark.vcr(allow_playback_repeats=True)
def test_func() -> None:
    ...

paolorechia commented 7 months ago

For pytest-vcr users, a conftest.py can contain some setup fixtures

* https://pytest-vcr.readthedocs.io/en/latest/configuration/

The new_episodes record-mode solved this issue for me.

For example:

@pytest.fixture(scope="module")
def vcr_config():
    # For any live API requests, do not record the API-token
    # See https://vcrpy.readthedocs.io/en/latest/advanced.html
    return {
        "record_mode": "new_episodes",
        "filter_query_parameters": [("key", "APIKeyXXX")],
        "filter_headers": [("x-api-key", "X-API-KEY-XXX")]
    }

@pytest.fixture(scope='module')
def vcr(vcr):
    vcr.register_matcher('my_matcher', my_matcher)
    vcr.match_on = ['my_matcher']  # This can also go into vcr_config or marker kwargs
    return vcr

Thanks for sharing, this worked for me. EDIT: I spoke too soon. This causes the API to be fired again, as noted by @klarich. The suggestion from @prettyirrelevant sadly made no difference for me.