garnaat / placebo

Make boto3 calls that look real but have no effect.
Apache License 2.0

StreamingBody Issue in Record Mode #48

Closed by fxfitz 7 years ago

fxfitz commented 8 years ago

Today I ran into a problem with the following code in record mode:

stdout_key = self._s3.Object(bucket_name='somebucket',
                             key='{}/stdout'.format(location))
result_output = stdout_key.get()['Body'].read()

No matter what, result_output was always ''. After some digging, I determined that the stream had already been read.

(Pdb) test = stdout_key.get()['Body']
(Pdb) test._raw_stream.tell()
5

Important note: the tests run fine and work as expected when in playback mode, and the content of the StreamingBody is saved correctly to the cassette.

My assumption: Placebo is read()ing the StreamingBody while in record mode so that it can save it to the cassette.

Is there any way we can still present the results correctly? I'd like to avoid having my tests fail in record mode and then pass during playback mode. :-P
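
The symptom can be reproduced without S3. This is a minimal sketch, assuming the response body behaves like any one-shot file-like object. `io.BytesIO` stands in for the raw stream here; it is not botocore's actual class.

```python
from io import BytesIO

# BytesIO stands in for the raw HTTP stream behind a StreamingBody.
raw_stream = BytesIO(b"hello")

# First read: what a recorder would do to capture the payload.
recorded = raw_stream.read()

# The position now sits at the end of the stream.
position = raw_stream.tell()

# Second read: what the calling code does. Nothing is left.
seen_by_caller = raw_stream.read()

print(recorded, position, seen_by_caller)
```

Whoever reads second gets an empty result, which would explain a non-zero tell() before my own code ever touches the body.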

fxfitz commented 8 years ago

I believe the issue lies here: https://github.com/garnaat/placebo/blob/develop/placebo/serializer.py#L57

Since the StreamingBody is read() there, the stream is exhausted and the body can no longer be handed back to boto.

Any ideas on how we can fix this?

Miserlou commented 8 years ago

Also running into this issue.

fxfitz commented 8 years ago

@Miserlou I've been monkeypatching my tests with a patched version of serialize/deserialize for now. Hopefully this helps you in the short term.

import placebo.pill
import pytest

import tests.util


@pytest.fixture(autouse=True)
def patch_placebo(monkeypatch):
    monkeypatch.setattr(placebo.pill, 'serialize', tests.util.serialize_patch)
    monkeypatch.setattr(placebo.pill, 'deserialize',
                        tests.util.deserialize_patch)

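# tests/util.py
# Imports the two helpers below rely on. The CaseInsensitiveDict import path
# is an assumption; use whichever class your responses actually contain.
import datetime
from io import BytesIO

from botocore.response import StreamingBody
from requests.structures import CaseInsensitiveDict


def deserialize_patch(obj):
    """Convert JSON dicts back into objects."""
    # Be careful of shallow copy here
    target = dict(obj)
    class_name = None
    if '__class__' in target:
        class_name = target.pop('__class__')
    # Use getattr(module, class_name) for custom types if needed
    if class_name == 'datetime':
        return datetime.datetime(**target)
    if class_name == 'StreamingBody':
        # The cassette stores the body as text, so encode it back to bytes
        return BytesIO(target['body'].encode('utf-8'))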
    if class_name == 'CaseInsensitiveDict':
        return CaseInsensitiveDict(target['as_dict'])
    # Return unrecognized structures as-is
    return obj

def serialize_patch(obj):
    """Convert objects into JSON structures."""
    # Record class and module information for deserialization
    result = {'__class__': obj.__class__.__name__}
    try:
        result['__module__'] = obj.__module__
    except AttributeError:
        pass
    # Convert objects to dictionary representation based on type
    if isinstance(obj, datetime.datetime):
        result['year'] = obj.year
        result['month'] = obj.month
        result['day'] = obj.day
        result['hour'] = obj.hour
        result['minute'] = obj.minute
        result['second'] = obj.second
        result['microsecond'] = obj.microsecond
        return result
    if isinstance(obj, StreamingBody):
        original_text = obj.read()

        # We remove a BOM here if it exists so that it doesn't get reencoded
        # later on into a UTF-16 string, presumably by the json library
        result['body'] = original_text.decode('utf-8-sig')

        obj._raw_stream = BytesIO(original_text)
        obj._amount_read = 0
        return result
    if isinstance(obj, CaseInsensitiveDict):
        result['as_dict'] = dict(obj)
        return result
    raise TypeError('Type not serializable')
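
For reference, here is a self-contained sketch of the read-then-rewind step, assuming the serializer is plugged into json through the `default` and `object_hook` hooks. `FakeStreamingBody` is a hypothetical stand-in that only mimics the two private attributes the patch touches (`_raw_stream`, `_amount_read`); it is not botocore's class. `serialize_body` and `deserialize_body` are cut-down illustrations, not the patch itself.

```python
import json
from io import BytesIO


class FakeStreamingBody:
    """Hypothetical stand-in for a one-shot response body."""

    def __init__(self, data):
        self._raw_stream = BytesIO(data)
        self._amount_read = 0

    def read(self):
        chunk = self._raw_stream.read()
        self._amount_read += len(chunk)
        return chunk


def serialize_body(obj):
    """Capture the payload for the cassette, then rewind the body."""
    if isinstance(obj, FakeStreamingBody):
        payload = obj.read()
        # Swap in a fresh stream so the caller can still read the data.
        obj._raw_stream = BytesIO(payload)
        obj._amount_read = 0
        return {'__class__': 'StreamingBody', 'body': payload.decode('utf-8')}
    raise TypeError('Type not serializable')


def deserialize_body(obj):
    """object_hook counterpart: rebuild a readable stream from the cassette."""
    if obj.get('__class__') == 'StreamingBody':
        return BytesIO(obj['body'].encode('utf-8'))
    return obj


response = {'Body': FakeStreamingBody(b'hello')}

# Record mode: json.dumps calls serialize_body for types it cannot encode.
cassette = json.dumps(response, default=serialize_body)

# The live response is still readable after being recorded.
live_data = response['Body'].read()

# Playback mode: the cassette yields a stream with the same content.
replayed_data = json.loads(cassette, object_hook=deserialize_body)['Body'].read()

print(live_data, replayed_data)
```

With the rewind in place, record mode and playback mode hand the caller the same bytes, which is the behavior I was after.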

markfink commented 8 years ago

@fxfitz I ran into the same issue. Your patch works for me. Great job, thank you!

fxfitz commented 8 years ago

Awesome, @markfink! Thanks for the feedback! :-D