requests / toolbelt

A toolbelt of useful classes and functions to be used with python-requests
https://toolbelt.readthedocs.org

Add a Stream class to upload a streamed response body #309

Open sathieu opened 3 years ago

sathieu commented 3 years ago

I want to do this:

        download_response = self._session.request(
            'GET',
            source_url,
            stream=True,
        )

        upload_response = self._session.request(
            'PUT',
            upload_url,
            data=Stream(download_response),
        )

When the source response has a Content-Length header, this is easily done with:

from requests import Response


class Stream(object):
    def __init__(self, response: Response) -> None:
        self._response = response

    def __len__(self) -> int:
        return int(self._response.headers.get('Content-Length'))

But when there is no Content-Length header, it is much harder: we need to implement __bool__, __len__, __iter__, ...
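
To illustrate why __bool__ ends up on that list, here is a minimal sketch using only plain Python. The names LengthOnly, AlwaysTruthy and pick_body are hypothetical and exist only for this illustration; pick_body imitates the `data or {}` style of fallback and is not code taken from requests.

```python
class LengthOnly:
    """Hypothetical wrapper that only reports a length of zero."""

    def __len__(self) -> int:
        return 0


class AlwaysTruthy(LengthOnly):
    """Same zero length, but explicitly truthy."""

    def __bool__(self) -> bool:
        return True


def pick_body(data):
    """Imitate a `data or {}` fallback (an assumption for illustration)."""
    return data or {}


length_only = LengthOnly()
always_truthy = AlwaysTruthy()

# Without __bool__, Python falls back to __len__, so a zero length is falsy
# and the wrapper is silently replaced by the empty dict.
replaced = pick_body(length_only)

# With __bool__ returning True, the wrapper survives the fallback.
kept = pick_body(always_truthy)
```

So a wrapper that reports a length of 0 for an unknown size also has to declare itself truthy, otherwise it never reaches the code that sends the body.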

Please provide an easy-to-use Stream class.

See https://gitlab.com/gitlabracadabra/gitlabracadabra/-/issues/37

sathieu commented 3 years ago

Here is something that works:

from typing import AnyStr, Iterator, Optional

from requests import Response


class Stream(object):
    """Stream."""

    def __init__(self, response: Response, chunksize: int = 65536) -> None:
        """Initialize Stream.

        Args:
            response: Streamed response.
            chunksize: Chunk size (used when there is no Content-Length header).
        """
        self._response = response
        self._chunksize = chunksize

    def __bool__(self) -> bool:
        """Stream as boolean.

        Needed for Session.request() which uses: data=data or dict().
        (otherwise, would be considered False when length is 0).

        Returns:
            Always True.
        """
        return True

    def __len__(self) -> int:
        """Get stream length.

        Returns:
            The stream length. Zero if there is no Content-Length header.
        """
        return int(self._response.headers.get('Content-Length', '0'))

    def __iter__(self) -> Iterator[bytes]:
        """Get an iterator of chunks of body.

        Returns:
            A bytes iterator.
        """
        return self._response.raw.stream(self._chunksize)  # type: ignore

    def read(self, size: Optional[int] = -1) -> AnyStr:
        """Read stream.

        Args:
            size: Length to read.

        Returns:
            The read bytes/str.
        """
        return self._response.raw.read(size)  # type: ignore
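
As a rough offline check of how the reported length affects the upload, the sketch below prepares a PUT request without sending it and inspects the resulting headers. It assumes a condensed copy of the class above (without read), a hypothetical fake_response helper standing in for a real streamed response, and a placeholder URL that is never contacted. The behaviour shown is what I would expect from current requests releases, so treat it as an illustration rather than a guarantee.

```python
from types import SimpleNamespace
from typing import Iterator

import requests


class Stream:
    """Condensed copy of the Stream class above, so the sketch runs alone."""

    def __init__(self, response, chunksize: int = 65536) -> None:
        self._response = response
        self._chunksize = chunksize

    def __bool__(self) -> bool:
        return True

    def __len__(self) -> int:
        return int(self._response.headers.get('Content-Length', '0'))

    def __iter__(self) -> Iterator[bytes]:
        return self._response.raw.stream(self._chunksize)


def fake_response(headers):
    """Hypothetical stand-in providing only .headers and .raw.stream()."""
    raw = SimpleNamespace(stream=lambda chunksize: iter([b'abc', b'def']))
    return SimpleNamespace(headers=headers, raw=raw)


def prepared_headers(response):
    """Prepare (but do not send) a PUT request and return its headers."""
    request = requests.Request(
        'PUT',
        'http://example.invalid/put',  # placeholder URL, never contacted
        data=Stream(response),
    )
    return request.prepare().headers


with_length = prepared_headers(fake_response({'Content-Length': '6'}))
without_length = prepared_headers(fake_response({}))
```

With a Content-Length on the source response, the prepared request carries the same Content-Length; with a length of 0, it is marked Transfer-Encoding: chunked and the body is sent chunk by chunk from __iter__.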

Usage:

        from requests import Session

        source_url = 'http://httpbin.org/stream-bytes/200'
        destination_url = 'http://httpbin.org/put'

        session = Session()
        download_response = session.request(
            'GET',
            source_url,
            stream=True,
        )

        upload_response = session.request(
            'PUT',
            destination_url,
            data=Stream(download_response),
        )

I've implemented it in gitlabracadabra. If you want to reuse it in toolbelt, just ask: the license is LGPL, but I can give you explicit permission to copy the code from gitlabracadabra/packages/stream.py@9960c998c4dd99bf1714427ed7f6d1c6ad55ea51 under the Apache 2.0 license.