Azure / azure-functions-python-library

Azure Functions Python SDK
MIT License

Implement read1 method for InputStream #92

Closed hardikpnsp closed 1 year ago

hardikpnsp commented 3 years ago

Description

InputStream inherits from BufferedIOBase, which defines a read1 method that returns an arbitrary number of bytes instead of everything until EOF. InputStream does not implement this method. Libraries like pandas provide functions that read from an IO stream, and they seem to call read1 internally on that stream, so they throw an error when an InputStream is passed to them directly.
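The failure can be sketched without pandas or the Azure package, using only the standard library. This is a minimal illustration under assumptions: StandInStream and StandInStreamWithRead1 are hypothetical stand-ins, not the real InputStream, and the example relies on CPython's io.TextIOWrapper calling read1 on its underlying buffer when it reads a line.

```python
import io


class StandInStream(io.BufferedIOBase):
    """Hypothetical stand-in for InputStream: has read() but no read1()."""

    def __init__(self, data: bytes) -> None:
        self._inner = io.BytesIO(data)

    def readable(self) -> bool:
        return True

    def read(self, size: int = -1) -> bytes:
        return self._inner.read(size)


class StandInStreamWithRead1(StandInStream):
    """Same stand-in, with read1() delegating to read()."""

    def read1(self, size: int = -1) -> bytes:
        return self.read(size)


def first_line(stream: io.BufferedIOBase) -> str:
    # readline() makes TextIOWrapper fetch a chunk via the buffer's read1()
    return io.TextIOWrapper(stream, encoding="utf-8").readline()


data = b"a,b,c,d\n1,2,3,4"

try:
    first_line(StandInStream(data))
except io.UnsupportedOperation as exc:
    print(f"without read1: {exc!r}")

print(repr(first_line(StandInStreamWithRead1(data))))
```

Delegating read1 to read is the simplest option and matches the workaround in the example below. It ignores the "at most one underlying read" aspect of read1, which should not matter for a stream that is already fully in memory.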

Reproducible Example

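import io

import pandas as pd
from azure.functions.blob import InputStream

def error():
    i = InputStream(data=b'a,b,c,d\n1,2,3,4')
    # This throws a read1() UnsupportedOperation exception
    df = pd.read_csv(i, sep=",")

def hack():
    i = InputStream(data=b'a,b,c,d\n1,2,3,4')

    def read1(self, size: int = -1) -> bytes:
        # Delegate to read(), which InputStream already implements
        return self.read(size)

    # Patches the class itself, so every InputStream gains read1 from here on
    InputStream.read1 = read1

    # This works because we hacked a read1 method into InputStream
    with pd.read_csv(i, sep=",", chunksize=1) as reader:
        for chunk in reader:
            print(chunk)

if __name__ == "__main__":
    # Run error() first: hack() patches the class, so calling it first
    # would hide the failure.
    try:
        error()
    except io.UnsupportedOperation as exc:
        print(f"error() failed as expected: {exc!r}")
    hack()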

Versions: Python 3.7, pandas 1.2.4
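A possible alternative to patching the class is to copy the stream's bytes into an io.BytesIO, which already implements read1, and pass that to the consumer instead. The sketch below assumes the whole payload fits in memory. to_bytesio and ReadOnlyStream are hypothetical names used for illustration only, with ReadOnlyStream standing in for any stream that has read but no read1.

```python
import io


def to_bytesio(stream: io.BufferedIOBase) -> io.BytesIO:
    """Hypothetical helper: copy the remaining bytes of `stream` into a BytesIO."""
    return io.BytesIO(stream.read())


class ReadOnlyStream(io.BufferedIOBase):
    """Hypothetical stand-in for a stream that has read() but no read1()."""

    def __init__(self, data: bytes) -> None:
        self._inner = io.BytesIO(data)

    def readable(self) -> bool:
        return True

    def read(self, size: int = -1) -> bytes:
        return self._inner.read(size)


buffered = to_bytesio(ReadOnlyStream(b"a,b,c,d\n1,2,3,4"))
# BytesIO provides read1(), so consumers that call it no longer fail
print(buffered.read1(-1))
```

The trade-off is that the data is read in full before the consumer sees it, so this gives up any benefit of chunked reading from the original stream.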

Use cases

References

Someone had implemented read1 in the ABC of the python-worker repo in this PR. It seems the details were lost in a forced push. (This might be unrelated, I'm not sure.)