rickardp / splitstream

Continuous object splitter for C and Python
Apache License 2.0

Does streaming not work with compressed files? #19

Open synthesizerpatel opened 6 months ago

synthesizerpatel commented 6 months ago

First off, wonderful module, thank you for your work!

I'm wondering whether the module's C implementation makes it too low-level to work with compressed stream objects?

With the latest version of your code, I tried:

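import bz2
import json
from splitstream import splitfile

def parser(file):
    # Open the bz2 file in binary mode and hand the decompressed stream to splitfile
    with bz2.open(file, 'rb') as bzfh:
        for jsonstr in splitfile(bzfh, format="json"):
            print(jsonstr)
            yield json.loads(jsonstr)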

When I run it, I would expect jsonstr to be a string containing one complete JSON object {...}, but it looks like it is the compressed stream of data instead?

I was hoping to open bz2-compressed compact JSON files with many records back to back {..}{..}{..}{..} and process them one at a time.
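
For reference, here is a minimal sketch of the behaviour I'm after, written with only the standard library (bz2 and json.JSONDecoder.raw_decode) and not using splitstream at all. The function name iter_json_records and the chunk_size parameter are made up for illustration. It assumes the file is UTF-8 and that every record is a JSON object or array, since a bare top-level number split across a chunk boundary could be decoded too early.

```python
import bz2
import json


def iter_json_records(path, chunk_size=65536):
    """Yield each top-level JSON value from a bz2 file of back-to-back records.

    Hypothetical helper for illustration; it is not part of splitstream.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    # Text mode lets bz2 handle decompression and UTF-8 decoding for us.
    with bz2.open(path, "rt", encoding="utf-8") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            buffer += chunk
            # Decode as many complete records as the buffer currently holds.
            while True:
                buffer = buffer.lstrip()  # raw_decode does not skip leading whitespace
                if not buffer:
                    break
                try:
                    record, end = decoder.raw_decode(buffer)
                except json.JSONDecodeError:
                    break  # record is incomplete so far; read another chunk
                yield record
                buffer = buffer[end:]
    if buffer.strip():
        raise ValueError("trailing data is not a complete JSON record")
```

Used as `for record in iter_json_records("records.json.bz2"): ...`, this yields already-parsed objects rather than strings. It re-parses the buffer each time a record is incomplete, so it is likely much slower than a C splitter on large records, which is why I'd still prefer to use splitfile if it can read from the decompressed stream.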