Open andytwigg opened 7 years ago
Thank you for the feature request and the suggestion. The way paratext is architected, using a Python file handle would require random access on the file. This is not easily achievable with the Lempel-Ziv algorithm on which gzip is based: some files use a fixed dictionary in the header, but this is not true of all files. One would need to do a first sequential pass over the file to build the dictionary at each chunk start point. The threads would then be spawned, each decompressing its own chunk using the dictionary reconstructed for that chunk. We would welcome this contribution if someone wants to take a crack at it!
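For anyone picking this up, the two-pass idea described above can be sketched in pure Python with the standard `zlib` module. This is not paratext code, and the function names (`build_checkpoints`, `decompress_chunk`, `parallel_gunzip`) and the `step` parameter are hypothetical. It assumes a single-member gzip stream held in memory, and it uses `Decompress.copy()` to snapshot the decompressor state at each chunk start. Note that the first pass here still inflates the whole stream, so this illustrates the mechanism rather than a speedup; a real implementation would more likely save only the 32 KiB window at block boundaries, in the style of zlib's `zran.c` example.

```python
import gzip
import zlib
from concurrent.futures import ThreadPoolExecutor


def build_checkpoints(compressed, step):
    """Sequential pass: snapshot the decompressor state at every chunk start.

    `step` is a chunk size in *compressed* bytes (a hypothetical parameter).
    """
    d = zlib.decompressobj(31)  # wbits=31 selects the gzip container
    checkpoints = []
    for start in range(0, len(compressed), step):
        # The copy carries the sliding window ("dictionary") needed to resume here.
        checkpoints.append((start, d.copy()))
        # Output is discarded; we only advance the state to the next chunk start.
        d.decompress(compressed[start:start + step])
    return checkpoints


def decompress_chunk(compressed, checkpoint, step):
    """Worker: resume from a saved state and inflate only this chunk's bytes."""
    start, state = checkpoint
    return state.decompress(compressed[start:start + step])


def parallel_gunzip(compressed, step=1 << 16, workers=4):
    """Inflate each chunk in its own task and join the results in order."""
    checkpoints = build_checkpoints(compressed, step)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(
            lambda cp: decompress_chunk(compressed, cp, step), checkpoints
        )
        return b"".join(parts)


if __name__ == "__main__":
    original = "\n".join(str(i * i) for i in range(20000)).encode()
    blob = gzip.compress(original)
    assert parallel_gunzip(blob, step=1024) == original
```

Each checkpoint is consumed by exactly one worker, so no state is shared between threads.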
It would be nice to add support for opening .gz files. Ideally we could pass a file handle.
It seems the file handle is opened by the C code, so perhaps this is not practical, and it would be easier to add gzip reading support directly to the C code?
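Until something like that exists, one possible workaround is to decompress to a temporary file first, so the C code can open it by path as usual. A minimal sketch using only the standard library follows; `gunzip_to_tempfile` is a hypothetical helper, not part of paratext, and the `.csv` suffix is an assumption about the file contents.

```python
import gzip
import shutil
import tempfile


def gunzip_to_tempfile(gz_path, suffix=".csv"):
    """Stream-decompress gz_path into a temporary file and return its path.

    The caller is responsible for deleting the file when done.
    """
    with gzip.open(gz_path, "rb") as src, tempfile.NamedTemporaryFile(
        delete=False, suffix=suffix
    ) as dst:
        # copyfileobj streams in blocks, so the whole file is never held in memory.
        shutil.copyfileobj(src, dst)
        return dst.name
```

The returned path can then be handed to whichever paratext loader is already in use, at the cost of extra disk space and one sequential decompression pass.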