mxmlnkn / rapidgzip

Gzip Decompression and Random Access for Modern Multi-Core Machines
Apache License 2.0

std::ifstream interface #16

Closed Vadiml1024 closed 1 year ago

Vadiml1024 commented 1 year ago

It would be nice to have a std::ifstream-like interface for .gz and .bz files...

mxmlnkn commented 1 year ago

I think this could be done similarly to zfstream, and it might be well suited for a community contribution because it should not require modifying any internal code. Simply using the read/seek methods of ParallelGzipReader should theoretically be sufficient.
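As a rough illustration of that idea, the following is a minimal sketch of a std::streambuf adapter layered on top of a reader that offers read, seek, tell, and size. `MemoryReader` and `ReaderStreamBuf` are hypothetical names invented for this example; `MemoryReader` is an in-memory stand-in and does not reflect the actual ParallelGzipReader signatures, which would have to be checked before adapting this.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <ios>
#include <istream>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reader with read/seek/tell/size.
// This is NOT the rapidgzip API, only an assumption for the sketch.
class MemoryReader
{
public:
    explicit MemoryReader( std::string data ) : m_data( std::move( data ) ) {}

    std::size_t read( char* buffer, std::size_t size )
    {
        const auto n = std::min( size, m_data.size() - m_position );
        std::memcpy( buffer, m_data.data() + m_position, n );
        m_position += n;
        return n;
    }

    std::size_t seek( std::size_t offset )
    {
        m_position = std::min( offset, m_data.size() );
        return m_position;
    }

    std::size_t tell() const { return m_position; }
    std::size_t size() const { return m_data.size(); }

private:
    std::string m_data;
    std::size_t m_position{ 0 };
};

// Adapts any reader with the four methods above to std::streambuf.
// bufferSize is assumed to be greater than zero.
template<typename Reader>
class ReaderStreamBuf : public std::streambuf
{
public:
    explicit ReaderStreamBuf( Reader& reader, std::size_t bufferSize = 4096 ) :
        m_reader( reader ), m_buffer( bufferSize )
    {
        // Start with an empty get area so the first read triggers underflow().
        setg( m_buffer.data(), m_buffer.data(), m_buffer.data() );
    }

protected:
    int_type underflow() override
    {
        if ( gptr() < egptr() ) {
            return traits_type::to_int_type( *gptr() );
        }
        const auto n = m_reader.read( m_buffer.data(), m_buffer.size() );
        if ( n == 0 ) {
            return traits_type::eof();
        }
        setg( m_buffer.data(), m_buffer.data(), m_buffer.data() + n );
        return traits_type::to_int_type( *gptr() );
    }

    pos_type seekoff( off_type offset, std::ios_base::seekdir direction,
                      std::ios_base::openmode which ) override
    {
        if ( ( which & std::ios_base::in ) == 0 ) {
            return pos_type( off_type( -1 ) );
        }
        // Logical position = reader position minus the bytes still buffered.
        const off_type current = static_cast<off_type>( m_reader.tell() ) - ( egptr() - gptr() );
        off_type target = offset;
        if ( direction == std::ios_base::cur ) {
            target += current;
        } else if ( direction == std::ios_base::end ) {
            target += static_cast<off_type>( m_reader.size() );
        }
        if ( target < 0 ) {
            return pos_type( off_type( -1 ) );
        }
        // Discard buffered data and reposition the underlying reader.
        setg( m_buffer.data(), m_buffer.data(), m_buffer.data() );
        const auto newPosition = m_reader.seek( static_cast<std::size_t>( target ) );
        return pos_type( static_cast<off_type>( newPosition ) );
    }

    pos_type seekpos( pos_type position, std::ios_base::openmode which ) override
    {
        return seekoff( off_type( position ), std::ios_base::beg, which );
    }

private:
    Reader& m_reader;
    std::vector<char> m_buffer;
};
```

Constructing `std::istream in( &buf )` from a `ReaderStreamBuf` then gives the usual `>>`, `std::getline`, `seekg`, and `tellg` operations. In this sketch every `tellg` call drops the buffer and re-seeks the reader, which keeps the code short at the cost of efficiency.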

In the short term, I probably will be busy finishing two other features and preparing for the presentation of pragzip at HPDC '23, for which I have uploaded the author's version here.

Vadiml1024 commented 1 year ago

Great paper... One remark: I would rephrase "Non-sequential access patterns are supported performantly" as "Non-sequential access patterns are supported efficiently".

mxmlnkn commented 1 year ago

Thanks for the feedback and sorry for getting off topic.

I have implemented this feature request in this commit: https://github.com/mxmlnkn/indexed_bzip2/commit/835293d8c935994ca144a827c4b3a5cd4650f336

The tests have comments at places where the std::istream interface is subtly different from the FileReader interface. For example, FileReader::seek automatically clears the EOF flag, while this is not the case for std::istream. Also, even when EOF has been encountered, FileReader::tell returns the file size, while std::istream::tellg returns -1. These are things I intentionally changed for the FileReader because I don't like the istream interface in these regards.
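The std::istream side of these differences can be reproduced with the standard library alone. The sketch below uses std::istringstream as a stand-in (it does not touch the rapidgzip classes) and assumes C++11 or later; `EofProbe` and `probeEofBehavior` are names made up for this example. It reads past the end so that both eofbit and failbit are set, and then records what tellg and seekg do before and after clear().

```cpp
#include <ios>
#include <sstream>

// Hypothetical helper struct collecting the observations.
struct EofProbe
{
    long long tellAtEof;
    bool seekWithoutClearWorked;
    long long tellAfterClear;
    char firstCharAfterRewind;
};

EofProbe probeEofBehavior()
{
    EofProbe result{};
    std::istringstream in( "abc" );

    // Request more bytes than available: sets eofbit and failbit.
    char buffer[8];
    in.read( buffer, sizeof( buffer ) );

    // With failbit set, tellg reports pos_type(-1) instead of the stream size.
    result.tellAtEof = static_cast<long long>( static_cast<std::streamoff>( in.tellg() ) );

    // seekg does not recover the stream on its own while failbit is set.
    in.seekg( 0 );
    result.seekWithoutClearWorked = !in.fail();

    // An explicit clear() is needed before seeking and reading again.
    in.clear();
    in.seekg( 0 );
    result.tellAfterClear = static_cast<long long>( static_cast<std::streamoff>( in.tellg() ) );
    in.get( result.firstCharAfterRewind );
    return result;
}
```

Code written against a std::istream wrapper therefore generally needs a clear() call between hitting EOF and seeking back.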

mxmlnkn commented 1 year ago

Feel free to reopen the issue if there are problems. I have closed it because it has been merged.