dsoprea / PyEasyArchive

A very intuitive and useful adapter to libarchive for universal archive access.
MIT License

Read failed (archive_read_data_block): (-30) while reading large archive by blocks #25

Open user2589 opened 7 years ago

user2589 commented 7 years ago

Download a large archive. The archive used in this example is a .7z archive from the StackExchange dataset. It is 10.8 GB compressed and contains a single 55 GB file.

Read the file by blocks:

>>> import libarchive.public as libarchive
>>> with libarchive.file_reader('../dataset/stackoverflow.com-Posts.7z') as archive:
...     for entry in archive:
...         if str(entry) != 'Posts.xml':
...             continue
...         for block in entry.get_blocks():
...             pass
... 
Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
  File "/usr/local/lib/python2.7/dist-packages/libarchive/adapters/archive_read.py", line 219, in get_blocks
    for block in _read_by_block(self.reader_res):
  File "/usr/local/lib/python2.7/dist-packages/libarchive/adapters/archive_read.py", line 210, in _read_by_block
    (r,))
ValueError: Read failed (archive_read_data_block): (-30)

user2589 commented 7 years ago

This looks like a libarchive issue, so I resubmitted the bug in their repository: https://github.com/libarchive/libarchive/issues/913. I guess this one can be closed.
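
For reference, the -30 in the error above matches the value of ARCHIVE_FATAL in libarchive's archive.h, which is consistent with the failure coming from libarchive itself rather than from the Python adapter. Below is a minimal sketch for decoding such codes from the error message. It is not part of PyEasyArchive: the `ARCHIVE_STATUS` table and the `status_from_message` helper are hypothetical names introduced only for illustration, and the parsing assumes the message ends with the code in parentheses, as in the traceback above.

```python
import re

# Status codes as defined in libarchive's archive.h.
ARCHIVE_STATUS = {
    1: 'ARCHIVE_EOF',
    0: 'ARCHIVE_OK',
    -10: 'ARCHIVE_RETRY',
    -20: 'ARCHIVE_WARN',
    -25: 'ARCHIVE_FAILED',
    -30: 'ARCHIVE_FATAL',
}


def status_from_message(message):
    """Hypothetical helper: parse a trailing '(<int>)' from an error message.

    Returns a (code, name) tuple, or None if the message does not end
    with a parenthesized integer.
    """
    match = re.search(r'\((-?\d+)\)\s*$', message)
    if match is None:
        return None
    code = int(match.group(1))
    return code, ARCHIVE_STATUS.get(code, 'UNKNOWN')


print(status_from_message('Read failed (archive_read_data_block): (-30)'))
```

With the message from the traceback, this yields the code -30 paired with the name ARCHIVE_FATAL, the status libarchive uses when no further operations on the archive are possible.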