Closed by d-m 8 years ago
I pushed a fix on the develop branch. If you can, could you verify that it is fixed? Thanks.
Thanks! This fixed things for my purposes. There is still an edge case if you define the GzipFile
object with a name like so:
...
file_object = gzip.GzipFile('test', fileobj=byte_stream)
...
If you name the file, you end up with:
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-19-f0659c79356d> in <module>()
----> 1 rec.content_block.payload.get_file().read() == warc_record.content_block.payload.get_file().read()
/usr/local/lib/python3.5/site-packages/warcat/model/binary.py in get_file(self, safe, spool_size)
124 gzip.GzipFile(self.filename))
125 else:
--> 126 file_obj = open(self.filename, 'rb')
127
128 util.file_cache.put(self.filename, file_obj)
FileNotFoundError: [Errno 2] No such file or directory: 'test'
Looks like this can be fixed by swapping the branches of this if/else statement, or by putting the in-memory file in the cache.
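For what it's worth, here is a rough sketch of the first idea (checking for an already-open file object before falling back to the name). FileRefSketch and its attributes are hypothetical stand-ins written for this comment, not warcat's actual classes, so treat it as an illustration of the ordering only:

```python
import gzip
import io

# A GzipFile given a name but backed by an in-memory stream: the name
# does not have to correspond to anything on disk.
byte_stream = io.BytesIO(gzip.compress(b"hello"))
file_object = gzip.GzipFile("test", fileobj=byte_stream)


class FileRefSketch:
    """Hypothetical stand-in for a file reference; not warcat's real class."""

    def __init__(self, file_obj=None, filename=None):
        self.file_obj = file_obj
        self.filename = filename

    def get_file(self):
        # Prefer the already-open object, so a name that does not exist
        # on disk is never passed to open().
        if self.file_obj is not None:
            self.file_obj.seek(0)
            return self.file_obj
        return open(self.filename, "rb")


ref = FileRefSketch(file_obj=file_object, filename=file_object.name)
data = ref.get_file().read()
```

With this ordering the name 'test' is kept for reference but never opened, so the read goes through the in-memory stream.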
Ok, thanks. I'm going to put that edge case as a separate issue.
The following:
results in an AttributeError in warcat.model.binary.BinaryFileRef:
The same error also occurs with the Payload.get_file method. This seems to be because the BinaryBlock and BlockWithPayload classes' load method passes the file object's name directly to set_file on lines 40, 83, and 96 of warcat/model/block.py; changing these lines to pass in the file object itself instead of its name seems to work.
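To illustrate the change I mean, here is a minimal sketch. BinaryRefSketch, its set_file, and the load helper are hypothetical names made up for this example and may not match warcat's real signatures; the only point is that the caller hands over the object rather than its name, which an in-memory stream such as io.BytesIO does not have:

```python
import io


class BinaryRefSketch:
    """Hypothetical illustration only; warcat's BinaryFileRef may differ."""

    def __init__(self):
        self.file_obj = None
        self.filename = None

    def set_file(self, file):
        # Accept either a path or an open file-like object.
        if isinstance(file, (str, bytes)):
            self.filename = file
        else:
            self.file_obj = file


def load(ref, file_obj):
    # Pass the object itself rather than file_obj.name, which an
    # in-memory stream does not provide.
    ref.set_file(file_obj)


stream = io.BytesIO(b"payload")
ref = BinaryRefSketch()
load(ref, stream)
```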