Closed vladfi1 closed 3 years ago
I think the error comes from these lines in the file_opener function:
if isinstance(f, (io.TextIOWrapper, io.BufferedWriter)):
    filename, mode = f.name, f.mode
    f.close()
    mode = mode.replace('b', '')
    h5f = h5.File(filename, mode)
Instead of passing the file-like object to h5py as is, or at least specifying the driver as 'fileobj':
h5f = h5.File(f, mode, driver='fileobj')
the function tries to take the file name and mode from the passed-in TextIOWrapper or BufferedWriter (any other file-like object is ignored), then closes the file and opens a new HDF5 file with that name and mode. Apart from that, only an already opened h5py.File object or a plain file path string is accepted; anything else raises an exception. Perhaps somebody could look into this when finalizing the next minor or major release. From 2.10 on, h5py can definitely handle file-like objects as long as they provide read, seek, tell and write methods (see h5py.File).
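The requirement listed above (read, seek, tell and write) can be checked by duck typing. The following is only a sketch: the helper name looks_like_binary_file is hypothetical and is not part of hickle or h5py, and the extra rejection of text streams is my own assumption, since HDF5 data is binary.

```python
import io

# Methods the comment above lists as required for h5py's file-like support.
REQUIRED_METHODS = ("read", "seek", "tell", "write")


def looks_like_binary_file(obj):
    """Return True if obj offers the methods of a seekable binary stream.

    Hypothetical helper for illustration only; not a hickle or h5py API.
    """
    # Text streams also have read/seek/tell/write, but they deal in str,
    # not bytes, so they are rejected explicitly (assumption of this sketch).
    if isinstance(obj, io.TextIOBase):
        return False
    return all(callable(getattr(obj, name, None)) for name in REQUIRED_METHODS)
```

With this check, io.BytesIO() passes, while io.StringIO() and a plain path string do not.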
Oh, I guess I missed this one.
I am not entirely sure why that does not work, as we do test for it, but I will have a look. However, as h5py was recently updated to 3.0.0, which brought a ton of changes (some of which are incompatible with hickle), I am planning to do a pass over the entire package to account for that. I will add this issue to that list, but it may take a while before it is fixed.
@1313e @telegraphic just in case it is of any interest or help to you, I wanted to let you know: being a bit bored while waiting for @telegraphic to decide on pull request #138, I tried a proof of concept for handling file and file-like objects as supported by h5py. The results of this trial and error can be found in the detached concept_memp_compact_expand branch of my hickle fork.
Yes, it is very duck-typed and Pythonic, but if you look at h5py.File, its init method, when a file or file-like object is passed, just checks for the existence of the 'read' and 'seek' attributes.
The same goes here: it would be included in my finalize-and-cleanup pull request after #138, and the upcoming ones for #139 and #145.
After looking into it, I realize that this is not an error.
hickle can solely be used to dump to HDF5 files. A BufferedWriter is not an HDF5 file, so hickle cannot dump to it.
Are there plans to support writing to in-memory bytes rather than files?
Not at the moment, no.
@vladfi1 just to chime in here: hickle is indeed designed specifically for dumping to HDF5 files, and uses h5py as its API, which doesn't support BufferedWriter. If you really wanted an HDF5 file in memory, you could try setting up a ramdisk? However, I think there are probably better solutions out there for in-memory data storage...
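A lighter-weight variant of the ramdisk idea is to write to a temporary file and read the bytes back. This is only a sketch under my own assumptions: dump_to_bytes and its write_fn callback are hypothetical names, not hickle APIs, and pointing directory at a RAM-backed location such as /dev/shm (available on most Linux systems) is optional.

```python
import os
import tempfile


def dump_to_bytes(write_fn, suffix=".h5", directory=None):
    """Call write_fn(path) on a temporary path and return the file's bytes.

    write_fn is a hypothetical callback that writes a file at the given
    path, for example: lambda path: hickle.dump(obj, path)
    """
    # The directory and everything in it is removed when the block exits.
    with tempfile.TemporaryDirectory(dir=directory) as tmpdir:
        path = os.path.join(tmpdir, "payload" + suffix)
        write_fn(path)
        with open(path, "rb") as src:
            return src.read()
```

Since hickle only ever sees a plain file path here, this avoids the file-object handling discussed in this thread altogether.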
@telegraphic @1313e not quite true. According to the documentation, h5py 2.10 and onward support any file-like object that can read and write binary data and is seekable, and io.BytesIO fulfills exactly that; the example can be found in the h5py manual. Thus the question is rather whether it is worth the effort to add all the checks needed to verify that a passed-in file-like object conforms to the requirements of h5py. If not, remove support for file-like objects and Python file handles from hickle and support only file names and h5py.File objects.
@vladfi1 why would you need the io.BufferedWriter? io.BytesIO is already an io.BufferedIOBase type object (see the Python io manual), just like io.BufferedWriter and io.BufferedReader are, and thus is already buffered. So simply replace io.BufferedWriter with h5py.File to make your example work.
import io
import h5py
import hickle
raw = io.BytesIO()
writer = h5py.File(raw, 'w')
hickle.dump(obj, writer)
and on read
writer.close()  # finish writing before the buffer is reopened
reader = h5py.File(raw, 'r')
obj = hickle.load(reader)
So you see, there is no need for io.BufferedWriter at all; in other words, h5py.File acts as the wrapping writer and reader.
@hernot It is still true what we are saying. It does not matter if h5py allows writing to other filetypes, hickle does not support it.
Yes, you are right, bad wording on my side. What I wanted to say is that if hickle does not support it, it should not allow file objects and file-like objects to be passed at all. The way it is done now is broken and goes against expectations when passing file objects, with the consequence that this will not remain the only issue related to strange or broken support of file and file-like objects. For example, one would expect this to work:
fid = open('/tmp/somefile.h5','w+b')
writer = io.BufferedWriter(fid)
hickle.dump(obj,writer)
fid.flush()
fid.seek(0)
somesocket.write(fid.read())
But that does not work, as hickle just takes the file name, closes the original file or file-like object, and replaces the underlying file on disk with a completely new file of the same name. The new file is owned by the process running hickle and gets default access rights, not necessarily the owner and rights of the original file. This does not make sense to me at all. Why should I first open a file that is never used? Or, even worse, when reading an already written hickle file, the file is deleted. When I open a file beforehand, I want hickle to place the HDF5 content in exactly that file and nowhere else, and wrapping it in 'io.BufferedWriter' or 'io.TextIOWrapper' should not make any difference. So, and that is what I meant, either decide to properly support file and file-like objects, e.g. from hickle >= 5.0 on, and make the effort to fix it by then, or decide not to support them at all beyond indirect support through passed-in h5py.File objects. In the latter case, remove the support completely and allow only filename strings and h5py.File objects to be passed.
I am trying to dump to in-memory bytes so that I can then compress these bytes with zlib before writing to disk.
The last line raises AttributeError: '_io.BytesIO' object has no attribute 'name'.
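For the use case described here (in-memory bytes, then zlib, then disk), one possible route is to go through h5py directly, which accepts Python file-like objects such as io.BytesIO in recent versions. The following is a sketch, not hickle's API: hdf5_to_bytes, write_compressed, read_compressed and the fill callback are hypothetical names. Whether hickle.dump(obj, h5f) works inside fill depends on the hickle version and is an untested assumption.

```python
import io
import zlib


def hdf5_to_bytes(fill):
    """Build an HDF5 file in memory and return its raw bytes.

    fill is a hypothetical callback that receives the open h5py.File.
    """
    import h5py  # imported lazily; file-like support needs a recent h5py

    buffer = io.BytesIO()
    with h5py.File(buffer, "w") as h5f:
        fill(h5f)
    # The file is closed at this point, so the buffer holds the full image.
    return buffer.getvalue()


def write_compressed(payload, path, level=6):
    """Compress payload with zlib and write the result to path."""
    with open(path, "wb") as out:
        out.write(zlib.compress(payload, level))


def read_compressed(path):
    """Read a file written by write_compressed and return the original bytes."""
    with open(path, "rb") as src:
        return zlib.decompress(src.read())
```

A call could look like write_compressed(hdf5_to_bytes(lambda f: f.create_dataset("x", data=[1, 2, 3])), "data.h5.z"), where the file name is just an example.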