jreadey opened this issue 9 years ago
I'm tagging this as an "enhancement" rather than a bug since it was a known limitation of the design.
It may be worth investigating an alternative, streaming JSON parser such as ijson: https://pypi.python.org/pypi/ijson/.
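For illustration, here is a minimal sketch of how a streaming parser could walk a large hdf5-json document without ever calling json.load() on the whole thing. It assumes the document's top-level "datasets" object keyed by UUID (as in the hdf5-json schema); the function name and file path are made up for the example:

```python
import ijson

def iter_datasets(json_path):
    """Yield (uuid, dataset_dict) pairs from an hdf5-json file.

    Only one dataset entry is materialised in memory at a time; the
    rest of the document is parsed incrementally.
    """
    with open(json_path, "rb") as f:
        # kvitems streams the key/value pairs of the top-level "datasets"
        # object one at a time (available in ijson >= 2.6).
        for uuid, dataset in ijson.kvitems(f, "datasets"):
            yield uuid, dataset

for uuid, dataset in iter_datasets("big_file.json"):
    print(uuid, dataset.get("shape"))
```

Note that each dataset's "value" entry is still built as a regular Python object, so for datasets with very large inline values the lower-level event stream (ijson.parse) would be needed instead.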
Would it make more sense to tackle this using a native-C implementation of the conversion tools?
Any work towards this?
Sort of... In HSDS we use what is basically the hdf5-json schema for metadata, but chunk data is stored as blobs. See: https://github.com/HDFGroup/hsds/blob/master/docs/design/obj_store_schema/obj_store_schema_v2.md for a description. This works pretty well - we've used it for "files" as large as 50 TB. "files" is in quotes since what you get at the end is a large collection of files in a tree structure.
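As a rough illustration of that split (not the actual HSDS schema; see the linked design document for that), metadata can stay in a small JSON file while each chunk goes to its own binary blob. The layout and helper below are hypothetical, and iter_chunks() requires a chunked dataset and a recent h5py:

```python
import json
import os
import h5py

def export_dataset(dset: h5py.Dataset, out_dir: str) -> None:
    """Illustrative only: write dataset metadata to a small JSON file and
    each chunk to a separate raw blob, so the full dataset is never held
    in memory and no single output file grows with the data size."""
    os.makedirs(out_dir, exist_ok=True)
    meta = {
        "shape": list(dset.shape),
        "dtype": str(dset.dtype),
        "chunks": list(dset.chunks) if dset.chunks else None,
    }
    with open(os.path.join(out_dir, "dataset.json"), "w") as f:
        json.dump(meta, f, indent=2)

    if dset.chunks is None:
        return  # contiguous datasets would need a different strategy

    for selection in dset.iter_chunks():
        # Name each blob by its chunk coordinates, e.g. "chunk_3_0" for a 2-D dataset.
        coords = "_".join(str(s.start // c) for s, c in zip(selection, dset.chunks))
        with open(os.path.join(out_dir, f"chunk_{coords}"), "wb") as f:
            f.write(dset[selection].tobytes())
```

The point of the metadata-vs-chunk split is that the JSON stays small no matter how much data there is, which is what makes multi-TB "files" practical.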
This was done to support the HDF service, but the same approach could be used outside the server.
What type of problem are you looking to solve?
h5tojson.py and jsontoh5.py can't convert files whose size is comparable to the amount of physical memory on the machine the converter is running on.
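For the jsontoh5 direction, a streaming parser plus batched writes would keep memory bounded. The sketch below is an assumption-laden example, not the tools' actual code: it assumes a 1-D dataset whose data is stored under datasets.&lt;uuid&gt;.value as a flat JSON array (per the hdf5-json layout), and the batch size and function name are arbitrary:

```python
import itertools
import ijson
import numpy as np

def stream_values_into_dataset(json_path, dset, uuid, batch_size=100_000):
    """Copy the "value" array of one dataset from an hdf5-json file into an
    existing (already shaped and typed) h5py dataset in fixed-size batches,
    so the full value list is never materialised in memory."""
    prefix = f"datasets.{uuid}.value.item"  # one item per array element
    with open(json_path, "rb") as f:
        values = ijson.items(f, prefix)
        offset = 0
        while True:
            batch = list(itertools.islice(values, batch_size))
            if not batch:
                break
            # ijson yields Decimal for floats; coerce to the dataset's dtype.
            dset[offset:offset + len(batch)] = np.asarray(batch, dtype=dset.dtype)
            offset += len(batch)
```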