Hi flomlo,
Sorry for our late response. We have implemented options to compress the network h5 files and spikes h5 files (defaulting to gzip level 4). If you pull the most recent code from the 'develop' branch, it'll be available.
For simple network files that do not contain individual weights, we do indeed see a factor of 5-6 compression, and in our environment it does not seem to impact the execution time.
Thank you for a great suggestion.
Oh lovely! I'll give it a try soonishly and will report back / reopen the issue if there is an issue (which I think is quite unlikely).
Thanks for implementing it - it will save approx. a terabyte on our side :)
Yes. Please let us know if there are any issues. Glad to hear that it'll be helpful. We appreciate it. By the way, if you already have many network files, h5repack can compress existing h5 files, and the compressed files can be used directly for simulation (even with an old BMTK) as long as they are compressed with gzip.
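For anyone who would rather stay in Python, here is a rough sketch of the same idea using h5py instead of the h5repack tool itself. It copies every group and dataset into a new file and turns on gzip for the non-empty array datasets. The function name recompress and the tiny demo file layout are made up for illustration and are not part of BMTK. It also loads each dataset fully into memory and does not preserve links, so treat it as a starting point only.

```python
import os
import tempfile

import h5py
import numpy as np


def recompress(src_path, dst_path, level=4):
    """Copy src_path to dst_path, gzip-compressing non-empty array datasets.

    Hypothetical helper, not part of BMTK or h5py.
    """
    with h5py.File(src_path, "r") as src, h5py.File(dst_path, "w") as dst:
        dst.attrs.update(src.attrs)  # file-level attributes

        def copy_item(name, obj):
            if isinstance(obj, h5py.Group):
                new = dst.require_group(name)
            elif obj.shape and obj.size > 0:
                # Reads the whole dataset into memory; fine for a sketch.
                new = dst.create_dataset(
                    name, data=obj[...],
                    compression="gzip", compression_opts=level,
                )
            else:
                # Scalar or empty datasets cannot usefully be compressed.
                new = dst.create_dataset(name, data=obj[()])
            new.attrs.update(obj.attrs)

        # visititems visits a group before its members, so parents exist.
        src.visititems(copy_item)


# Small demo with a made-up layout.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "network.h5")
    dst_path = os.path.join(tmp, "network_gzip.h5")
    ids = np.repeat(np.arange(1000, dtype=np.uint64), 100)

    with h5py.File(src_path, "w") as f:
        ds = f.create_dataset("edges/target_node_id", data=ids)
        ds.attrs["note"] = "demo"
        f.create_dataset("edges/count", data=len(ids))

    recompress(src_path, dst_path)

    with h5py.File(dst_path, "r") as f:
        copied = f["edges/target_node_id"][:]
        copied_filter = f["edges/target_node_id"].compression
        copied_note = f["edges/target_node_id"].attrs["note"]
        copied_count = int(f["edges/count"][()])

    src_size = os.path.getsize(src_path)
    dst_size = os.path.getsize(dst_path)

print(f"{src_size} bytes -> {dst_size} bytes")
```

The result is read back with the same h5py calls as the uncompressed file, which matches the point above that gzip-compressed files need no changes on the reading side.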
I'll close this issue for now, but feel free to reopen if there is more to discuss.
Hi,
the saved edges produced by save_edges tend to consume quite a bit of memory. As an example, the mouse_v1 reconstruction with a fraction=0.50 parameter is 742MB. This quickly becomes a problem (or at least a nuisance) when analyzing bigger networks or a few of them.
This could be easily reduced by a factor of ~10 by enabling gzip compression on the datasets inside of the hdf5 file. As gzip comes with hdf5, this does not introduce an additional requirement. The hdf5 implementations known to me (for Rust and for Python) accept gzip-compressed datasets without any further adaptations to the code.
Are there any principal reasons against using gzip-compressed datasets in the .h5 file?
If not, I'd volunteer to supply a patch (once I've figured out what to modify. Where the fuck does _save_edges in https://github.com/AllenInstitute/bmtk/blob/2078a4134dba74a89bdb4edc6cf224a65290d782/bmtk/builder/network_adaptors/network.py#L658 lead to?).
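To make the proposal concrete, here is a minimal sketch of what gzip-compressed datasets look like with h5py. It is not BMTK code: the helper write_edges, the group and dataset names, and the synthetic ids are all assumptions for illustration, and the compression ratio on real edge files will differ.

```python
import os
import tempfile

import h5py
import numpy as np


def write_edges(path, source_ids, target_ids, compression=None, level=None):
    """Write two edge id datasets, optionally compressed (hypothetical layout)."""
    with h5py.File(path, "w") as f:
        grp = f.create_group("edges")
        for name, data in (("source_node_id", source_ids),
                           ("target_node_id", target_ids)):
            # compression="gzip" enables HDF5's built-in deflate filter;
            # h5py picks a chunk shape automatically when a filter is set.
            grp.create_dataset(name, data=data,
                               compression=compression,
                               compression_opts=level)


# Synthetic edge ids, only meant to resemble an edge table.
rng = np.random.default_rng(0)
n_edges = 200_000
source_ids = np.sort(rng.integers(0, 5_000, size=n_edges)).astype(np.uint64)
target_ids = rng.integers(0, 5_000, size=n_edges).astype(np.uint64)

with tempfile.TemporaryDirectory() as tmp:
    raw_path = os.path.join(tmp, "edges_raw.h5")
    gz_path = os.path.join(tmp, "edges_gzip.h5")

    write_edges(raw_path, source_ids, target_ids)
    write_edges(gz_path, source_ids, target_ids, compression="gzip", level=4)

    raw_size = os.path.getsize(raw_path)
    gz_size = os.path.getsize(gz_path)

    # Reading needs no extra arguments: decompression is transparent.
    with h5py.File(gz_path, "r") as f:
        restored = f["edges/source_node_id"][:]
        used_filter = f["edges/source_node_id"].compression

print(f"uncompressed: {raw_size} bytes, gzip level 4: {gz_size} bytes")
```

The only change on the writing side is the two extra keyword arguments to create_dataset, which is why a patch along these lines should be small once the right call site is found.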