opencobra / cobrapy

COBRApy is a package for constraint-based modeling of metabolic networks.
http://opencobra.github.io/cobrapy/
GNU General Public License v2.0

Support compression on file import and export #812

Open matthiaskoenig opened 5 years ago

matthiaskoenig commented 5 years ago

In a discussion with @Midnighter it came up that there should be generic support for compressed files across the various io modules, i.e., cobrapy should support reading and writing compressed files in all supported formats (JSON, MAT, SBML, YAML). To avoid code duplication, there should be a single implementation of the compression support.

Compression matters most for genome-scale models, whose uncompressed files are very large.

Just as a note: The new SBML parser supports reading compressed files from paths, but not yet from file handles. But support for compression on writing is missing.

Midnighter commented 5 years ago

> Just as a note: The new SBML parser supports reading compressed files from paths, but not yet from file handles. But support for compression on writing is missing.

I suppose libsbml doesn't support writing to a Python file stream, i.e., it can only either take a filename or create a string?

matthiaskoenig commented 5 years ago

Unfortunately not. libsbml only supports reading or writing SBML to strings or paths. So support for file handles and file streams will be implemented in Python (via temporary files and by reading the SBML strings via read()).
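
A minimal sketch of the string-based half of that workaround, assuming a hypothetical helper `read_sbml_text` (it is not part of cobrapy): the handle's content is fetched via `read()`, decompressed if it starts with a gzip or bz2 magic number, and returned as a string that could then be handed to libsbml's string reader.

```python
import bz2
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream
BZ2_MAGIC = b"BZh"        # first three bytes of every bz2 stream


def read_sbml_text(handle) -> str:
    """Return the SBML document behind an open file handle as a string.

    Hypothetical helper for illustration; not cobrapy's actual API.
    """
    data = handle.read()
    if isinstance(data, str):
        # Text-mode handle: nothing to decompress.
        return data
    if data.startswith(GZIP_MAGIC):
        data = gzip.decompress(data)
    elif data.startswith(BZ2_MAGIC):
        data = bz2.decompress(data)
    return data.decode("utf-8")


sbml = '<?xml version="1.0" encoding="UTF-8"?><sbml level="3" version="1"/>'
handle = io.BytesIO(gzip.compress(sbml.encode("utf-8")))
text = read_sbml_text(handle)
# `text` could now be passed on, e.g. to libsbml.readSBMLFromString(text).
```

Sniffing the magic bytes rather than a file name is a design choice made here because a bare handle may not carry a usable name.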

akaviaLab commented 2 years ago

Is this still valid? I'm asking because cobra/sbml.py seems to work well with gz and bz2, for both reading and writing.

matthiaskoenig commented 2 years ago

I implemented the compression support via libsbml for SBML. But there is no compression support for the other input formats.

akaviaLab commented 2 years ago

Hi, would you like me to use the Python modules gzip and bz2 to implement reading and writing of compressed files for the other formats? It would be slightly different from libsbml, which handles compression internally, but then the other formats could use them as well. I'm not sure it makes sense for MATLAB files, but I might implement it for completeness.
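
As a rough illustration of what the gzip/bz2 route could look like for JSON (a sketch only; the toy dictionary is a stand-in and not cobrapy's real JSON schema), both modules can open files in text mode so that `json.dump` and `json.load` work unchanged:

```python
import bz2
import gzip
import json
import tempfile
from pathlib import Path

# Toy stand-in for a serialized model; not cobrapy's real JSON schema.
model = {
    "id": "toy",
    "reactions": [{"id": "R1", "lower_bound": 0.0, "upper_bound": 1000.0}],
}

with tempfile.TemporaryDirectory() as tmp:
    gz_path = Path(tmp) / "model.json.gz"
    bz2_path = Path(tmp) / "model.json.bz2"

    # Text mode ("wt"/"rt") lets json.dump/json.load work unchanged.
    with gzip.open(gz_path, "wt", encoding="utf-8") as fh:
        json.dump(model, fh)
    with bz2.open(bz2_path, "wt", encoding="utf-8") as fh:
        json.dump(model, fh)

    with gzip.open(gz_path, "rt", encoding="utf-8") as fh:
        from_gz = json.load(fh)
    with bz2.open(bz2_path, "rt", encoding="utf-8") as fh:
        from_bz2 = json.load(fh)
```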

matthiaskoenig commented 2 years ago

Yes, this was basically the idea. JSON in particular can be compressed very efficiently.
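
One way to check that claim on your own data is to compare sizes before and after `gzip.compress`. The payload below is synthetic and only loosely shaped like a list of reactions, so the ratio it prints says nothing about real models:

```python
import gzip
import json

# Synthetic, repetitive payload; an assumption for illustration only.
payload = {
    "reactions": [
        {
            "id": f"R_{i}",
            "lower_bound": -1000.0,
            "upper_bound": 1000.0,
            "metabolites": {f"M_{i}": -1.0, f"M_{i + 1}": 1.0},
        }
        for i in range(5000)
    ]
}

raw = json.dumps(payload).encode("utf-8")
packed = gzip.compress(raw)
ratio = len(raw) / len(packed)
print(f"raw: {len(raw)} bytes, gzip: {len(packed)} bytes, ratio: {ratio:.1f}x")
```

JSON repeats the same keys for every record, which is the kind of redundancy that general-purpose compressors exploit.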

akaviaLab commented 2 years ago

Okay, if you can review and hopefully merge #1245, which allows I/O to use Paths, I can build on that to get all formats to handle compressed files. Probably via a helper function or file in the io directory that all the formats call.
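
A sketch of what such a shared helper might look like, under the assumption that compression is chosen from the path suffix. The names `open_model_file`, `save_json`, and `load_json` are hypothetical and do not refer to existing cobrapy functions:

```python
import bz2
import gzip
import json
import tempfile
from pathlib import Path
from typing import IO, Union

# Map file suffixes to the opener that understands them.
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def open_model_file(path: Union[str, Path], mode: str = "r") -> IO[str]:
    """Open `path` in text mode, compressed if the suffix says so.

    Hypothetical helper; `mode` is expected to be "r" or "w".
    """
    path = Path(path)
    opener = _OPENERS.get(path.suffix.lower(), open)
    # gzip/bz2 default to binary mode, so request text mode explicitly.
    return opener(path, mode + "t", encoding="utf-8")


def save_json(obj: dict, path: Union[str, Path]) -> None:
    """Format-specific writer that delegates file opening to the helper."""
    with open_model_file(path, "w") as fh:
        json.dump(obj, fh)


def load_json(path: Union[str, Path]) -> dict:
    """Format-specific reader that delegates file opening to the helper."""
    with open_model_file(path, "r") as fh:
        return json.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    results = {}
    for name in ("m.json", "m.json.gz", "m.json.bz2"):
        target = Path(tmp) / name
        save_json({"id": "toy"}, target)
        results[name] = load_json(target)
```

A YAML reader or writer could call the same helper. MAT is a binary format, so it would presumably need a binary-mode variant.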
