SBRG / bigg_models

The BiGG Models website server
http://bigg.ucsd.edu

zip for compressed sbml format #107

Closed aebrahim closed 9 years ago

aebrahim commented 9 years ago

Hey guys,

I'm just curious why zip was chosen as the compression format instead of something like gz?

The reason I bring this up is that, as of opencobra/cobrapy@128be0c, cobrapy can read and write .xml.gz and .xml.bz2 directly, but I'm not sure I can implement this easily for zip (because a zip file can have an arbitrary internal structure).

I think MATLAB also has native gzip support, so this could likely be implemented in the COBRA Toolbox as well.
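
To illustrate the difference, here is a minimal sketch using only the Python standard library. It is not cobrapy's actual implementation; the helper names `read_gz` and `read_zip`, the member names, and the placeholder document are made up for illustration. A .gz file wraps exactly one byte stream, while a zip archive has to be inspected first to decide which member is the model.

```python
import gzip
import io
import zipfile


def read_gz(data: bytes) -> bytes:
    """A gzip file is a single compressed stream: just decompress it."""
    with gzip.open(io.BytesIO(data), "rb") as handle:
        return handle.read()


def read_zip(data: bytes) -> bytes:
    """A zip archive can hold any number of members, so the reader must
    decide which one is the model. This sketch accepts only archives with
    exactly one .xml member (an assumption, not a rule of the zip format)."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        xml_members = [n for n in archive.namelist() if n.endswith(".xml")]
        if len(xml_members) != 1:
            raise ValueError(f"expected one .xml member, found {xml_members}")
        return archive.read(xml_members[0])


# Placeholder content standing in for a real SBML document.
sbml = b"<sbml><model id='example'/></sbml>"

gz_bytes = gzip.compress(sbml)

# Build a zip archive with a nested path and an unrelated extra file.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("models/example.xml", sbml)
    archive.writestr("README.txt", "extra file")
zip_bytes = zip_buffer.getvalue()
```

With gz the reader needs no policy at all; with zip it needs a rule such as "exactly one .xml member", and any archive that breaks the rule has to be rejected or handled some other way.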

aebrahim commented 9 years ago

Oh, I forgot to mention... this is for SBML 3 + fbc2

draeger commented 9 years ago

Hi,

No problem, I can change ModelPolisher to write those other compression formats. ZIP was chosen because it is probably the most popular format, i.e., the "default" compression method that people would use, I think.

Please let me know if you want me to make this change.

Cheers, Andreas

aebrahim commented 9 years ago

If you want to keep zip files, that's no concern. It would just be easiest if the SBML3+fbc models were also available for download as either plaintext or gz files.

nel3 commented 9 years ago

I agree with Ali. Why not use gz instead?

draeger commented 9 years ago

Ok, I am changing it today.

matthiaskoenig commented 9 years ago

Is it possible to also provide the uncompressed polished files in addition to the gz (reported as issue #104)?

draeger commented 9 years ago

Hi Ali,

A quick test showed that gz has a much worse compression ratio than zip. It is still OK, but not as effective.

Cheers, Andreas

draeger commented 9 years ago

Hi Matthias,

We could do that, but the purpose of these compressed files is to save traffic for users and us. These uncompressed files can be insanely large. I think it is much more effective to work with compressed files.

Thanks

nel3 commented 9 years ago

While gz is not as effective at compression, it has the advantage that some tools can use the files as is, as Ali mentioned. Furthermore, I would expect that even the most heavily used models will be downloaded only a few thousand times, so in the grand scheme of things, a slight loss in compression won't make a big difference. But those are just my thoughts.

aebrahim commented 9 years ago

Yeah. I agree with Nate. We aren't going to be using much bandwidth at all.

Also, if you use gzip --best, the file sizes are comparable (268k vs 267k for iAF1260).
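
For anyone who wants to check this kind of comparison themselves, here is a rough sketch on synthetic XML-like data (not the iAF1260 numbers above; `compare_sizes`, the member name, and the sample payload are made up for illustration). Both formats normally use DEFLATE, so at the same compression level the difference should come down mostly to container overhead.

```python
import gzip
import io
import zipfile


def compare_sizes(data: bytes, member_name: str = "model.xml") -> dict:
    """Return raw and compressed sizes in bytes for gzip and zip at level 9."""
    # Level 9 is the setting that `gzip --best` selects on the command line.
    gz_size = len(gzip.compress(data, compresslevel=9))

    buffer = io.BytesIO()
    with zipfile.ZipFile(
        buffer, "w", zipfile.ZIP_DEFLATED, compresslevel=9
    ) as archive:
        archive.writestr(member_name, data)
    zip_size = len(buffer.getvalue())

    return {"raw": len(data), "gz": gz_size, "zip": zip_size}


# Synthetic, repetitive XML-like payload; a real SBML file will differ.
sample = b"".join(
    b'<species id="M_%d" compartment="c"/>\n' % i for i in range(20000)
)
sizes = compare_sizes(sample)
print(sizes)
```

The exact numbers depend on the input and on the default level each tool uses, which may explain why a quick test with default settings showed a larger gap.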

zakandrewking commented 9 years ago

I think providing the uncompressed and compressed files cannot hurt (hence #104).

None of the uncompressed files are more than 15 MB, so it's pretty reasonable to serve them to the occasional user who wants the slower download. We should definitely put in file sizes next to the download links.
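
For the file sizes, something along these lines could work. This is only a sketch, not code from the BiGG Models server; `human_size` and `link_label` are hypothetical helpers, and the units are 1024-based.

```python
def human_size(num_bytes: int) -> str:
    """Format a byte count for display, using 1024-based units."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        # Stop at the first unit that fits, or at GB as the largest unit.
        if size < 1024 or unit == "GB":
            if unit == "B":
                return f"{size:.0f} {unit}"
            return f"{size:.1f} {unit}"
        size /= 1024


def link_label(filename: str, num_bytes: int) -> str:
    """Build the text for a download link, e.g. 'model.xml.gz (1.5 KB)'."""
    return f"{filename} ({human_size(num_bytes)})"
```

In practice the byte count would come from the file on disk (for example via `os.path.getsize`), so the label stays correct when a model is regenerated.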