h2020-westlife-eu / wp6-repository

https://h2020-westlife-eu.github.io/wp6-repository/
MIT License

create zip/tgz on special dataset's endpoint #35

Open TomasKulhanek opened 6 years ago

TomasKulhanek commented 6 years ago

I’d like to add some ideas about the zip/tgz of the repository's RAW data. I propose that we consider streaming the data instead of building the ZIP/TGZ archives in batch, so that a zip/tgz is made on demand. Reasons to do that:

  1. The user can select specific folders/files to be included in the zip stream.
  2. There is no need to store an extremely big zip on the server, and no need to wait until a big archive is built.
  3. If you need to save space on the repository storage, compression can be configured transparently at the filesystem level (there are tools like BTRFS), so there is no need to implement it in the D6.2 code base.
  4. Streaming is simple: I made a prototype, a CGI script that compresses the user’s directory on demand and sends the stream directly to the HTTP session channel.
  5. Such a stream can be redirected directly to another service: the virtual folder storage.
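
To illustrate points 1, 2 and 4, here is a minimal sketch of on-demand streaming in Python. It is not the prototype mentioned above, whose code is not shown here; the function names `stream_tgz` and `cgi_main`, the `dataset.tgz` file name and the way the selection is passed in are all assumptions made for the example. It relies on the standard `tarfile` stream mode `"w|gz"`, which writes sequentially to a non-seekable output, so no archive is ever stored on the server.

```python
import os
import sys
import tarfile


def stream_tgz(base_dir, selected, out):
    """Write a gzip-compressed tar of the selected paths to `out`.

    base_dir: dataset directory on the server (hypothetical layout).
    selected: relative paths of the folders/files chosen by the user.
    out:      any binary writable object, e.g. the HTTP response stream.
    """
    base = os.path.realpath(base_dir)
    # "w|gz" is tarfile's stream mode: blocks are written sequentially and
    # the output never needs to be seekable or stored on disk.
    with tarfile.open(fileobj=out, mode="w|gz") as archive:
        for rel in selected:
            full = os.path.realpath(os.path.join(base, rel))
            # Refuse selections that escape the dataset directory.
            if os.path.commonpath([base, full]) != base:
                raise ValueError(f"path outside dataset: {rel}")
            # Directories are added recursively under their relative name.
            archive.add(full, arcname=rel)


def cgi_main(base_dir, selected):
    """Hypothetical CGI entry point: send headers, then the archive stream."""
    out = sys.stdout.buffer
    out.write(b"Content-Type: application/gzip\r\n")
    out.write(b'Content-Disposition: attachment; filename="dataset.tgz"\r\n\r\n')
    out.flush()
    stream_tgz(base_dir, selected, out)
```

Because `stream_tgz` only needs a writable binary object, the same function could in principle write to a connection to another service instead of the HTTP response, which is the idea in point 5.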