carlos-gg / VIP_extras

Datacubes, Jupyter tutorials and other materials related to VIP (https://github.com/vortex-exoplanet/VIP)

IDEA: move hosting of data cubes to bintray.com #7

Open r4lv opened 6 years ago

r4lv commented 6 years ago

problem

Git is not made for storing large binary files: every time a binary file changes, the entire file is stored again. Running git clone downloads the entire history, and with it every version of every binary file.
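To illustrate the point, here is a minimal sketch of how git identifies file contents in its default SHA-1 object format: the object ID is a hash over the complete contents, so changing a single byte produces a completely new object. The 1 MiB zero-filled buffer is only a stand-in for a real cube, and `git_blob_id` is a name made up for this example. Git may later delta-compress objects in packfiles, but already-compressed binary formats tend to delta poorly.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the ID git gives to file contents (SHA-1 object format)."""
    # Git hashes a small header ("blob <size>\0") followed by the full contents.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


# Stand-in for a data cube: 1 MiB of zero bytes (hypothetical content).
cube_v1 = bytes(1024 * 1024)
# The "same" cube with only its last byte changed.
cube_v2 = cube_v1[:-1] + b"\x01"

id_v1 = git_blob_id(cube_v1)
id_v2 = git_blob_id(cube_v2)

# The two IDs differ, so git treats the second version as a separate object.
print(id_v1)
print(id_v2)
print("stored as separate objects:", id_v1 != id_v2)
```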

Small files that never change, like the current files in this repository, are an acceptable overhead. But I think VIP_extras will grow over time: the 4D IFS cube I have ready, for example, weighs 22 MB (cropped).

alternatives

git-lfs

git-lfs is the "large file storage" extension for git, developed (and supported) by GitHub to address exactly that problem. Once git-lfs is set up for a repository (e.g. "track every .fits and .npz file"), one can use the regular git commands as before. Under the hood, the repository stores only a small reference (pointer) to each large file, while git-lfs uploads the actual file contents to a separate storage server.
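As a rough illustration of what "just their reference" means, the sketch below builds the text of a git-lfs pointer file (version 1 of the pointer format: a version line, a SHA-256 object ID and the size in bytes). The function name `lfs_pointer` and the zero-filled 22 MiB buffer are assumptions made for this example. In practice git-lfs writes these pointers itself, so this is not something one would do by hand.

```python
import hashlib


def lfs_pointer(data: bytes) -> str:
    """Build the text of a git-lfs v1 pointer file for the given contents."""
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )


# Stand-in for a 22 MiB data cube (hypothetical content, all zero bytes).
cube = bytes(22 * 1024 * 1024)
pointer = lfs_pointer(cube)

print(pointer)
# The pointer is what the repository stores in place of the cube itself.
print(f"cube: {len(cube)} bytes, pointer: {len(pointer)} bytes")
```

The pointer stays about the same size no matter how large the tracked file is, which is why the repository history remains small.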

advantages

disadvantages

bintray

advantages

disadvantages

demo

I created a bintray project for VIP, and uploaded the IFS cube for testing.

Take a look at the project site: https://bintray.com/r4lv/vip/data-cubes

Using the files in Python would look like this:

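import vip
from astropy.utils.data import download_file

# download the cube to a local temporary file and get its path
fn = download_file("https://dl.bintray.com/r4lv/vip/IFS_HD64568.vip.npz")
# load the downloaded file as a VIP dataset
dataset = vip.HCIDataset.load(fn)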