Starfish-develop / Starfish

Tools for Flexible Spectroscopic Inference
https://starfish.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Bloated #115

Closed by mileslucas 5 years ago

mileslucas commented 5 years ago

I just did a fresh clone of the repo and it is ~760 MB, which is huge! Part of the problem is that git keeps every version of every file, even files whose history we don't care about, like data files. I quickly went through and got the size of each folder:

attic - 54MB
data - 6.7MB
docs - 860KB
filters - 38MB
notebooks - 93MB
plots - 58MB
scripts - 248KB
Starfish - 272KB

This is on the master branch. I definitely think the attic, notebooks, and plots need to be offloaded somehow. The data isn't actually taking up much space, but I still think that in principle it shouldn't be in git. Removing attic, notebooks, and plots would free ~200MB (54 + 93 + 58 = 205MB). This isn't urgent, but it is important to consider for long-term maintenance, since the repo will only grow with more commits and more branches.
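A per-folder breakdown like the one above could be gathered with a short script. The following is only a sketch, not what was actually run for this issue: the function names `dir_size_bytes` and `top_level_sizes` are made up for illustration. It measures the checked-out files only, so the history stored under `.git` is deliberately skipped.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files below `path` (symlinked files are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def top_level_sizes(repo_root):
    """Map each top-level folder of `repo_root` (except .git) to its size in bytes."""
    sizes = {}
    for entry in sorted(os.listdir(repo_root)):
        full = os.path.join(repo_root, entry)
        if os.path.isdir(full) and entry != ".git":
            sizes[entry] = dir_size_bytes(full)
    return sizes
```

Calling `top_level_sizes(".")` from the root of a clone returns a dictionary that can be sorted by value to find the largest folders.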

iancze commented 5 years ago

I absolutely agree! I noticed this when I was creating a tarball of the repo for archiving purposes. Since that is now complete, we can remove these. If you happen to know the right git-fu to do this (for the history too), then by all means remove attic, notebooks, and plots.

mileslucas commented 5 years ago

Okay, just to make sure the implications are understood: I can remove all of those files forever from the git repo, but that means all of those files are forever gone from the git repo :) They will only be accessible via the release-pinned tarballs and wherever you have them. I think that's fine, but I just want to double-check before irreversibly removing them.
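For reference, one way to purge folders from every commit is `git filter-branch` with an index filter. The thread does not say which tool was actually used, so the sketch below is an assumption, and the helper name `build_purge_command` is hypothetical. It only builds and prints the command; nothing is executed, because the rewrite is irreversible once pushed.

```python
import shlex


def build_purge_command(paths):
    """Return the argv list for a `git filter-branch` call that drops `paths` from all history."""
    if not paths:
        raise ValueError("need at least one path to purge")
    # The index filter runs once per commit and removes the paths from the index only.
    index_filter = "git rm -r --cached --ignore-unmatch " + " ".join(
        shlex.quote(p) for p in paths
    )
    return [
        "git", "filter-branch", "--force",
        "--index-filter", index_filter,
        "--prune-empty",              # drop commits that become empty
        "--tag-name-filter", "cat",   # rewrite tags to point at the new commits
        "--", "--all",                # apply to every branch and tag
    ]


# Folder names taken from the discussion above.
cmd = build_purge_command(["attic", "notebooks", "plots"])
print(shlex.join(cmd))  # shlex.join requires Python 3.8+
```

After a rewrite like this, the branches have to be force-pushed and existing clones re-cloned before the space is actually reclaimed. The git documentation itself now recommends the separate `git filter-repo` tool over `filter-branch` for this kind of job.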

iancze commented 5 years ago

Thanks for checking. Indeed, I think we're ready to delete. I have them safely backed up on a hard disk, and also uploaded to a separate Zenodo record in case we need them (though I doubt we will): https://zenodo.org/record/2376426