GPlates / gplately

GPlately is a Python package to interrogate tectonic plate reconstructions.
https://gplates.github.io/gplately/
GNU General Public License v2.0

Oversized github repository #97

Open michaelchin opened 1 year ago

michaelchin commented 1 year ago

The size of this repository is 914M as of 13 Jun 2023. Writing this problem down lest I forget.

du -sh -- * .[^.]* | sort -h

Screenshot 2023-06-13 at 6 05 30 pm

To skip the history, use a shallow clone:

git clone --depth 1 --branch master https://github.com/GPlates/gplately.git

brmather commented 1 year ago

All of this is taken up by graphics and animations in the notebooks. I would like to clear these from all notebooks in the repo and flush them from the git history as well.

I've looked at a few solutions in the past but none were fantastic. Running jupyter nbconvert in a git hook might be one way. Alternatively we could use jupytext to properly version control notebooks as markdown files (or regular python files).
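
For reference, clearing outputs does not strictly require nbconvert: a notebook is a JSON file, so a git hook could call a small script instead. The following is only a minimal sketch, assuming nbformat 4 notebooks (a top-level `cells` list). The names `strip_outputs` and `strip_file` are made up for illustration and are not part of GPlately, nbconvert, or jupytext.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat-4 notebook dict with code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # deep copy of plain JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drops images, GIFs, and text output
            cell["execution_count"] = None  # avoids noisy diffs after re-runs
    return cleaned


def strip_file(path: str) -> None:
    """Rewrite one .ipynb file in place without its outputs (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(strip_outputs(notebook), f, indent=1, ensure_ascii=False)
        f.write("\n")
```

Note that this only keeps new outputs out of future commits. Blobs that are already in the git history stay there until the history itself is rewritten.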

michaelchin commented 1 month ago

Just a note to myself.

use BFG to purge history https://rtyley.github.io/bfg-repo-cleaner/#download

michaelchin commented 3 weeks ago

Putting this on ice until it causes real problems or becomes too painful to bear.

brmather commented 1 week ago

> use BFG to purge history https://rtyley.github.io/bfg-repo-cleaner/#download

Most of the space is taken up by Jupyter notebook outputs (images, GIFs, etc.). Does BFG help with this?

I've found a couple more resources which may help with Jupyter notebooks:

https://zhauniarovich.com/post/2020/2020-10-clearing-jupyter-output-p3/
https://www.scivision.dev/git-jupyter-strip-output/
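
To check how much of a given notebook is actually output before picking a tool, something like the following could help. It is a rough sketch that assumes nbformat 4 notebooks and compares the serialized JSON size of each cell's `outputs` list against the whole notebook. `output_share` is a hypothetical helper name, and the numbers are approximate because the on-disk formatting may differ from `json.dumps`.

```python
import json


def output_share(notebook: dict) -> tuple[int, int]:
    """Return (bytes of serialized outputs, bytes of the whole serialized notebook)."""
    total = len(json.dumps(notebook).encode("utf-8"))
    outputs = sum(
        len(json.dumps(cell["outputs"]).encode("utf-8"))
        for cell in notebook.get("cells", [])
        if cell.get("outputs")  # skip markdown cells and code cells with no output
    )
    return outputs, total
```

Loading a notebook with `json.load` and passing the result to `output_share` gives a quick per-file estimate of what stripping outputs would save.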

michaelchin commented 1 week ago

BFG just purges history. The history may take up some space. I think we can try your findings first. This issue is not too painful for now.

Anyway, people can always skip the history with a shallow clone:

git clone --depth 1 --branch master https://github.com/GPlates/gplately.git

See the screenshot below. The history is not a big problem anymore.

du -sh -- * .[^.]* | sort -h

Screenshot 2024-07-11 at 1 54 10 PM