mhasoba / TheMulQuaBio

The interactive, online Multilingual Quantitative Biologist
MIT License

Bloated git histories? #130

Open mhasoba opened 6 days ago

mhasoba commented 6 days ago

So I am thinking about adding a section in the Git chapter about how to wiggle out of a situation where you are stuck with a bloated git history because you unthinkingly (as many of us would in our git larval stage) committed a massive file to the repository. How do you get rid of that burden?

There is an old thread about this on Stack Overflow: https://stackoverflow.com/questions/2100907/how-to-remove-delete-a-large-file-from-commit-history-in-the-git-repository
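
Before rewriting anything, it helps to know which files are actually responsible for the bloat. Below is a minimal Python sketch, not a definitive recipe, that lists the largest blobs anywhere in a repository's history. It assumes `git` is on the PATH; the function names (`parse_blob_sizes`, `largest_blobs`, `history_blob_listing`) are made up for illustration. It only finds candidates; actually removing them from history is left to the approaches in the linked thread.

```python
import subprocess


def parse_blob_sizes(batch_check_output):
    """Return (size_in_bytes, path) for every blob line in the given text.

    Each line is expected to look like "<type> <sha> <size> <path>".
    """
    blobs = []
    for line in batch_check_output.splitlines():
        parts = line.split(" ", 3)
        # Keep file contents (blobs) only; skip commits, trees and odd lines.
        if len(parts) == 4 and parts[0] == "blob":
            blobs.append((int(parts[2]), parts[3]))
    return blobs


def largest_blobs(batch_check_output, top=5):
    """Return the `top` largest blobs, biggest first."""
    return sorted(parse_blob_sizes(batch_check_output), reverse=True)[:top]


def history_blob_listing(repo_dir="."):
    """Ask git for type, hash, size and path of every object in history."""
    # Step 1: every object reachable from any ref, as "<sha> <path>" lines.
    objects = subprocess.run(
        ["git", "rev-list", "--objects", "--all"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    # Step 2: look up type and size for each object; %(rest) echoes the path.
    return subprocess.run(
        ["git", "cat-file",
         "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"],
        cwd=repo_dir, input=objects, capture_output=True, text=True, check=True,
    ).stdout


# Example use from inside a repository (hypothetical):
#     for size, path in largest_blobs(history_blob_listing(".")):
#         print(f"{size / 1024:10.1f} KB  {path}")
```

Keeping the parsing separate from the `git` calls means the logic can be checked on a pasted listing without touching a real repository.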

Thoughts?

Originally posted by @mhasoba in https://github.com/mhasoba/TheMulQuaBio/discussions/98#discussioncomment-3894703

davidorme commented 6 days ago

It's useful to know, but I'd argue that a better thing to add is pre-commit. That introduces a huge array of code QA and validation tools - and it's industry-standard practice for research software engineering. One of the common hooks is:

https://github.com/pre-commit/pre-commit-hooks?tab=readme-ov-file#check-added-large-files

That should stop you getting into that mess in the first place. That way we help our larvae be better than us.
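
To make the idea concrete for a note box, here is a rough Python sketch of the kind of check such a hook performs: given the files about to be committed, flag any above a size threshold. This is an illustration only, not the hook's actual implementation; the function name `files_over_limit` and the 500 KB threshold are assumptions chosen for the example.

```python
import os


def files_over_limit(paths, max_kb=500):
    """Return (path, size_in_kb) for each file larger than max_kb kilobytes."""
    too_big = []
    for path in paths:
        size_kb = os.path.getsize(path) / 1024  # bytes -> kilobytes
        if size_kb > max_kb:
            too_big.append((path, size_kb))
    return too_big


# Example use (hypothetical file names):
#     offenders = files_over_limit(["analysis.py", "raw_data.csv"])
#     if offenders:
#         raise SystemExit(f"Refusing to commit large files: {offenders}")
```

In practice the hook runs this sort of check automatically on every commit, so the large file is rejected before it ever reaches history.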

mhasoba commented 5 days ago

@hel0122, do you mind adding a note box about this (including the link) to the Git chapter notebook?