roed314 / lmfdb_postgres

Issue tracker for the transition of the LMFDB from Mongo to Postgres

removing big files from history #75

Open edgarcosta opened 6 years ago

edgarcosta commented 6 years ago

We have a couple of big files in our history that were added by mistake and then removed.

You can list the 20 largest blobs in the history whose paths no longer exist in the working tree (sizes in MiB) by running the following from the repository root:

git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | awk '/^blob/ {print substr($0,6)}' | sort --numeric-sort --key=2 | tail -n 1000 | xargs -n 3 bash -c 'if [[ ! -e ${2} ]]; then awk -v size=${1} -v file=${2} "BEGIN{printf \"%.2f %s\\n\", size/(1024*1024), file}"; fi' | tail -n 20

0.11 lmfdb/modular_forms/elliptic_modular_forms/backend/web_modforms.py
0.11 static/images/lmfdb-logo.svg
0.12 static/images/lmfdb-logo.svg
0.12 static/images/browseGraphHolo_22_14_3a.svg
0.14 scripts/belyi/raw_data.py
0.14 scripts/belyi/raw_data.py
0.14 scripts/belyi/raw_data.py
0.22 static/jquery.dataTables.js
0.22 static/jquery.dataTables.js
0.33 change_nf_polys.py
0.33 lmfdb/number_fields/change_nf_polys.py
0.36 lmfdb/number_fields/change_nf_polys.py
0.36 lmfdb/number_fields/change_nf_polys.py
0.36 lmfdb/number_fields/change_nf_polys.py
0.36 lmfdb/number_fields/change_nf_polys.py
1.47 lmfdb/higher_genus_w_automorphisms/re
1.47 lmfdb/higher_genus_w_automorphisms/pymongo
1.47 lmfdb/higher_genus_w_automorphisms/StringIO
2.68 scripts/belyi/raw_data.py
2.68 scripts/belyi/raw_data.py
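
For anyone who finds the one-liner hard to read, here is a rough Python sketch of the same logic. It assumes its input is the text produced by the first two stages of the pipeline (`git rev-list --objects --all | git cat-file --batch-check=...`), one object per line. The function name and parameters are made up for illustration, and unlike the shell version it tolerates paths containing spaces.

```python
import os
from typing import Callable, Iterable, List, Tuple


def largest_missing_blobs(
    batch_check_lines: Iterable[str],
    path_exists: Callable[[str], bool] = os.path.exists,
    candidates: int = 1000,
    top: int = 20,
) -> List[Tuple[float, str]]:
    """Return (size_in_MiB, path) for the largest blobs whose path is gone.

    Each input line is expected to look like
    "<objecttype> <objectname> <objectsize> <path>", matching the
    --batch-check format string used in the pipeline above.
    """
    blobs = []
    for line in batch_check_lines:
        parts = line.rstrip("\n").split(" ", 3)
        # Skip commits, trees, tags, and entries without a path.
        if len(parts) < 4 or parts[0] != "blob" or not parts[3]:
            continue
        blobs.append((int(parts[2]), parts[3]))

    # Mirrors `sort --numeric-sort --key=2 | tail -n 1000`.
    blobs.sort(key=lambda item: item[0])
    largest = blobs[-candidates:]

    # Mirrors the `[[ ! -e path ]]` test followed by `tail -n 20`.
    missing = [
        (size / (1024 * 1024), path)
        for size, path in largest
        if not path_exists(path)
    ]
    return missing[-top:]
```

With the default `path_exists`, this has to be run from the repository root, since the paths are relative. Note that the same path can appear several times, once per distinct version of the file that was ever committed.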

Shall we try to remove them? If so, I believe that now is a great time to rewrite history.

GitHub gives us two options to do this: https://help.github.com/articles/removing-sensitive-data-from-a-repository/
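
As a concrete illustration, one commonly documented route is `git filter-branch` with an index filter that removes a path from every commit. The sketch below only builds the command so it can be inspected before anyone runs it; the helper name is hypothetical, and the choice of `filter-branch` is my assumption about which option we would pick.

```python
import shlex
from typing import List


def filter_branch_command(path: str) -> List[str]:
    """Build a git filter-branch invocation that drops `path` from all refs."""
    # The index filter runs once per commit; --ignore-unmatch keeps it from
    # failing on commits where the file is not present.
    index_filter = "git rm --cached --ignore-unmatch " + shlex.quote(path)
    return [
        "git", "filter-branch", "--force",
        "--index-filter", index_filter,
        "--prune-empty",            # drop commits that become empty
        "--tag-name-filter", "cat", # keep tag names, pointing at rewritten commits
        "--", "--all",              # rewrite every branch and tag
    ]
```

For example, `filter_branch_command("scripts/belyi/raw_data.py")` gives an argument list that could be passed to `subprocess.run` from a fresh clone. The rewritten history would then have to be force-pushed, and every existing clone and fork would need to be re-cloned or rebased onto it.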

roed314 commented 6 years ago

I agree that now is a good time if we're going to do it, but forcing everyone to rebase their code is pretty annoying (and if anyone merges the old history back in, we lose the benefit). It's only a couple of megabytes...

Anyway, I'm pretty neutral.
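
To put a number on how much is at stake, the sizes in the listing from the first comment can be totalled per path. This is only a sketch: the figures are uncompressed blob sizes in MiB, so they are an upper bound on what a rewrite would save, because Git stores objects compressed in packfiles. The function name is made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Copied verbatim from the listing in the first comment (size in MiB, path).
LISTING = """\
0.11 lmfdb/modular_forms/elliptic_modular_forms/backend/web_modforms.py
0.11 static/images/lmfdb-logo.svg
0.12 static/images/lmfdb-logo.svg
0.12 static/images/browseGraphHolo_22_14_3a.svg
0.14 scripts/belyi/raw_data.py
0.14 scripts/belyi/raw_data.py
0.14 scripts/belyi/raw_data.py
0.22 static/jquery.dataTables.js
0.22 static/jquery.dataTables.js
0.33 change_nf_polys.py
0.33 lmfdb/number_fields/change_nf_polys.py
0.36 lmfdb/number_fields/change_nf_polys.py
0.36 lmfdb/number_fields/change_nf_polys.py
0.36 lmfdb/number_fields/change_nf_polys.py
0.36 lmfdb/number_fields/change_nf_polys.py
1.47 lmfdb/higher_genus_w_automorphisms/re
1.47 lmfdb/higher_genus_w_automorphisms/pymongo
1.47 lmfdb/higher_genus_w_automorphisms/StringIO
2.68 scripts/belyi/raw_data.py
2.68 scripts/belyi/raw_data.py
"""


def total_by_path(listing: str) -> Tuple[float, Dict[str, float]]:
    """Sum the listed sizes overall and per path."""
    per_path: Dict[str, float] = defaultdict(float)
    for line in listing.splitlines():
        if not line.strip():
            continue
        size, path = line.split(" ", 1)
        per_path[path] += float(size)
    return sum(per_path.values()), dict(per_path)


total, per_path = total_by_path(LISTING)
# Largest offenders first, then the overall uncompressed total.
for path, size in sorted(per_path.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{size:6.2f} MiB  {path}")
print(f"{total:6.2f} MiB  total (uncompressed)")
```

By this count the 20 listed blobs add up to roughly 13 MiB uncompressed, with the versions of `scripts/belyi/raw_data.py` accounting for the largest share.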