ChainSafe / forest

🌲 Rust Filecoin Node Implementation
https://forest.chainsafe.io
Apache License 2.0

Don't deploy Rust docs to GH pages #3731

Closed by LesnyRumcajs 1 month ago

LesnyRumcajs commented 11 months ago

Issue summary

Self-hosting the Rust docs makes regular Forest clones and forks quite heavy: 419 MB for the latest main, b7ac40354c713ef2e087a6e4c1f0c59c61a28bef.

The largest culprit is the Rust documentation.

The following command lists every blob in the repository history, sorted by size (smallest first):

git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
  sed -n 's/^blob //p' |
  sort --numeric-sort --key=2 |
  cut -c 1-12,41- |
  $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest | col

Sample output:

...
2e6ab4cd5af  4.8MiB rustdoc/implementors/core/panic/unwind_safe/trait.RefUnwindSafe.js
fcc469f254c8  4.8MiB rustdoc/implementors/core/panic/unwind_safe/trait.RefUnwindSafe.js
3a3a46cdb2c0  4.8MiB rustdoc/implementors/core/panic/unwind_safe/trait.RefUnwindSafe.js
46c9f562f2cc  4.8MiB rustdoc/implementors/core/panic/unwind_safe/trait.RefUnwindSafe.js
003aba4fb25f  4.8MiB rustdoc/implementors/core/panic/unwind_safe/trait.RefUnwindSafe.js
5baeacce618e  4.8MiB rustdoc/implementors/core/panic/unwind_safe/trait.RefUnwindSafe.js
...
df90f0b30cc4   16MiB rustdoc/search-index.js
e8500f3b8af1   16MiB rustdoc/search-index.js
05e123abbb51   16MiB rustdoc/search-index.js
6cebbd16bff8   16MiB rustdoc/search-index.js
a74c4b64e5c4   16MiB rustdoc/search-index.js
c830e0d1c911   16MiB rustdoc/search-index.js
4cb635118402   16MiB rustdoc/search-index.js
ba3b8b29d3d9   16MiB rustdoc/search-index.js
31f710c25df8   17MiB rustdoc/search-index.js

This is too large and must be dealt with.
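To put a single number on how much of the history the docs account for, the per-blob listing can be summed by path prefix. This is a minimal sketch; `sum_blob_bytes` is a hypothetical helper name, and the total is based on `%(objectsize)`, which reports uncompressed object sizes, so it overstates what the packed repository actually occupies on disk.

```shell
# Hypothetical helper: sum the sizes (in bytes) of all blobs whose path
# starts with the given prefix.
# Reads lines of "<type> <hash> <size> <path>" on stdin, as printed by
#   git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'
sum_blob_bytes() {
  awk -v prefix="$1" '
    # Keep only blobs whose path begins with the prefix.
    $1 == "blob" && index($4, prefix) == 1 { total += $3 }
    END { printf "%.0f\n", total }
  '
}

# Usage inside a clone (counts every historical version of every file
# under rustdoc/, not just the latest one):
#   git rev-list --objects --all |
#     git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
#     sum_blob_bytes rustdoc/
```

Running the same pipeline with a different prefix (for example `src/`) gives a rough comparison between the docs and the code itself.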

Other information and links

aatifsyed commented 11 months ago

Other big Rust projects must face this issue. I'll have a look at how e.g. Servo addresses this.

See also https://github.com/rust-lang/rust/issues/31387

aatifsyed commented 11 months ago

Known issue: https://www.reddit.com/r/rust/comments/wy3j50/psa_if_youre_using_ghpages_to_host_your/ The post there says:

As a first-step measure, we changed the CI script to overwrite the gh-pages branch at every run, rather than just appending a new commit. We use this gh-pages action, so it was just a matter of adding a force_orphan: true parameter.

Turns out that this improved the situation a lot: when it no longer needs to keep history, git manages to compress that 220MB of documentation very well, and now the whole Smithay git repo is only ~15 MB!
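The effect described above (the deploy branch always holds exactly one parentless commit) can be reproduced with plain git. The following is only a sketch of that idea, not the code of any particular action: `publish_orphan`, the `docs-bot` identity and the default branch name are assumptions for illustration, and `site_dir` is assumed to be a generated output directory such as the rustdoc output, not a working tree.

```shell
# Hypothetical helper: publish the contents of <site_dir> to <branch> on
# <remote> as a single commit with no parents, overwriting whatever history
# the branch had before.
publish_orphan() {
  site_dir=$1
  remote=$2
  branch=${3:-gh-pages}

  # Throwaway repository, so the new commit cannot have any ancestors.
  scratch=$(mktemp -d) || return 1
  git init --quiet "$scratch" &&
    git --git-dir="$scratch/.git" --work-tree="$site_dir" add --all &&
    git --git-dir="$scratch/.git" --work-tree="$site_dir" \
      -c user.name="docs-bot" -c user.email="docs-bot@example.invalid" \
      commit --quiet -m "Deploy docs" &&
    # --force replaces the remote branch instead of appending to it.
    git --git-dir="$scratch/.git" push --quiet --force \
      "$remote" "HEAD:refs/heads/$branch"
  status=$?
  rm -rf "$scratch"
  return $status
}

# Usage (paths and remote are placeholders):
#   publish_orphan target/doc git@github.com:OWNER/REPO.git gh-pages
```

Because each deploy discards the previous commit, old copies of large generated files such as `rustdoc/search-index.js` become unreachable on the remote instead of accumulating. Existing clones still keep the old objects until they are re-cloned or garbage-collected.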

aatifsyed commented 11 months ago

We currently use JamesIves/github-pages-deploy-action, which looks like it commits to a gh-pages branch? https://github.com/ChainSafe/forest/blob/b7ac40354c713ef2e087a6e4c1f0c59c61a28bef/.github/workflows/docs.yml#L81-L87

Maybe switching to actions/deploy-pages would be better, or maybe to the fork mentioned in the Reddit thread.

ansermino commented 6 months ago

I would suggest exploring Cloudflare Pages for this, as well as pruning the git history. Cloning the full repo sucks 😁

LesnyRumcajs commented 1 month ago

Done via https://github.com/ChainSafe/forest/pull/4792