riscv-collab / riscv-gnu-toolchain

GNU toolchain for RISC-V, including GCC

Use Shallow Submodule Init to Reduce Clone Times #1605

Open TShapinsky opened 3 weeks ago

TShapinsky commented 3 weeks ago

Reopening with new PR due to being unable to reopen #1603 (force push/rebase caused history mismatch).

Added --depth 1 to the git submodule update command in Makefile.in.
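As a rough illustration of what that change amounts to, the sketch below prints the shallow form of the command for each submodule path it is given. The helper name `shallow_update_cmd` and the dry-run approach are assumptions made for this example; the actual rule in Makefile.in may be structured differently.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the shallow submodule command.
# shallow_update_cmd is a hypothetical helper, not part of the repository.
shallow_update_cmd() {
    for path in "$@"; do
        # --depth 1 asks git to fetch only the most recent commit
        printf 'git submodule update --init --depth 1 %s\n' "$path"
    done
}

shallow_update_cmd gcc binutils
```

Running a printed command by hand in a checkout of the repository should fetch the named submodule without its full history, provided the remote server accepts shallow requests.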

To test, I set up a Docker image that built the repository with ./configure --prefix=/opt/riscv --with-arch=rv32gc --with-abi=ilp32d. The image with this change was 14.77 GB, whereas the version without it was 16.93 GB, a saving of 2.16 GB. This won't matter much for people developing in a stable environment where they init the submodules only once, but for builds that happen in a container it is a big gain.

A different approach would be to populate the shallow property in .gitmodules. However, this approach only saves about 100MB, perhaps because the shallow property is not applied recursively.
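For reference, the shallow property can be set with `git config -f` instead of editing the file by hand. The sketch below works on a scratch copy in a temporary directory (the `gcc` entry and the temporary path are chosen only for illustration) and reads the value back. According to the gitmodules documentation, `shallow = true` makes a clone of that submodule use a history depth of 1 unless the user explicitly asks for a non-shallow clone.

```shell
#!/bin/sh
# Sketch: write and read back submodule.<name>.shallow in a scratch file.
# The scratch location is an assumption; a real change would target the
# repository's own .gitmodules.
workdir=$(mktemp -d)
modules_file="$workdir/.gitmodules"

git config -f "$modules_file" submodule.gcc.path gcc
git config -f "$modules_file" submodule.gcc.url https://gcc.gnu.org/git/gcc.git
git config -f "$modules_file" submodule.gcc.shallow true

# Read the property back to confirm it was recorded
shallow_value=$(git config -f "$modules_file" --get submodule.gcc.shallow)
echo "gcc shallow = $shallow_value"
```

This only records the setting. Whether it also reaches nested submodules is the open question raised above.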

cmuellner commented 3 weeks ago

Looks like the git server of uclibc-ng does not like shallow clones.

I've seen that shallow clones can be configured in .gitmodules. Maybe something like this works (untested):

diff --git a/.gitmodules b/.gitmodules
index 68c2932..058d88a 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -2,44 +2,56 @@
        path = binutils
        url = https://sourceware.org/git/binutils-gdb.git
        branch = binutils-2_43-branch
+       shallow = true
 [submodule "gcc"]
        path = gcc
        url = https://gcc.gnu.org/git/gcc.git
        branch = releases/gcc-14
+       shallow = true
 [submodule "glibc"]
        path = glibc
        url = https://sourceware.org/git/glibc.git
+       shallow = true
 [submodule "dejagnu"]
        path = dejagnu
        url = https://git.savannah.gnu.org/git/dejagnu.git
        branch = master
+       shallow = true
 [submodule "newlib"]
        path = newlib
        url = https://sourceware.org/git/newlib-cygwin.git
        branch = master
+       shallow = true
 [submodule "gdb"]
        path = gdb
        url = https://sourceware.org/git/binutils-gdb.git
        branch = gdb-15-branch
+       shallow = true
 [submodule "qemu"]
        path = qemu
        url = https://gitlab.com/qemu-project/qemu.git
+       shallow = true
 [submodule "musl"]
        path = musl
        url = https://git.musl-libc.org/git/musl
        branch = master
+       shallow = true
 [submodule "spike"]
        path = spike
        url = https://github.com/riscv-software-src/riscv-isa-sim.git
        branch = master
+       shallow = true
 [submodule "pk"]
        path = pk
        url = https://github.com/riscv-software-src/riscv-pk.git
        branch = master
+       shallow = true
 [submodule "llvm"]
        path = llvm
        url = https://github.com/llvm/llvm-project.git
        branch = release/17.x
+       shallow = true
 [submodule "uclibc-ng"]
        path = uclibc-ng
        url = https://git.uclibc-ng.org/git/uclibc-ng.git
+       shallow = false

TShapinsky commented 3 weeks ago

Yeah, I'm testing this out locally. I can't get it to take... but hopefully I'll figure it out soon.

It would definitely be preferable to do it this way instead of forcing it.

karaketir16 commented 1 week ago

I did not notice this pull request.

I created #1616; I hope it helps.