abhiTronix / raspberry-pi-cross-compilers

Latest GCC Cross Compiler & Native (ARM & ARM64) CI generated precompiled standalone toolchains for all Raspberry Pis. 🍇
https://sourceforge.net/projects/raspberry-pi-cross-compilers
GNU General Public License v3.0

Any way to shrink distributions? #39

Closed positron96 closed 3 years ago

positron96 commented 4 years ago

I just downloaded cross-gcc-9.2.0-pi_2-3.tar.gz; it's 2 GB uncompressed. Is that really necessary?

For example, linaro gcc (7.5) is around 600 MB uncompressed (100 MB compressed) and can build RPi software when provided with around 200 MB more of sysroot libraries (those 600 MB contain their own sysroot libraries, so the total size used is even less).

Most of your distribution is the libexec folder, in which cc1, cc1plus and lto1 take some 170 MB each. For linaro, those files are an order of magnitude smaller (17 MB-ish). Also, there are copies for 9.2.0 and 8.3.0; are they both needed?

Upon closer inspection, all binaries are 5-10 times larger than in linaro gcc 7.5.

abhiTronix commented 4 years ago

> For linaro, those files are an order of magnitude smaller (17 MB-ish). Also, there are copies for 9.2.0 and 8.3.0; are they both needed?

It's a shortcoming of having the old Glibc version on Raspbian OS (i.e. Glibc 2.28 on Buster): you cannot build a new GCC with an old Glibc, or vice versa. If you try to build Glibc 2.28 with a more modern GCC, like 9.x, you will get a lot of build errors due to incompatibility. Hence you need both GCC 8.3.0 and GCC 9.3.0: my approach is to build Glibc with the default compatible GCC 8.3.0 and later use it with the latest and greatest GCC, which at this time is 9.3.0.

abhiTronix commented 4 years ago

@positron96 Also see this for GCC-Binutils version compatibility matrix.

positron96 commented 4 years ago

Thanks for the info. However, this does not explain why all the binaries are so much fatter. Are they built with debugging symbols, by any chance?

abhiTronix commented 4 years ago

> Although this does not answer why all the binaries are a lot fatter.

@positron96 Has Linaro disclosed anywhere how they build their binaries? Can you share it?

positron96 commented 4 years ago

Unfortunately, I have no idea. But here is what I've googled: https://lists.linaro.org/pipermail/linaro-toolchain/2012-January/001964.html - a very old and non-working link to the linaro wiki, with excerpts of the build commands provided.

More recently, they seem to have switched to using their own tool, abe, for building (that link is the closest thing to official documentation from linaro; more comprehensive instructions may exist, I just have not found them, I only googled for 5 mins). This may also help: https://gist.github.com/ivakyb/6d6d8aa3766b3ed1807cfd2a214b516e

https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-a/downloads/8-3-2019-03 - here it says that arm.com now distributes the same toolchain as linaro and provides configs for the abe tool. It also seems to have thorough instructions on building from linaro sources.

Also, it seems crosstool-ng can build the linaro toolchain: https://elinux.org/RPi_Linaro_GCC_Compilation#Build_GCC_Linaro - so the build config is buried somewhere in that project as well.

abhiTronix commented 4 years ago

--enable-languages=c,c++

@positron96 I also found that linaro only supports C and C++ (in newer toolchains) but not Fortran. That's also one of the reasons why our binaries are bigger.

abhiTronix commented 4 years ago

> https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-a/downloads/8-3-2019-03 - here it says that arm.com now distributes the same toolchain as linaro and provides configs for the abe tool. It also seems to have thorough instructions on building from linaro sources.

@positron96 This is the most important info. Yes, they are using something called Linaro ABE (Advanced Build Environment), which was developed specifically for this purpose, and they provide ABE manifest files.

| Component | Description |
| --- | --- |
| GCC 8.3 | Repository: svn://gcc.gnu.org/svn/gcc/branches/ARM/arm-8-branch. Revision: 269184. Sources provided in release source tar ball. GCC 8 branch based on revision id r269117 with some additional patches ported on top as described in Features section. Detail of changes in GCC 8.3. |
| glibc 2.28 | Repository: git://sourceware.org/git/glibc.git. Revision: 4aeff335ca19286ee2382d8eba794ae5fd49281a. Release note |
| newlib 3.0.0 | Repository: git://sourceware.org/git/newlib.git. Revision: newlib-3.1.0. Release note |
| binutils 2.32 | Repository: git://sourceware.org/git/binutils-gdb.git. Revision: 0738b7acd30816902ccfbbb3eac16862f26985cb. Release note |
| GDB 8.2.1 | Repository: git://sourceware.org/git/binutils-gdb.git. Revision: 07d117342c8d967b730a7193e2f879f22c60e88c. GDB-with-python support for Python 2.7.6 (x86_64 builds). GDB-with-python support for Python 2.7.13 (i686-mingw32 builds). Release note |

Also, they are using the same configuration as ours. All the magic comes from their ABE build scripts.

positron96 commented 4 years ago

I have never seen or used ABE, but it may just boil down to creating some Makefiles, as cmake, qmake, autotools and co. all do.

abhiTronix commented 4 years ago

> it may just boil down to creating some Makefiles as all those cmake, qmake, autotools and Co do

@positron96 I don't think I'm qualified for this job, but someone from the community may be. I would appreciate a PR or help from someone with experience building these toolchains. Also, I'm too busy to try it on my own, so I'm pinning this issue to bring it to everyone's attention. Thank you for your help.

visglz commented 3 years ago

Just stumbled across this issue. I guess the difference is that the files are not stripped.

When I compare the cross-pi gcc to the gcc installed on Ubuntu 20.04, the factor is about 4.6:

$ du -sh aarch64-linux-gnu-gcc 
5,6M    aarch64-linux-gnu-gcc
$ file aarch64-linux-gnu-gcc 
aarch64-linux-gnu-gcc: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=e55923f99e9bc30554b23aab821b6198539a5a83, for GNU/Linux 3.2.0, with debug_info, not stripped
$ du -sh  /usr/bin/x86_64-linux-gnu-gcc-9
1,2M    /usr/bin/x86_64-linux-gnu-gcc-9
$ file /usr/bin/x86_64-linux-gnu-gcc-9
/usr/bin/x86_64-linux-gnu-gcc-9: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=8e75ffbd83b20f3c080555c66d68ffdde106247b, for GNU/Linux 3.2.0, stripped

Manually stripping the binary also brings the cross-pi gcc down to about 1.1 MB:

$ strip aarch64-linux-gnu-gcc
$ file aarch64-linux-gnu-gcc 
aarch64-linux-gnu-gcc: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=e55923f99e9bc30554b23aab821b6198539a5a83, for GNU/Linux 3.2.0, stripped
$ du -sh aarch64-linux-gnu-gcc 
1,1M    aarch64-linux-gnu-gcc

It should be as easy as calling make install-strip-gcc instead of make install-gcc. Unless you want to debug the compiler itself, it shouldn't make any difference. For glibc it might be desirable to keep the libraries with their debug symbols in case someone wants to debug into it.
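The `file` check shown above can also be run over a whole install tree to see which binaries still carry symbols. This is a minimal sketch, assuming the `file` utility is installed; `list_unstripped` is a hypothetical helper name, not part of this project's scripts.

```shell
#!/bin/sh
# list_unstripped DIR: print every regular file under DIR that `file`
# reports as an ELF object that is "not stripped".
# (The helper name is illustrative only.)
list_unstripped() {
    find "$1" -type f | while IFS= read -r path; do
        case "$(file -b "$path" 2>/dev/null)" in
            *ELF*"not stripped"*) printf '%s\n' "$path" ;;
        esac
    done
}
```

Calling it on the directory where the toolchain was unpacked and piping the result to `wc -l` gives a rough count of the files that stripping would shrink.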

abhiTronix commented 3 years ago

@visglz install-strip can break things if run in the build directory. The official docs say:

> install-strip should not strip the executables in the build directory which are being copied for installation. It should only strip the copies that are installed.

> Normally we do not recommend stripping an executable unless you are sure the program has no bugs. However, it can be reasonable to install a stripped executable for actual execution while saving the unstripped executable elsewhere in case there is a bug.

Should it be run before tar compression? Also, is it safe enough? I need to run some tests.

positron96 commented 3 years ago

@visglz Thank you for digging into this! I honestly don't think that gcc symbols are required to build programs for the RPi. Only gcc devs would ever need them.

visglz commented 3 years ago

> @visglz install-strip can break things if run in the build directory. The official docs say:
>
> install-strip should not strip the executables in the build directory which are being copied for installation. It should only strip the copies that are installed.
>
> Normally we do not recommend stripping an executable unless you are sure the program has no bugs. However, it can be reasonable to install a stripped executable for actual execution while saving the unstripped executable elsewhere in case there is a bug.
>
> Should it be run before tar compression? Also, is it safe enough? I need to run some tests.

I don't think it is much of an issue here. The docs you are referring to are general recommendations for Makefile-based GNU projects about which targets should be available and how they should behave. Of course it makes sense not to call "strip" directly in the build folder and overwrite the freshly linked executables, but to "install" the files first and then strip them in the target location. I guess the gcc developers implemented these rules correctly.

As gcc uses the autotools, the ability to install stripped files is already built in and works as expected. It seems the strip and chmod calls happen in a temporary location:
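The "install first, then strip the installed copy" rule can be sketched by hand as follows. This only illustrates the principle and is not what GCC's Makefiles literally run; `install_stripped` is a hypothetical helper, and binutils `strip` is assumed to be on PATH.

```shell
#!/bin/sh
# install_stripped SRC DEST: copy SRC to DEST, then strip only the copy.
# The file in the build directory (SRC) is never modified, so an
# unstripped original stays available for debugging.
# (Hypothetical helper; assumes binutils `strip` is installed.)
install_stripped() {
    cp -p "$1" "$2" && strip "$2"
}
```

If `strip` rejects the file, the function returns a non-zero status and the source file is still untouched.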

abhiTronix commented 3 years ago

> Of course it makes sense to not call "strip" directly in the build folder and overwrite the freshly linked executables but "install" the files first and then strip them in the target location.

So just before the tar compression, should I strip each executable separately? Or is the make install-strip-gcc command enough?

@visglz Also, did you test the Bash scripts, and is it entirely safe to run make install-strip-gcc?

I'm busy with exams right now and will get back to this soon. Thank you for looking into this.

visglz commented 3 years ago

My time is also limited, so I was only able to test the 64-bit version I was working with.

Please find the PR based on your latest branch for the 64-bit variant, including the proposed change to use set -eo pipefail and a fix for an issue during the download of the Linux kernel (without "-e" the error was silently ignored).
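The effect of these flags can be seen in a small experiment. This is a sketch only: `false` stands in for a failing download step (the real command from the build script is not shown in this thread), and `bash` is assumed to be installed.

```shell
#!/bin/sh
# `false` is a stand-in for a failing download step (an assumption for
# illustration, not the command used by the build scripts).

# No flags: the failure is ignored and the script carries on.
lenient=$(bash -c 'false; echo "continued"')

# -e alone: a failure inside a pipeline is still masked, because the
# pipeline's status is that of its last command (`cat`, which succeeds).
masked=$(bash -c 'set -e; false | cat; echo "continued"')

# -e with pipefail: the pipeline reports the failure and the script stops.
strict_status=0
strict=$(bash -c 'set -eo pipefail; false | cat; echo "continued"') || strict_status=$?

printf 'lenient=[%s] masked=[%s] strict=[%s] status=%s\n' \
    "$lenient" "$masked" "$strict" "$strict_status"
# prints: lenient=[continued] masked=[continued] strict=[] status=1
```

Only the last variant stops at the failing step, which is why a silently ignored download error surfaces once both options are set.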

The size difference is impressive:

$ du -sh cross-gcc-8.3.0-pi_64.tar.gz 
160M    cross-gcc-8.3.0-pi_64.tar.gz
$ du -sh cross-gcc-8.3.0-pi_.tar.gz 
533M    cross-gcc-8.3.0-pi_.tar.gz

$ du -sh /opt/sdk/raspi64-202101221000/compiler
1,7G    /opt/sdk/raspi64-202101221000/compiler
$ du -sh /opt/sdk/raspi64-202102030850/compiler
477M    /opt/sdk/raspi64-202102030850/compiler

I was able to successfully compile the examples I used with the previous (unstripped) version. Also, the number of files is the same (I checked the file trees).
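A file-tree check of this kind could look like the sketch below. It is an assumption about how such a comparison might be done, not the commands actually used here; `same_file_tree` is a hypothetical helper name.

```shell
#!/bin/sh
# same_file_tree DIR_A DIR_B: succeed if both directories contain the same
# set of relative paths. File contents and sizes are deliberately not
# compared, since stripping changes them.
# (Hypothetical helper for illustration.)
same_file_tree() {
    list_a=$(cd "$1" && find . | LC_ALL=C sort) || return 2
    list_b=$(cd "$2" && find . | LC_ALL=C sort) || return 2
    [ "$list_a" = "$list_b" ]
}
```

Run against the unpacked unstripped and stripped toolchain directories, a zero exit status means no file was lost or added by switching to the stripping install target.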

It should be safe to merge it in your PR. But the changes should also be made in the 32-bit and CI files (is there a reason to have different scripts for CI and local builds?).

Good luck with your exams! Take your time, this is not time-critical for me.

abhiTronix commented 3 years ago

@visglz Thanks so much for this contribution. I'll look into it shortly.

abhiTronix commented 3 years ago

@visglz What about the GDB binaries, do they need stripping too? Are there any implications?

abhiTronix commented 3 years ago

Successfully resolved and merged in commit https://github.com/abhiTronix/raspberry-pi-cross-compilers/commit/7ead221f4f744a9dcd10f3ef07005422db68ec9e