github / codeql

CodeQL: the libraries and queries that power security researchers around the world, as well as code scanning in GitHub Advanced Security
https://codeql.github.com
MIT License

Size of CodeQL binary distribution has almost tripled since 2021 #14125

Open intrigus-lgtm opened 1 year ago

intrigus-lgtm commented 1 year ago

Description of the issue

I've noticed that the size of the CodeQL binary distribution only knows one direction - upwards.

- 2021-04-29: 242 MB compressed, 476 MB uncompressed
- 2022-12-03: 417 MB compressed, 1.1 GB uncompressed
- 2023-09-04: 541 MB compressed, 1.4 GB uncompressed
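To put the "almost tripled" in the title into numbers, here is a small sketch that computes growth factors relative to the 2021 release. The sizes are the ones listed above; treating 1 GB as 1000 MB is an assumption, and the names `SNAPSHOTS` and `growth_factors` are made up for this illustration.

```python
# Sizes in MB as reported in the list above.
# Assumption: 1 GB is taken as 1000 MB; with 1024 the ratios shift slightly.
SNAPSHOTS = [
    ("2021-04-29", 242, 476),
    ("2022-12-03", 417, 1100),
    ("2023-09-04", 541, 1400),
]


def growth_factors(snapshots):
    """Return (date, compressed_ratio, uncompressed_ratio) relative to the first snapshot."""
    _, base_compressed, base_uncompressed = snapshots[0]
    return [
        (date, compressed / base_compressed, uncompressed / base_uncompressed)
        for date, compressed, uncompressed in snapshots
    ]


if __name__ == "__main__":
    for date, c_ratio, u_ratio in growth_factors(SNAPSHOTS):
        print(f"{date}: compressed x{c_ratio:.2f}, uncompressed x{u_ratio:.2f}")
```

Under these assumptions the compressed download grew by a bit more than a factor of two, while the uncompressed size is just under three times the 2021 figure.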

Now this is of course somewhat expected, as new languages are added over time and space is free; except when it slowly isn't^^

I'm wondering whether there are some easy improvements possible. For example, running fdupes -r -S -m . in the tools/ directory results in

89 duplicate files (in 8 sets), occupying 119.7 megabytes

and for example all of these files are identical:

```
4092064 bytes each:
./linux64/lib64__trace
./linux64/lib/x86_64-linux-gnutrace.so
./linux64/lib/x86_64-linux-gnu_xeon_phi_trace.so
./linux64/lib/x86_64-linux-gnu_x86_64_trace.so
./linux64/lib/x86_64-linux-gnu_i686_trace.so
./linux64/lib/x86_64-linux-gnu_haswell_trace.so
./linux64/lib_xeon_phi_trace.so
./linux64/lib_x86_64_trace.so
./linux64/lib_haswell_trace.so
./linux64/lib64trace.so
./linux64/lib64_xeon_phi_trace.so
./linux64/lib64_x86_64_trace.so
./linux64/lib64_i686_trace.so
./linux64/lib64_haswell_trace.so
./linux64/x86_64-linux-gnutrace.so
./linux64/x86_64-linux-gnu_xeon_phi_trace.so
./linux64/x86_64-linux-gnu_x86_64_trace.so
./linux64/x86_64-linux-gnu_i686_trace.so
./linux64/x86_64-linux-gnu_haswell_trace.so
```
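For anyone who wants to reproduce this kind of check without fdupes, the following is a rough sketch of the same idea: group files by size, then by SHA-256 digest, and report sets of identical files. It is not how fdupes itself is implemented, and its counts may differ (for example, it does not treat hard links specially). The function names `find_duplicates` and `wasted_bytes` are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large binaries are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return a list of (size, [paths]) for each set of byte-identical files under root."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            by_size[os.path.getsize(path)].append(path)

    duplicate_sets = []
    for size, paths in by_size.items():
        # Only files sharing a size can be identical; empty files are ignored.
        if size == 0 or len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        for group in by_hash.values():
            if len(group) > 1:
                duplicate_sets.append((size, sorted(group)))
    return duplicate_sets


def wasted_bytes(duplicate_sets):
    """Bytes that could be saved by keeping one copy per set."""
    return sum(size * (len(paths) - 1) for size, paths in duplicate_sets)


if __name__ == "__main__":
    sets = find_duplicates(".")
    extra = sum(len(paths) - 1 for _size, paths in sets)
    print(f"{extra} duplicate files (in {len(sets)} sets), "
          f"occupying {wasted_bytes(sets) / 1e6:.1f} megabytes")
```

Run from the tools/ directory, this prints a summary line in roughly the same shape as the fdupes output quoted above. Grouping by size first keeps the number of files that need hashing small.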

The swift/ subdirectory occupies more than 500 MB; maybe it is possible to build some of these binaries with -Osize or something similar?

Feel free to close this issue if you think this is not worthwhile or there are no easy gains.

MathiasVP commented 1 year ago

Hi @intrigus,

Thanks for raising this issue. We're definitely aware that the CodeQL distribution continues to increase in size, and this is something that we're paying very close attention to. In particular, when we added support for Swift we had to pull in loads of new dependencies, and the bundle size increase came up many times in that discussion.

I will forward your size reduction suggestions to the appropriate teams.