WebAssembly / wasi-sdk

WASI-enabled WebAssembly C/C++ toolchain
Apache License 2.0

Wondering how to build LLVM in such a small size #201

Closed bathtub-01 closed 2 years ago

bathtub-01 commented 2 years ago

Hello, I'm new to LLVM and WebAssembly. I learned that wasi-sdk is built directly from upstream LLVM and wasi-libc. Out of curiosity, I want to reproduce this process myself. I downloaded the LLVM source code and tried to build it. When configuring the LLVM project, I used the following parameters:

$ cmake ../llvm \
   -DCMAKE_BUILD_TYPE=Release \
   -DLLVM_ENABLE_PROJECTS='lld;clang' \
   -DLLVM_TARGETS_TO_BUILD="WebAssembly" \
   -DLLVM_INCLUDE_EXAMPLES=OFF \
   -DLLVM_INCLUDE_TESTS=OFF

I have tried my best to reduce what needs to be built, but the build size was still up to 2.5GB. In contrast, wasi-sdk is only a little over 100MB. Obviously I added extra stuff to the build, but I don't know how to streamline it further. Did the developers of wasi-sdk make some cuts to the LLVM code, or did they use a special CMakeLists.txt? Thanks for your attention.

sbc100 commented 2 years ago

How are you measuring the build size? Are you running cmake --install and then measuring the size of the resulting directory?

bathtub-01 commented 2 years ago

I created a build directory under the llvm-project directory. Then I ran

$ cmake ../llvm \
   -DCMAKE_BUILD_TYPE=Release \
   -DLLVM_ENABLE_PROJECTS='lld;clang' \
   -DLLVM_TARGETS_TO_BUILD="WebAssembly" \
   -DLLVM_INCLUDE_EXAMPLES=OFF \
   -DLLVM_INCLUDE_TESTS=OFF

under the llvm-project/build directory to produce the Makefile, and then ran make in the same directory. The “build size” I mentioned refers to the size of the llvm-project/build directory.

sbc100 commented 2 years ago

That size will include all the intermediate files such as object files. You would need to run cmake --install if you want to package, ship, or measure the resulting toolchain. You probably also want to run strip (or llvm-strip) on all the binaries too.
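The install-then-strip step can be sketched roughly as below. This is only an illustration: the function name install_and_measure and its arguments are made up for this example, and it assumes CMake 3.15 or newer, where cmake --install accepts --prefix and --strip.

```shell
# Hypothetical helper: install a configured build into a prefix with
# stripped binaries, then report the size of the installed tree
# (not the build tree, which also holds object files).
install_and_measure() {
    build_dir=$1
    prefix=$2
    if [ ! -f "$build_dir/cmake_install.cmake" ]; then
        echo "error: $build_dir is not a configured CMake build directory" >&2
        return 1
    fi
    # --strip asks CMake to strip binaries while installing (CMake >= 3.15).
    cmake --install "$build_dir" --prefix "$prefix" --strip || return 1
    du -sh "$prefix"
}
```

Called as install_and_measure llvm-project/build /tmp/llvm, it prints the size of /tmp/llvm, which is the number to compare against the wasi-sdk download.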

bathtub-01 commented 2 years ago

I tried to run

$ cmake -DCMAKE_INSTALL_PREFIX=/tmp/llvm -P cmake_install.cmake

under the build directory, and I found that the installation directory (/tmp/llvm/) is still large (about 1.9GB). I compared my installation directory with the wasi-sdk directory and found that mine still contains a lot of extra things. These are the binaries in wasi-sdk:

wasi-sdk-12.0/bin
├── ar -> llvm-ar
├── c++filt -> llvm-cxxfilt
├── clang -> clang-11
├── clang++ -> clang
├── clang-11
├── clang-apply-replacements
├── clang-cl -> clang
├── clang-cpp -> clang
├── clang-format
├── clang-tidy
├── git-clang-format
├── ld64.lld -> lld
├── ld.lld -> lld
├── lld
├── lld-link -> lld
├── llvm-ar
├── llvm-cxxfilt
├── llvm-dwarfdump
├── llvm-nm
├── llvm-objcopy
├── llvm-objdump
├── llvm-ranlib -> llvm-ar
├── llvm-size
├── llvm-strings
├── llvm-strip -> llvm-objcopy
├── nm -> llvm-nm
├── objcopy -> llvm-objcopy
├── objdump -> llvm-objdump
├── ranlib -> llvm-ar
├── size -> llvm-size
├── strings -> llvm-strings
├── strip -> llvm-objcopy
└── wasm-ld -> lld

And these are what I got:

/tmp/llvm/bin
├── analyze-build
├── bugpoint
├── c-index-test
├── clang -> clang-14
├── clang++ -> clang
├── clang-14
├── clang-check
├── clang-cl -> clang
├── clang-cpp -> clang
├── clang-extdef-mapping
├── clang-format
├── clang-nvlink-wrapper
├── clang-offload-bundler
├── clang-offload-wrapper
├── clang-refactor
├── clang-rename
├── clang-repl
├── clang-scan-deps
├── diagtool
├── dsymutil
├── git-clang-format
├── hmaptool
├── intercept-build
├── ld64.lld -> lld
├── ld64.lld.darwinnew -> lld
├── ld64.lld.darwinold -> lld
├── ld.lld -> lld
├── llc
├── lld
├── lld-link -> lld
├── lli
├── llvm-addr2line -> llvm-symbolizer
├── llvm-ar
├── llvm-as
├── llvm-bcanalyzer
├── llvm-bitcode-strip -> llvm-objcopy
├── llvm-cat
├── llvm-cfi-verify
├── llvm-config
├── llvm-cov
├── llvm-c-test
├── llvm-cvtres
├── llvm-cxxdump
├── llvm-cxxfilt
├── llvm-cxxmap
├── llvm-diff
├── llvm-dis
├── llvm-dlltool -> llvm-ar
├── llvm-dwarfdump
├── llvm-dwp
├── llvm-exegesis
├── llvm-extract
├── llvm-gsymutil
├── llvm-ifs
├── llvm-install-name-tool -> llvm-objcopy
├── llvm-jitlink
├── llvm-lib -> llvm-ar
├── llvm-libtool-darwin
├── llvm-link
├── llvm-lipo
├── llvm-lto
├── llvm-lto2
├── llvm-mc
├── llvm-mca
├── llvm-ml
├── llvm-modextract
├── llvm-mt
├── llvm-nm
├── llvm-objcopy
├── llvm-objdump
├── llvm-opt-report
├── llvm-otool -> llvm-objdump
├── llvm-pdbutil
├── llvm-profdata
├── llvm-profgen
├── llvm-ranlib -> llvm-ar
├── llvm-rc
├── llvm-readelf -> llvm-readobj
├── llvm-readobj
├── llvm-reduce
├── llvm-rtdyld
├── llvm-sim
├── llvm-size
├── llvm-split
├── llvm-stress
├── llvm-strings
├── llvm-strip -> llvm-objcopy
├── llvm-symbolizer
├── llvm-tapi-diff
├── llvm-tblgen
├── llvm-undname
├── llvm-windres -> llvm-rc
├── llvm-xray
├── opt
├── sancov
├── sanstats
├── scan-build
├── scan-build-py
├── scan-view
├── split-file
├── verify-uselistorder
└── wasm-ld -> lld

What should I do to cut them further?

sbc100 commented 2 years ago

You can just use, or crib from, the Makefile of wasi-sdk itself if you want to match more precisely what we do. For example, here is the install command we use:

https://github.com/WebAssembly/wasi-sdk/blob/77ba98a998cb9f2a63ab3a5f94bbabd069f65ff0/Makefile#L70

Can you use that Makefile to build the whole of wasi-sdk rather than roll your own script?
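For readers who do want their own script, one way to avoid installing every tool is to use the per-tool install targets that LLVM's CMake build generates (install-<tool>, plus install-<tool>-stripped). The sketch below is not the wasi-sdk Makefile; the TOOLS list is a hypothetical selection loosely based on the wasi-sdk bin/ listing above, and install_targets is a helper name invented for this example.

```shell
# Hypothetical tool selection; adjust to taste.
TOOLS="clang lld llvm-ar llvm-nm llvm-objcopy llvm-objdump llvm-size llvm-strings llvm-cxxfilt llvm-dwarfdump"

# Turn each tool name into the matching stripped install target name.
install_targets() {
    for tool in "$@"; do
        printf 'install-%s-stripped ' "$tool"
    done
}

# Usage, from a build directory configured with
# -DCMAKE_INSTALL_PREFIX=/tmp/llvm (multiple --target values need CMake >= 3.15):
#   cmake --build . --target $(install_targets $TOOLS)
```

A working compiler also needs clang's builtin headers, which are installed by a separate component (believed to be clang-resource-headers in recent LLVM versions), so treat the list as a starting point rather than a complete toolchain.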

1.9GB seems very big. Did you run strip on the binaries? Can you see where the usage is coming from? What does du -sm /tmp/llvm/* say?
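To see where the space goes, the du output can be sorted so the largest entries come last. This is a small sketch; the function name size_breakdown is made up for this example.

```shell
# Print the size in MB of each top-level entry under a directory,
# smallest first, so the biggest consumers end up at the bottom.
size_breakdown() {
    du -sm "$1"/* | sort -n
}
```

Running size_breakdown /tmp/llvm, and then size_breakdown /tmp/llvm/bin or /tmp/llvm/lib, shows whether the bulk is in unstripped executables, static libraries, or something else.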

bathtub-01 commented 2 years ago

Oh thanks, the Makefile is a good reference. I changed https://github.com/WebAssembly/wasi-sdk/blob/77ba98a998cb9f2a63ab3a5f94bbabd069f65ff0/Makefile#L5 to point at my previous llvm-project directory and it works. The strip command does reduce the size of the binaries. I also noticed that I can run make strip after installation, which is a convenient way to strip everything in one batch. I think the next step is just to analyse the Makefile to understand your approach. Thank you for your help!