llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Strange packaging issue (I think) in regards to the tarball of LLVM (as well as clang and compiler-rt) #54585

Open rubyFeedback opened 2 years ago

rubyFeedback commented 2 years ago

Hey there llvm team,

I am following the instructions of BLFS here:

https://www.linuxfromscratch.org/blfs/view/svn/general/llvm.html

So first I download the LLVM tarball:

https://github.com/llvm/llvm-project/releases/download/llvmorg-14.0.0/llvm-14.0.0.src.tar.xz

However, when extracting this, I end up with two directories:

cmake/
llvm-14.0.0.src/

The second one is expected and is how the LLVM team packaged things before 14.0.0.

The cmake/ directory at the top level, though, is confusing. Does it really have to be where it is? I'd assume it would be part of llvm-14.0.0.src/ instead. Normally everything that is extracted goes into a single directory, but now I end up with two.
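One way to see what a tarball will unpack to, before extracting it, is to list its top-level entries. Here is a minimal sketch in Python, assuming the tarball has already been downloaded locally. The function name `top_level_entries` is hypothetical and only for illustration.

```python
import tarfile
from pathlib import PurePosixPath


def top_level_entries(tarball_path):
    """Return the sorted top-level names found inside a tar archive."""
    names = set()
    # The default mode auto-detects compression (.tar.xz, .tar.gz, ...).
    with tarfile.open(tarball_path) as archive:
        for member in archive:
            # Member names use "/" separators; drop a leading "/" if present.
            parts = [p for p in PurePosixPath(member.name).parts if p != "/"]
            if parts:
                names.add(parts[0])
    return sorted(names)
```

Calling `top_level_entries("llvm-14.0.0.src.tar.xz")` on the tarball described above should report both cmake and llvm-14.0.0.src, going by the extraction result quoted earlier.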

Anyway, I thought I didn't need this, so I removed it and repackaged only llvm-14.0.0.src/ (I keep all source archives on my home system so that I can compile on other, older computers without an internet connection).

The compiler-rt and clang tarballs have the same issue.

I tried to continue with the BLFS instructions but ran into errors, so perhaps I do need that cmake/ directory. But as I was not sure, and the BLFS instructions did not mention it, I stopped for now. So this issue is mostly a request for the LLVM team to confirm that we really need a separate cmake/ directory now. Not even the KDE team requires this, by the way, and they use a LOT of CMake too. So perhaps this is a mistake?

Could someone have a look and determine whether a separate cmake/ directory really has to be part of the official tarballs? It's different from how LLVM packaged things in the past, so perhaps this is not what was intended. My later compile problem may not have anything to do with this, but I just want to make sure. If someone could investigate, that would be nice; please close this issue whenever you'd like (or, if a fix/change is made, then perhaps when the next minor release of the 14.0.x series is out).

Last but not least, though it is not that important: perhaps compiler-rt and clang could be part of the main LLVM distribution? This is just for convenience. I can download the other tarballs just fine as-is, but I'd love to have a single tarball and go from there (e.g. choosing whether clang or compiler-rt is needed via CMake configure options or something like that). For me a single tarball, even if larger, would be a LOT more convenient, but as said, this is an aside.

Thank you for reading this lengthy issue request.

DimitryAndric commented 2 years ago

@tstellar you might have some clues? Maybe this is an artifact of splitting up the llvm-project directory into separate llvm, clang, etc tarballs?

efriedma-quic commented 2 years ago

You can get all of llvm-project (llvm/clang/compiler-rt/libc++/etc.) as a single tarball: https://github.com/llvm/llvm-project/releases/download/llvmorg-14.0.0/llvm-project-14.0.0.src.tar.xz . This is basically what you'd get if you just "git clone llvm-project", and reflects the directory structure developers and buildbots actually use for most builds.

The separate llvm/clang/compiler-rt/etc. tarballs each contain the subset of the tree necessary to perform a standalone build of that subproject. That used to be just the one directory, but now there's a separate "cmake" directory at the root for common CMake utilities that are shared across LLVM. The directory structure looks the way it does because that's where the build system expects to find the utilities.

Looking at https://www.linuxfromscratch.org/blfs/view/svn/general/llvm.html , we don't recommend copying the clang source code into llvm/tools/clang, or the compiler-rt source code into llvm/projects/compiler-rt. Since we switched to the monorepo a few years ago, the standard directory structure puts them into adjacent directories.

The simplest way to build llvm+clang+compiler-rt in a single CMake invocation is to download the llvm-project tarball, then pass "-DLLVM_ENABLE_PROJECTS=clang" and "-DLLVM_ENABLE_RUNTIMES=compiler-rt" to the CMake invocation. See https://llvm.org/docs/GettingStarted.html .
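As a rough illustration of that single invocation, here is a minimal Python sketch that assembles the configure command. It assumes the llvm-project tarball has been extracted and that cmake (3.13 or newer, for `-S`/`-B`) is on the PATH. The two `-DLLVM_ENABLE_...` options are the ones named above. The function names, the directory names in the usage note, and the Release build type are assumptions made for the example, not project recommendations.

```python
import subprocess
from pathlib import Path


def cmake_configure_command(source_root, build_dir):
    """Build the argument list for configuring llvm + clang + compiler-rt."""
    return [
        "cmake",
        # The top-level CMakeLists.txt for a monorepo build lives in llvm/.
        "-S", str(Path(source_root) / "llvm"),
        "-B", str(build_dir),
        "-DCMAKE_BUILD_TYPE=Release",  # assumed choice, not from the thread
        "-DLLVM_ENABLE_PROJECTS=clang",
        "-DLLVM_ENABLE_RUNTIMES=compiler-rt",
    ]


def configure(source_root, build_dir):
    """Run the configure step; raises CalledProcessError if cmake fails."""
    subprocess.run(cmake_configure_command(source_root, build_dir), check=True)
```

For example, `configure("llvm-project-14.0.0.src", "build")` would configure into a build/ directory, assuming the tarball unpacked into a directory of that name; the build itself can then be started with `cmake --build build`.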

It's also possible to use separate CMake invocations for each of llvm/clang/compiler-rt. See https://compiler-rt.llvm.org/ for directions for compiler-rt; I don't think we document it for clang, but something similar should work. The separate tarballs can be extracted anywhere for this usage.

So I think this is all working as intended. But maybe we should have a documentation page describing this...

efriedma-quic commented 2 years ago

Also, maybe we should consider restructuring the tarballs? Having multiple directories at the root of the tar file is sort of non-standard/confusing.