llvm / llvm-project


[flang][driver] impossible to combine `-shared` with linking to `Fortran_main` (flang 18.1) #92496

Open h-vetinari opened 4 months ago

h-vetinari commented 4 months ago

https://github.com/llvm/llvm-project/commit/9d6837d595719904720e5ff68ec1f1a2665bdc2f removed the ability to link to `Fortran_main` by default when building `-shared` libraries. However, building such libraries is how SciPy works under the hood: shared libraries are built from C/C++/Fortran and then accessed from Python.

I managed to build SciPy with flang 17 (a bit of a [long story](https://labs.quansight.org/blog/building-scipy-with-flang)) and was then trying to update to flang 18. Aside from issues like #92459, I cannot figure out how to combine `-shared` with actually linking to `Fortran_main`. Not that I care particularly about the philosophical aspects of that, but very practically, I get missing symbols without it:

[1376/1475] Linking target scipy/linalg/_interpolative.cp310-win_amd64.pyd
FAILED: scipy/linalg/_interpolative.cp310-win_amd64.pyd 
"flang-new"  -Wl,/OUT:[...objects...] "-Wl,/nologo" "-Wl,/OPT:REF" "-Wl,/DLL" "-Wl,/IMPLIB:scipy\linalg\_interpolative.cp310-win_amd64.lib" "--target=x86_64-pc-windows-msvc" "-fms-runtime-lib=dll" "-fuse-ld=lld" "-Wl,-defaultlib:%PREFIX%/lib/clang/18/lib/windows/clang_rt.builtins-x86_64.lib" "-D_CRT_SECURE_NO_WARNINGS" "-D_MT" "-D_DLL" "--target=x86_64-pc-windows-msvc" "scipy/lib_fortranobject.a" "-Wl,--version-script=C:/bld/scipy-split_1715896809937/work/scipy/_build_utils/link-version-pyinit.map" "%PREFIX%/lib/lapack.lib" "%PREFIX%/lib/blas.lib" "%PREFIX%\libs\python310.lib" "-lFortranRuntime" "-lFortranDecimal"
lld-link: warning: ignoring unknown argument '--version-script=C:/bld/scipy-split_1715896809937/work/scipy/_build_utils/link-version-pyinit.map'
lld-link: error: undefined symbol: _QQmain

Surprisingly, even with a `"-lFortran_main"` at the end of that invocation, I get the same missing symbols. I've tried a couple of variations but couldn't get any of them to work. The `-fno-fortran-main` flag mentioned in that PR doesn't [seem](https://releases.llvm.org/18.1.0/tools/flang/docs/FlangCommandLineReference.html#cmdoption-flang-fno-fortran-main) to have a positive counterpart that does enforce linkage to `Fortran_main`.

As an aside, this is not just about SciPy: I'm also trying to teach meson how to handle llvm-flang, and this is one of the things I need to figure out. Finally, a lot of the changes from the commit mentioned at the top have been undone in https://github.com/llvm/llvm-project/commit/8d5386669ed63548daf1bee415596582d6d78d7d, but that came after flang 18. It needs to be possible to do this with released versions (or, in the extreme case that it isn't, I need to error out in meson and tell people to upgrade their compiler).
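The "error out and tell people to upgrade" fallback could look roughly like the following. This is only a sketch in Python (the language meson is written in), not meson's actual API: the function names and the error type are hypothetical, and the `version X.Y.Z` output format is an assumption rather than something shown in this thread. The set of affected releases (major version 18) is taken only from what is reported in this issue.

```python
import re

# Assumption: the compiler's --version output contains "version X.Y.Z",
# e.g. "flang-new version 18.1.5". This format is not taken from the thread.
_VERSION_RE = re.compile(r"version\s+(\d+)\.(\d+)(?:\.(\d+))?")


class UnsupportedCompilerError(RuntimeError):
    """Hypothetical error type for a compiler release known not to work."""


def parse_flang_version(version_output):
    """Return (major, minor, patch) parsed from --version text, or None."""
    match = _VERSION_RE.search(version_output)
    if match is None:
        return None
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0.
    return (int(major), int(minor), int(patch or 0))


def require_usable_for_shared(version):
    """Raise if this release is reported here as unable to link -shared builds."""
    major = version[0]
    # Per this issue: flang 17 worked, 18.1.x fails, builds from main work again.
    if major == 18:
        raise UnsupportedCompilerError(
            "flang %d.%d.%d cannot link this shared library; "
            "please use a different flang release" % version
        )
```

A build system would call `parse_flang_version` on the captured `--version` text once during compiler detection, then call `require_usable_for_shared` only for targets that are linked with `-shared`.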

CC @mjklemm @DavidTruby

h-vetinari commented 4 months ago

@mjklemm, could you let me know your thoughts about the situation here? I'm happy to test exploratory patches.

mjklemm commented 4 months ago

Hm, I do not see why the shared-object file should contain a program unit. I do not see the missing symbol being added when I build a shared-object file from a simple Fortran file.

Does any of the translation units that make up the shared-object file contain a program unit (by accident)?

h-vetinari commented 4 months ago

> Does any of the translation units that make up the shared-object file contain a program unit (by accident)?

Probably! But I'd rather not have to audit the project I'm building (and have been able to build previously) to figure out, through many build layers, where this goes wrong. I mean, it would be good to do that, but with more time. Right now I'm stuck, unable to build already-published versions with flang 18.
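A first pass at such an audit does not need to go through the build layers: it is enough to look at symbol listings of the object files that go into the failing link. The sketch below is in Python and works on text only. It assumes the listings come from an `nm`-style tool such as `llvm-nm`, whose lines end in a type letter followed by the symbol name, with `U` marking an undefined symbol. How the listings are collected is left out, and the function names and object names are hypothetical.

```python
def classify_symbol(nm_output, symbol="_QQmain"):
    """Return "defined", "undefined", or "absent" for `symbol` in nm-style text."""
    state = "absent"
    for line in nm_output.splitlines():
        parts = line.split()
        # Expect "<value> <type> <name>" or, for undefined symbols, "<type> <name>".
        if len(parts) < 2 or parts[-1] != symbol:
            continue
        if parts[-2] == "U":
            # Only a reference; a definition elsewhere in the listing takes priority.
            if state == "absent":
                state = "undefined"
        else:
            state = "defined"
    return state


def audit(listings, symbol="_QQmain"):
    """Map each object name to its relation to `symbol`, skipping unrelated objects."""
    report = {}
    for name, text in listings.items():
        state = classify_symbol(text, symbol)
        if state != "absent":
            report[name] = state
    return report
```

Objects reported as `"undefined"` are the ones pulling `_QQmain` into the link, while any reported as `"defined"` contain a main program.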

h-vetinari commented 3 months ago

So I tried to build flang from main just to test whether the current situation is any better, but I'm blocked there as well: #95698 😑

h-vetinari commented 3 months ago

Since SciPy isn't buildable with flang 18 anymore due to this (also being encountered by other communities), I think this should be considered a regression; unsurprisingly, I would really like to see this fixed for flang 19. I'm willing to test things (despite building from main here being a big operation that takes ~48h), but currently blocked on https://github.com/llvm/llvm-project/issues/95698. Would appreciate help/inputs.

DavidTruby commented 3 months ago

This particular issue should be solved by #89938, as `Fortran_main` doesn't exist anymore and `main` is emitted directly when needed. This patch came after the 18 release.

I would be opposed to backporting that change to the 18 branch; I think the backport could be quite messy and it's a significant change in behaviour (as well as accepted flags) to force back into an existing release.

h-vetinari commented 3 months ago

Thanks for the response! I had already found the related change and pointed it out in the OP:

> Finally, a lot of the changes from the commit mentioned at the top have been undone in https://github.com/llvm/llvm-project/commit/8d5386669ed63548daf1bee415596582d6d78d7d, but that came after flang 18.

I agree that backporting this is not going to happen (also, the 18.x branch already had its last release), which is why I wanted to test the state of the main branch but found myself blocked by https://github.com/llvm/llvm-project/issues/95698. You were already tagged in that issue - I'd be very grateful if you could have a look there, and once there's a fix (or a PR) I'd be happy to retest.

h-vetinari commented 3 months ago

Happy to report that with flang built from main (more precisely: edf5782f1780f480c3ae3fc0a44bf5432f9aa48b), it's possible again to build SciPy. Pity that all patch releases of 18.1.x are unusable, but at least 19.1 should be in better shape!

If you want, this issue can thus be closed.

PS. @kiranchandramohan had mentioned on discourse:

> Ideally if we can set up a CI that always tests main development branch with scipy on Windows then we can guard against regressions.

I'd be happy to help with that, though it'll need some currently-unreleased fixes to meson still (on that note: it would really be high time to finally get the flang-new rename over with...): https://github.com/mesonbuild/meson/pull/13323