root-project / root

The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
https://root.cern

Compilation error on redhat 8.3 / no internet #8292

Closed georgtroska closed 3 years ago

georgtroska commented 3 years ago

Hi please check here: https://root-forum.cern.ch/t/6-24-00-does-not-complile-on-redhat-8-3/45161

details:

Hello, I’m running into problems compiling root 6.24.00:

$ cmake -Dclad=OFF -DCMAKE_INSTALL_PREFIX=…/root_install …/root-6.24.00
$ cmake --build . -- install -j8

…
[ 76%] Linking CXX static library …/…/…/…/lib/libclingInterpreter.a
[ 76%] Built target clingInterpreter
Scanning dependencies of target CLING
[ 76%] Built target CLING
Scanning dependencies of target LLVMRES
[ 76%] Copying LLVM resource and header files
[ 76%] Built target LLVMRES
(hangs here…)

$ cmake --build . --install
[ 0%] Built target AFTERIMAGE
[ 0%] Built target OPENUI5
[ 0%] Built target LZMA
[ 0%] Performing download step (download, verify and extract) for 'VDT'
(hangs here)

It seems that VDT needs network access, which I do not have (this is not mentioned in the docs). I think I do not need it anyhow…

So again:

$ rm -rf *
$ cmake -Dclad=OFF -Dvdt=OFF -DCMAKE_INSTALL_PREFIX=…/root_install …/root-6.24.00
$ cmake --build . --install
…
[ 79%] Generating G__Thread.cxx, …/…/lib/Thread.pcm
[ 79%] Generating G__forward_listDict.cxx, …/…/lib/libforward_listDict.rootmap
[ 79%] Generating G__vectorDict.cxx, …/…/lib/libvectorDict.rootmap
In file included from input_line_7:21:
/srv/ussapc/home/ussapc/sw/root_build/include/ROOT/TReentrantRWLock.hxx:26:10: fatal error: 'tbb/enumerable_thread_specific.h' file not found
#include "tbb/enumerable_thread_specific.h"
         ^~~~~~~~~~
Error: /srv/ussapc/home/ussapc/sw/root_build/core/rootcling_stage1/src/rootcling_stage1: compilation failure (/srv/ussapc/home/ussapc/sw/root_build/lib/libThreaddb2bde6cdd_dictUmbrella.h)
gmake[2]: *** [core/thread/CMakeFiles/G__Thread.dir/build.make:109: core/thread/G__Thread.cxx] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:27339: core/thread/CMakeFiles/G__Thread.dir/all] Error 2
gmake[1]: *** Waiting for unfinished jobs…

I found out that tbb is required by imt, so again:

$ rm -rf *
$ cmake -Dclad=OFF -Dvdt=OFF -Dimt=OFF -DCMAKE_INSTALL_PREFIX=…/root_install …/root-6.24.00
$ cmake --build . -- install -j 8
…
[100%] Building CXX object roofit/roostats/CMakeFiles/RooStats.dir/src/ToyMCSampler.cxx.o
[100%] Building CXX object roofit/roostats/CMakeFiles/RooStats.dir/src/ToyMCStudy.cxx.o
[100%] Building CXX object roofit/roostats/CMakeFiles/RooStats.dir/src/UniformProposal.cxx.o
[100%] Building CXX object roofit/roostats/CMakeFiles/RooStats.dir/src/UpperLimitMCSModule.cxx.o
[100%] Linking CXX shared library …/…/lib/libRooStats.so
[100%] Built target RooStats
(hangs here)

$ cmake --build . -- install
[ 0%] Built target OPENUI5
[ 0%] Performing download step (download, verify and extract) for 'XROOTD'

Unbelievable…

$ rm -rf *
$ cmake -Dclad=OFF -Dvdt=OFF -Dimt=OFF -Dxrootd=OFF -DCMAKE_INSTALL_PREFIX=…/root_install …/root-6.24.00
$ cmake --build . -- install -j 8

This runs. Please fix this in the next release. I would recommend implementing a "localonly" option for the case where the installation PC has no internet access.

Georg

ROOT Version: 6.24.00
Platform: RedHat 8.3
Compiler: gcc 8.3.1-5
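
The report above ends with four options switched off. As a minimal sketch, the configure command can be assembled in one place and printed for review before it is run. The helper name and the two path arguments are hypothetical, and the option list is only the one found by trial and error in this report, not an official list of network-dependent options.

```shell
#!/bin/sh
# Sketch: print the configure command for a machine without network access.
# offline_configure_cmd is a hypothetical helper; the four -D<option>=OFF
# switches are the ones reported above. Paths containing spaces would need
# extra quoting before the printed line is pasted into a shell.

offline_configure_cmd() (
  src_dir=$1
  install_dir=$2
  cmd="cmake"
  for opt in clad vdt imt xrootd; do
    cmd="$cmd -D${opt}=OFF"
  done
  printf '%s\n' "$cmd -DCMAKE_INSTALL_PREFIX=${install_dir} ${src_dir}"
)
```

Calling `offline_configure_cmd <source-dir> <install-dir>` only prints the command line; nothing is executed. Whether `imt` really has to be disabled depends on whether a system tbb is installed.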

pamputt commented 3 years ago

I also got a server with RedHat 8.3 that is not connected to the Internet. I also went through all the steps described above. So I also strongly support the creation of a "localonly" option in CMake

pamputt commented 3 years ago

@georgtroska please note that you do not need to do -Dimt=OFF if tbb is already installed on your system.

Axel-Naumann commented 3 years ago

I wonder why CMake doesn't notice "no network access"; this sounds like a good feature for it to have...

I'd propose we do not offer builtins that require network access if we detect that no network is present, along the lines of https://stackoverflow.com/questions/62214621/how-to-check-for-internet-connection-with-cmake-automatically-prevent-fails-if

If you then do -Dfail-on-missing=On -Dimt and builtin-tbb gets turned off, and no system tbb is found, you'd get a nice error message. I find this more helpful than adding another config option.

Is that an acceptable approach?

pamputt commented 3 years ago

Sounds good to me.

georgtroska commented 3 years ago

Hi, thank you very much for spending time on this!

There is one thing I do not understand:

If you can't take those packages out of the default installation, and if CMake can't detect that there is no valid connection, and if you do not want to add an extra option, then please write a chapter in the docs explaining what to do when compiling on a system without internet access.

Thanks Georg

bellenot commented 3 years ago

There is an ugly workaround to check for the network:

# Note: -n (count) and -w (timeout in ms) are the Windows ping options
execute_process(
  COMMAND ping www.github.com -n 2 -w 1000
  RESULT_VARIABLE NO_CONNECTION
)

And then use NO_CONNECTION, for example:

if(builtin_tbb)
  if(NO_CONNECTION EQUAL 1)
    if(fail-on-missing)
      message(FATAL_ERROR "No internet connection. Please check your connection, or either disable the 'builtin_tbb' option or the 'fail-on-missing' to automatically disable options requiring internet access")
    else()
      message(STATUS "No internet connection, disabling 'builtin_tbb' option")
      set(builtin_tbb OFF CACHE BOOL "Disabled because there is no internet connection" FORCE)
      set(imt OFF CACHE BOOL "Disabled because 'builtin_tbb' was set but there is no internet connection" FORCE)
    endif()
  else()
    ...

I quickly tried it and the principle works, but I'll need time to make it work properly, and it will (again) make the CMake code more complex.

georgtroska commented 3 years ago

Hi, it is easy to get ping to work! But that does not necessarily mean that wget works.

Georg

bellenot commented 3 years ago

@georgtroska you're right, but the issue says "Compilation error on redhat 8.3 / no internet" so I'm trying to solve the issue with "no internet". Am I missing something?

georgtroska commented 3 years ago

Hello Bertrand @bellenot, I just checked my case: indeed, ping is not successful. That means your suggestion would work for me.

I can imagine, though, that one could set up a firewall in such a way that ping is allowed but a wget or a git clone is declined. (Ping uses ICMP; wget and git clone usually use TCP.)

Maybe I am being overcautious here, but I would recommend using "wget" as the test instead of "ping".

Georg
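
To illustrate the ICMP versus TCP point, here is a minimal sketch of a probe that opens a TCP connection instead of pinging. `tcp_reachable` is a hypothetical helper; it assumes that bash (for its `/dev/tcp` redirection) and the coreutils `timeout` command are available. A successful TCP connect still does not guarantee that an HTTPS download will work, so an actual download remains the more faithful test.

```shell
#!/bin/sh
# Sketch: probe a TCP port instead of sending ICMP echo requests.
# tcp_reachable is a hypothetical helper. It delegates to bash because the
# /dev/tcp/<host>/<port> redirection is a bash feature, and it uses the
# coreutils `timeout` command to bound the wait.

tcp_reachable() (
  host=$1
  port=$2
  seconds=${3:-5}
  # A refused, filtered, or timed-out connection makes bash exit non-zero.
  timeout "$seconds" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
)
```

A call such as `tcp_reachable github.com 443 && echo "port 443 reachable"` would then distinguish "ping blocked but TCP open" from "nothing gets through".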

bellenot commented 3 years ago

Thanks @georgtroska, then I'll try to come up with a cross-platform solution (there is no wget on Windows)

bellenot commented 3 years ago

@georgtroska can you check https://github.com/root-project/root/pull/8520? Thanks

georgtroska commented 3 years ago

It seems not :-(

I downloaded https://github.com/bellenot/root/tree/check-internet-connection but I didn't expect this:

It hangs after git:

~/sw/root-bld-test$ cmake -DCMAKE_INSALL_PREFIX=../root-inst-test ../root-check-internet-connection
-- The C compiler identification is GNU 8.3.1
-- The CXX compiler identification is GNU 8.3.1
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.27.0")
^C

I retried the same command with the release (6.24.00); there it works.

I'm sorry, there is still something wrong in the build system

Georg

bellenot commented 3 years ago

Yeah, sorry, I overlooked the timeout parameter (100 seconds instead of 100 milliseconds). I'll commit the necessary changes (5 seconds) in a few minutes

georgtroska commented 3 years ago

Hi, OK, I went through your commits:

It seems that the default timeout for wget is 900 s!

Because of this,

$ wget https://root.cern.ch/files/dummy.txt
--2021-06-24 13:01:05-- https://root.cern.ch/files/dummy.txt
Resolving root.cern.ch (root.cern.ch)... 137.138.18.236, 2001:1458:201:ee::100:6
Connecting to root.cern.ch (root.cern.ch)|137.138.18.236|:443... connected.

takes a bloody long time.

I would recommend:

$ wget --timeout=10 https://root.cern.ch/files/dummy.txt
--2021-06-24 13:04:20-- https://root.cern.ch/files/dummy.txt
Resolving root.cern.ch (root.cern.ch)... 137.138.18.236, 2001:1458:201:ee::100:6
Connecting to root.cern.ch (root.cern.ch)|137.138.18.236|:443... connected.
Unable to establish SSL connection.

An echo like "-- Checking for internet connection" would be beneficial too.

Georg

bellenot commented 3 years ago

Done. please check the last commit

georgtroska commented 3 years ago

Hi, still not working. It detects a working connection:

$ cmake -DCMAKE_INSALL_PREFIX=../root-inst-test ../root-check-internet-connection
-- The C compiler identification is GNU 8.3.1
-- The CXX compiler identification is GNU 8.3.1
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.27.0")
-- Checking internet connectivity...
-- Yes

I'm trying this manually:

$ wget --timeout=10 https://root.cern.ch/files/dummy.txt
--2021-06-24 13:20:37-- https://root.cern.ch/files/dummy.txt
Resolving root.cern.ch (root.cern.ch)... 137.138.18.236, 2001:1458:201:ee::100:6
Connecting to root.cern.ch (root.cern.ch)|137.138.18.236|:443... connected.
Unable to establish SSL connection.
ussapc@warlv0010/~$ echo $?
4

The wget man page says:

EXIT STATUS
Wget may return one of several error codes if it encounters problems.

   0   No problems occurred.
   1   Generic error code.
   2   Parse error---for instance, when parsing command-line options, the .wgetrc or .netrc...
   3   File I/O error.
   4   Network failure.
   5   SSL verification failure.
   6   Username/password authentication failure.
   7   Protocol errors.
   8   Server issued an error response.

I would recommend STATUS != 0 instead of STATUS=6

As you can see above, name resolution is working, but the connection is blocked by the firewall.

Georg
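
The "STATUS != 0" rule suggested above can be sketched as follows. The function names are hypothetical; the descriptions are taken from the man page excerpt quoted in this comment. This is only an illustration of the decision rule, not the code used in the pull request.

```shell
#!/bin/sh
# Sketch: treat every non-zero wget exit status as "no usable connection".
# Both function names are hypothetical; the texts follow the man page excerpt.

describe_wget_status() {
  case "$1" in
    0) echo "No problems occurred" ;;
    1) echo "Generic error code" ;;
    2) echo "Parse error" ;;
    3) echo "File I/O error" ;;
    4) echo "Network failure" ;;
    5) echo "SSL verification failure" ;;
    6) echo "Username/password authentication failure" ;;
    7) echo "Protocol errors" ;;
    8) echo "Server issued an error response" ;;
    *) echo "Unknown status $1" ;;
  esac
}

# Succeeds only for status 0, so statuses 4 and 5 both count as "offline".
connection_usable() {
  [ "$1" -eq 0 ]
}
```

In practice the status would come from a real attempt, for instance running `wget --timeout=10 -q -O /dev/null <url>` and then passing `$?` to `connection_usable`.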

bellenot commented 3 years ago

OK, then I will change the code. Note that I have to physically download a file on the client to have a return code of 0, and then delete it. That's why I wanted to do it differently, but anyway, if it's needed then let's do it

bellenot commented 3 years ago

Done. Can you try again? (sorry but since I connect on the machine with ssh, I can't unplug it...)

bellenot commented 3 years ago

(btw CMake doesn't necessarily use wget, so the return codes don't necessarily match)

georgtroska commented 3 years ago

Hello Bertrand @bellenot,

Much better, but now

$ cmake --build . -- -j8

seems to hang at RIO???

[...]
[ 81%] Built target setDict
[ 81%] Built target unordered_setDict
[ 81%] Built target multisetDict
[ 81%] Built target unordered_multisetDict
[ 81%] Built target root.exe
[ 81%] Built target G__RIO
[ 82%] Built target RIO

Georg

bellenot commented 3 years ago

Thanks @georgtroska , I'll check

Axel-Naumann commented 3 years ago

No, it built target RIO and hangs after that. Two options: check with ps -feH to see what's happening "below" cmake, or press Ctrl+C and paste what cmake shows as canceled build steps.

bellenot commented 3 years ago

So at least clad is failing for me, but it doesn't hang after RIO. Can you try to disable it? (I didn't check the LLVM/Clang/Cling external projects...)

georgtroska commented 3 years ago

Hi, @bellenot, Hi @Axel-Naumann, sorry for the late reply,

So I ran cmake in one terminal and, while it hung, ran ps -feH in another.

I think these are the interesting lines:

[...]
ussapc 2103516 2102893 0 07:29 pts/0 00:00:00 cmake --build . --
ussapc 2103517 2103516 0 07:29 pts/0 00:00:00 /usr/bin/gmake
ussapc 2103520 2103517 0 07:29 pts/0 00:00:00 /usr/bin/gmake -f CMakeFiles/Makefile2 all
ussapc 2104639 2103520 0 07:29 pts/0 00:00:00 /usr/bin/gmake -f interpreter/cling/tools/plugins/clad/CMakeFiles/clad.dir/build.make interpreter/cling/tools/plugins/clad/
ussapc 2104641 2104639 0 07:29 pts/0 00:00:00 /usr/bin/cmake -P /srv/ussapc/home/ussapc/sw/root-bld-test/interpreter/cling/tools/plugins/clad/clad-prefix/src/clad-stam
ussapc 2104642 2104641 0 07:29 pts/0 00:00:00 /usr/bin/cmake -P /srv/ussapc/home/ussapc/sw/root-bld-test/interpreter/cling/tools/plugins/clad/clad-prefix/tmp/clad-gi
ussapc 2104644 2104642 0 07:29 pts/0 00:00:00 /usr/bin/git clone --origin origin https://github.com/vgvassilev/clad.git clad
ussapc 2104645 2104644 0 07:29 pts/0 00:00:00 /usr/libexec/git-core/git-remote-https origin https://github.com/vgvassilev/clad.git
[...]

It seems the automatic downloads are hidden in a lot of places :-)

Georg
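
Picking the download processes out of a long `ps -feH` listing can be sketched as a small filter. The function name and the list of program names to look for are assumptions chosen for illustration; only `git clone` and `git-remote-https` actually appear in the listing above.

```shell
#!/bin/sh
# Sketch: keep only the lines of a process listing that look like downloads.
# find_download_processes is a hypothetical helper reading from stdin.
# The bracket expressions ([ ], [s], [t], [l]) match the same text as the
# plain words but keep the grep process itself out of the result.

find_download_processes() {
  grep -E 'git[ ]clone|git-remote-http[s]|wge[t]|cur[l]'
}
```

Used as `ps -feH | find_download_processes`, it prints the matching lines and exits non-zero when nothing in the listing looks like a download.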

bellenot commented 3 years ago

Hi @georgtroska , did you try with the latest version (I disabled clad)? You might have to delete the interpreter/cling/tools/plugins/clad directory or start from scratch

georgtroska commented 3 years ago

no I used the same version as before.

I need to redownload the zip and start from scratch (can't do git fetch cause there is no internet :-) )

Georg

bellenot commented 3 years ago

@georgtroska note that deleting the clad directory and disabling the clad option should do it. But it would be nice to see if my PR really allows to build ROOT without internet connection ;-)

georgtroska commented 3 years ago

Hi, @bellenot, Hi @Axel-Naumann,

still not running, I downloaded the latest version from that branch and started from scratch, but it still tries to download clad

Georg

bellenot commented 3 years ago

@georgtroska OK, thanks. I'll try again and let you know

bellenot commented 3 years ago

@georgtroska So disabling clad was not enough, one should not add the clad subdirectory. It should be fixed now (at least it works on Windows). Please let me know, so I can merge the PR. Thanks

georgtroska commented 3 years ago

Hi @bellenot, Hi @Axel-Naumann,

:-(

[ 77%] Building CXX object core/rootcling_stage1/CMakeFiles/rootcling_stage1.dir/src/rootcling_stage1.cxx.o
[ 77%] Linking CXX executable src/rootcling_stage1
[ 77%] Built target MetaCling
Scanning dependencies of target Cling
[ 77%] Linking CXX shared library ../../../lib/libCling.so
/usr/bin/ld: cannot find -lcladPlugin
/usr/bin/ld: cannot find -lcladDifferentiator
collect2: error: ld returned 1 exit status
gmake[2]: *** [core/metacling/src/CMakeFiles/Cling.dir/build.make:193: lib/libCling.so] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:26939: core/metacling/src/CMakeFiles/Cling.dir/all] Error 2
gmake[1]: *** Waiting for unfinished jobs....
[ 77%] Built target rootcling_stage1
gmake: *** [Makefile:152: all] Error 2

please switch off clad on linker level, too

Georg

bellenot commented 3 years ago

I'll check, but that's weird, it works just fine on Windows...

bellenot commented 3 years ago

@georgtroska did you build from scratch?

georgtroska commented 3 years ago

@bellenot yes, rm -rf on build-dir

Georg

bellenot commented 3 years ago

@georgtroska OK, thanks

georgtroska commented 3 years ago

@bellenot I just had a brief look at the code. I think it's because of:

[file: /core/metacling/src/CMakeLists.txt]
[...]
# We need to paste the content of the cling plugins disabling link symbol optimizations.
set(CLING_PLUGIN_LINK_LIBS)
if (clad)
  if (APPLE)
    set(CLING_PLUGIN_LINK_LIBS -Wl,-force_load cladPlugin -Wl,-force_load cladDifferentiator)
  elseif(MSVC)
    set(CLING_PLUGIN_LINK_LIBS cladPlugin cladDifferentiator)
    set(CLAD_LIBS "-WHOLEARCHIVE:cladPlugin.lib -WHOLEARCHIVE:cladDifferentiator.lib")
  else()
    set(CLING_PLUGIN_LINK_LIBS -Wl,--whole-archive cladPlugin cladDifferentiator -Wl,--no-whole-archive)
  endif()
endif()
[..]

Why don't you switch off clad completely when there is no internet? (I think at the moment you are only checking if (clad && NO_INTERNET).)

CORRECTION: I checked your code... you did already.

I have never worked with CMake before. If clad is off, it shouldn't set these libs. Might it be a matter of the scope of the variable clad? What I mean is: is the variable clad seen here the same one that was turned off because of no internet?

Georg
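
One way to approach the scope question from the outside is to look at what the build directory's cache has recorded for `clad`. The sketch below reads `CMakeCache.txt`, whose entries have the form `NAME:TYPE=VALUE`; the helper name is hypothetical. It only shows the cache entry: a normal variable of the same name set in some directory scope would shadow it there, so printing the variable from the CMakeLists.txt in question is a useful complement.

```shell
#!/bin/sh
# Sketch: print the cached value of a CMake option from CMakeCache.txt.
# cache_value is a hypothetical helper; the second argument defaults to the
# CMakeCache.txt in the current (build) directory.

cache_value() (
  name=$1
  cache_file=${2:-CMakeCache.txt}
  # Cache entries look like NAME:TYPE=VALUE, for example clad:BOOL=OFF.
  sed -n "s/^${name}:[A-Z]*=//p" "$cache_file"
)
```

Running `cache_value clad` in the build directory prints the cached value, or nothing if there is no such entry. CMake can also list the cache itself with `cmake -N -LA .`.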

bellenot commented 3 years ago

I'm investigating, but since I work on Windows and ssh on Linux, it's difficult to disconnect it from internet ;-)

georgtroska commented 3 years ago

iptables....?

georgtroska commented 3 years ago

or simply delete the testfile on the server :-)

bellenot commented 3 years ago

well, I just set the flag in CMakeLists.txt

bellenot commented 3 years ago

@georgtroska I cannot reproduce the problem (i.e. it compiles fine on CentOS and Windows). Could you try to add message(STATUS "clad = ${clad}") in core/metacling/src/CMakeLists.txt, at line 93, as shown below:

# We need to paste the content of the cling plugins disabling link symbol optimizations.
set(CLING_PLUGIN_LINK_LIBS)
message(STATUS "clad = ${clad}")
if (clad)

and type cmake . in the build directory? You should see something like:

-- [...]
-- And then fallback to: 'c++'
-- clad = OFF
-- Performing Test found_stdstringview
-- [...]

georgtroska commented 3 years ago

@bellenot yes:

-- Cling version (from VERSION file): ROOT_1.0~dev
-- Cling will look for C++ headers in '/usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8:/usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux:/usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward' at runtime.
-- And then fallback to: 'c++'
-- clad=OFF
-- Performing Test found_stdstringview
-- Performing Test found_stdstringview - Failed
-- Performing Test found_stdexpstringview

bellenot commented 3 years ago

@georgtroska Good, but then I don't understand why you got the linker error... I'll try to investigate more

bellenot commented 3 years ago

@georgtroska I updated the PR with a slightly different test for clad, and added a protection for the linker (even if I don't understand why it still tries to link with clad=OFF...)

georgtroska commented 3 years ago

@bellenot: I started the recompilation..we will know it in ~35min :-)

georgtroska commented 3 years ago

@bellenot: Another one...

[ 64%] Built target G__Netx
[ 64%] Building CXX object net/netx/CMakeFiles/Netx.dir/src/TXNetFile.cxx.o
/srv/ussapc/home/ussapc/sw/root-check-internet-connection/net/netx/src/TXNetFile.cxx:58:10: fatal error: XrdClient/XrdClient.hh: No such file or directory
#include <XrdClient/XrdClient.hh>
         ^~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
gmake[2]: *** [net/netx/CMakeFiles/Netx.dir/build.make:63: net/netx/CMakeFiles/Netx.dir/src/TXNetFile.cxx.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:33171: net/netx/CMakeFiles/Netx.dir/all] Error 2
gmake: *** [Makefile:152: all] Error 2

bellenot commented 3 years ago

@georgtroska thanks I'll check

bellenot commented 3 years ago

@georgtroska can you give me the result of cmake .? I.e. this line:

-- Enabled support for:  asimage builtin_afterimage builtin_clang builtin_cling builtin_ftgl builtin_gl2ps builtin_llvm builtin_lz4 builtin_lzma builtin_nlohmannjson builtin_openui5 builtin_pcre builtin_xxhash builtin_zstd exceptions http opengl pyroot rpath runtime_cxxmodules shared x11

georgtroska commented 3 years ago

@bellenot:

System          Linux-4.18.0-240.1.1.el8_3.x86_64
Processor       8 core Intel(R) Xeon(R) Platinum 8168 CPU @ 2.70GHz (x86_64)
Build type      Release
Install path    /usr/local
Compiler        GNU 8.3.1
Compiler flags:
C               -Wno-implicit-fallthrough -pipe -Wall -W -pthread -O2 -DNDEBUG
C++             -std=c++14 -Wno-implicit-fallthrough -Wno-noexcept-type -pipe -Wshadow -Wall -W -Woverloaded-virtual -fsigned-char -pthread -O2 -DNDEBUG
Linker flags:
Executable      -rdynamic
Module
Shared          -Wl,--no-undefined -Wl,--hash-style="both"

-- Enabled support for: asimage builtin_afterimage builtin_clang builtin_cling builtin_llvm builtin_lz4 builtin_lzma builtin_nlohmannjson builtin_openui5 builtin_pcre builtin_xxhash builtin_zstd dataframe exceptions gdml http mlp roofit webgui root7 runtime_cxxmodules shared ssl tmva spectrum x11 xrootd

Here you are. I'll be on leave until Tuesday, have a nice weekend!
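
The summary above still lists xrootd as enabled, although the build failed on an XRootD header. Checking a single feature in such output can be sketched as below. The helper name is hypothetical, and the sketch assumes the summary is printed on one line starting with "-- Enabled support for:", as in the output quoted in this thread. It only reports what the configuration step believes is enabled.

```shell
#!/bin/sh
# Sketch: check whether a feature appears in the "Enabled support for:" line.
# feature_enabled is a hypothetical helper; it reads configure output on stdin.

feature_enabled() (
  feature=$1
  # Keep only the summary line, put one word per line, then look for an
  # exact whole-word match (so "builtin" does not match "builtin_lzma").
  grep -- '-- Enabled support for:' | tr -s ' ' '\n' | grep -qxF -- "$feature"
)
```

For example, `cmake . | feature_enabled xrootd && echo "xrootd is still enabled"` run in the build directory would flag the situation seen here.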

bellenot commented 3 years ago

@georgtroska Thanks! Enjoy your long weekend!