root-project / root

The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
https://root.cern

Hang with XRootD from eospublic on Debian Unstable #12231

Closed hahnjo closed 1 year ago

hahnjo commented 1 year ago

Starting from the debian:sid Docker image, create the following environment:

apt update && apt dist-upgrade
apt install cmake gcc g++ git libxrootd-client-dev ninja-build python3

Then clone root.git and configure + build with

cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -Dx11=OFF ../root/
ninja

Afterwards, try executing ./bin/root.exe tutorials/dataframe/df103_NanoAODHiggsAnalysis.C. It will hang, and setting XRD_LOGLEVEL=Debug reveals:

[2023-02-06 12:00:28.136048 +0000][Debug  ][XRootDTransport   ] [eospublic.cern.ch:1094.0] Sending authentication data
[2023-02-06 12:00:28.137346 +0000][Debug  ][XRootDTransport   ] [eospublic.cern.ch:1094.0] Trying to authenticate using krb5
[2023-02-06 12:00:28.137406 +0000][Debug  ][XRootDTransport   ] [eospublic.cern.ch:1094.0] Cannot get credentials for protocol krb5: Seckrb5: No or invalid credentials; No credentials cache found (p=xrootd/eospublic.cern.ch@CERN.CH).
[2023-02-06 12:00:28.137968 +0000][Debug  ][XRootDTransport   ] [eospublic.cern.ch:1094.0] Trying to authenticate using gsi
[2023-02-06 12:00:32.761097 +0000][Debug  ][XRootDTransport   ] [eospublic.cern.ch:1094.0] Cannot get credentials for protocol gsi: Secgsi: ErrParseBuffer: unknown CA: cannot verify server certificate: kXGS_init

Instead, installing the xrootd-client package and running

xrdcp root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod_skimmed/SMHiggsToZZTo4L.root .

works just fine: the Debug log shows that it proceeds with "Trying to authenticate using unix" (after "Cannot get credentials for protocol gsi" was also reported almost immediately).
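To compare which authentication protocols each client attempts, the relevant lines can be filtered out of a Debug log. The following is a minimal sketch, assuming the log lines have the format shown in the excerpt above and that the XRootD client writes its log to stderr; the function name auth_attempts is hypothetical.

```shell
# Hypothetical helper: print, in order, the authentication protocols that
# the XRootD client tried, based on a Debug log read from stdin.
auth_attempts() {
    sed -n 's/.*Trying to authenticate using \([A-Za-z0-9_]*\).*/\1/p'
}

# Example usage (not run here; assumes the log is written to stderr):
#   XRD_LOGLEVEL=Debug xrdcp root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod_skimmed/SMHiggsToZZTo4L.root . 2>&1 | auth_attempts
```

Applied to the log excerpt from root.exe, this would list krb5 and gsi only, which makes the missing fallback easy to spot.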

Axel-Naumann commented 1 year ago

@amadio do you think you could help us here with the right questions to ask?

amadio commented 1 year ago

What version of OpenSSL is being used? Is it the builtin one? Have you installed the CERN SSL certificates? (Note "ErrParseBuffer: unknown CA: cannot verify server certificate" in the error message.) Do you have a user grid certificate set up? Is Kerberos actually installed? Do you have an active Kerberos ticket while running the test?

hahnjo commented 1 year ago

No, this is fully standard Debian Unstable without anything CERN specific. And as far as I can tell, the problem is not that krb5 and gsi fail, but that XRootD should gracefully continue with other authentication methods - or none at all, it's eospublic after all.

amadio commented 1 year ago

To understand why it's not trying, the build configuration is important. For instance, the unix authentication plugin won't be built if XRDCL_LIB_ONLY is true, which the builtin XRootD might be setting internally. I don't think this is a bug in XRootD, just a misconfiguration on the ROOT side. When I try from my machine, it goes via Kerberos authentication when I have a ticket.

hahnjo commented 1 year ago

The build configuration is the one from Debian; as I documented in the summary, I'm just installing the libxrootd-client-dev package. And as also mentioned, xrdcp works perfectly fine falling back to unix.

hahnjo commented 1 year ago

Okay, this is mostly a configuration problem on my side: I didn't install the libssl-dev package, so ROOT's configuration defaulted to builtin_openssl because ssl is ON but it couldn't find the OpenSSL headers. On Debian Unstable and Testing, this is a serious problem because it means we effectively end up with OpenSSL 1.1.1g (from the builtin, linked statically) and OpenSSL 3.0.8 (from the system, linked as a shared library) in one process. We are rather lucky that it doesn't blow up harder...
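A quick check before running cmake can catch the missing headers. This is a minimal sketch, assuming the headers live at openssl/ssl.h under /usr/include (as installed by libssl-dev on Debian); the function name have_openssl_headers is hypothetical and not part of ROOT.

```shell
# Hypothetical pre-configure check: succeed if the OpenSSL development
# headers exist under the given include prefix (default: /usr/include).
have_openssl_headers() {
    prefix=${1:-/usr/include}
    [ -f "$prefix/openssl/ssl.h" ]
}

# Example usage (not run here):
#   have_openssl_headers || echo "install libssl-dev before running cmake" >&2
```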

@bellenot do you think we should add a check to detect this configuration (xrootd AND NOT builtin_xrootd AND builtin_openssl) and emit a hard error? The tricky part is that this can end up being the automatic choice, as I witnessed...
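The condition can also be checked after the fact on an existing build directory. The sketch below is only a diagnostic, not the check proposed for ROOT's build system; it assumes the options are recorded in CMakeCache.txt as entries of the form name:BOOL=ON, and the function names cache_option and check_openssl_mix are hypothetical.

```shell
# Read a boolean option from a CMakeCache.txt file; prints OFF if the
# option is absent. Assumes entries of the form "name:BOOL=value".
cache_option() {
    cache_file=$1
    name=$2
    value=$(sed -n "s/^${name}:BOOL=//p" "$cache_file" | head -n 1)
    echo "${value:-OFF}"
}

# Return non-zero when a system XRootD is combined with the builtin OpenSSL,
# i.e. xrootd AND NOT builtin_xrootd AND builtin_openssl.
check_openssl_mix() {
    cache_file=$1
    if [ "$(cache_option "$cache_file" xrootd)" = ON ] &&
       [ "$(cache_option "$cache_file" builtin_xrootd)" != ON ] &&
       [ "$(cache_option "$cache_file" builtin_openssl)" = ON ]; then
        echo "error: system XRootD combined with builtin_openssl" >&2
        return 1
    fi
    return 0
}

# Example usage (not run here):
#   check_openssl_mix /path/to/build/CMakeCache.txt
```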

bellenot commented 1 year ago

> @bellenot do you think we should add a check to detect this configuration (xrootd AND NOT builtin_xrootd AND builtin_openssl) and emit a hard error? The tricky part is that this can end up being the automatic choice, as I witnessed...

As you wish, feel free to create a PR for this

hahnjo commented 1 year ago

> As you wish, feel free to create a PR for this

The question is, how do we handle the default case where the build system tries to enable ssl and xrootd and then doesn't find OpenSSL, so it turns on builtin_openssl? We could silently disable either of ssl or xrootd, or force builtin_xrootd (which builds against builtin_openssl)...

bellenot commented 1 year ago

OK, then I'll try to find a solution...

amadio commented 1 year ago

Looks like my first guess was correct (builtin OpenSSL was being used). This does not look like a problem with XRootD, but with ROOT and the way it's handling builtins. Can this ticket be closed?

hahnjo commented 1 year ago

> Looks like my first guess was correct (builtin OpenSSL was being used).

Well ok, I didn't understand your comment to say that using both builtin_openssl and an XRootD linked against a system OpenSSL is a problem...

> This does not look like a problem with XRootD, but with ROOT and the way it's handling builtins. Can this ticket be closed?

No, it's not a problem with XRootD, but I do think we should prevent users from shooting themselves in the foot with a "broken" ROOT configuration, see above.

bellenot commented 1 year ago

@hahnjo there is a PR for this issue. Can you maybe try it and let me know if that fixes the issue?

hahnjo commented 1 year ago

> Can you maybe try it and let me know if that fixes the issue?

Using the setup described above, I can confirm the PR detects the situation and switches off xrootd :+1:

bellenot commented 1 year ago

Thanks @hahnjo!

github-actions[bot] commented 1 year ago

Hi @bellenot,

It appears this issue is closed, but wasn't yet added to a project. Please add upcoming versions that will include the fix, or 'not applicable' otherwise.

Sincerely, :robot:
