root-project / root

The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
https://root.cern

xrootd file open on the grid sometimes fail with status code 139 #6948

Closed rdschaffer closed 3 years ago

rdschaffer commented 3 years ago

Hi there,

Running ROOT-based analysis jobs that read data in ATLAS, we are having trouble understanding why some jobs fail on certain sites at file open when reading remote files with xrootd. We are using ROOT version 6.18/04. (I don't think that we have problems with 6.16/00, and a few tests indicate that 6.20/06 also has this problem.)

What we see is that for a file open:

    std::unique_ptr< TFile > ifile( TFile::Open( file.c_str(), "READ" ) );

on a grid site node, the job exits with status code 139, which I believe is SIGURG - Urgent condition on socket (4.2BSD). The status code from TApplication::HandleException is 128 + root enum, and 11 is kSigUrgent. See: https://root.cern.ch/doc/master/TApplication_8cxx_source.html#l00602 https://root.cern.ch/doc/master/TSysEvtHandler_8h_source.html#l00107
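For cross-checking the status code, here is a minimal sketch of the common shell convention in which a process terminated by signal n is reported as exit status 128 + n. This is independent of ROOT's own enum numbering. Under that convention 139 decodes to signal 11, which is SIGSEGV on Linux; which of the two readings applies to these jobs is not established here. The helper names and the upper bound of 64 signals are assumptions made for illustration.

```cpp
#include <signal.h>
#include <string.h>
#include <string>

// Assumed upper bound on signal numbers (typical for Linux); illustrative only.
constexpr int kMaxSignal = 64;

// Returns n if status has the shell-style form 128 + n, otherwise -1.
int signal_from_exit_status(int status) {
    if (status > 128 && status <= 128 + kMaxSignal) {
        return status - 128;
    }
    return -1;
}

// Builds a human-readable description using the C library's strsignal().
std::string describe_exit_status(int status) {
    const int sig = signal_from_exit_status(status);
    if (sig < 0) {
        return "exit status " + std::to_string(status) + " (not of the form 128 + n)";
    }
    const char* name = strsignal(sig);
    return "signal " + std::to_string(sig) + ": " + (name ? name : "unknown");
}
```

Calling `describe_exit_status(139)` in a small test program prints the operating system's description of signal 11, which can then be compared with the ROOT enum reading above.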

Running the same program interactively on the same file works fine. And it seems that only some sites with remote reading are failing. So we would like to ask for help in trying to track this down.

Currently, there is no stack trace to help understand things, and a simple 'print' just after TFile::Open is not printed.

I tried to add:

    gApplication->ExitOnException( TApplication::kDontExit );

thinking that TApplication::HandleException (https://root.cern.ch/doc/master/TApplication_8cxx_source.html#l00602) might throw an exception, but this does not work.

So suggestions would be welcome. Is there a way to get a stack trace or more information on what is going on in the I/O part of this file open?
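One generic way to get at least a raw stack trace, sketched below, is to install a signal handler that dumps the call stack with glibc's `backtrace()` before the process exits. This is not ROOT's own mechanism: the function names are hypothetical, it assumes a glibc-based Linux node, and installing it after ROOT has set up its own handlers would replace ROOT's handler for that signal.

```cpp
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

// Signal handler: write the raw call stack to stderr, then exit with the
// shell-style status 128 + signal number. Only async-signal-safe calls
// (backtrace_symbols_fd, _exit) are used after the frames are collected.
extern "C" void print_backtrace_and_exit(int sig) {
    void* frames[64];
    const int count = backtrace(frames, 64);
    backtrace_symbols_fd(frames, count, STDERR_FILENO);
    _exit(128 + sig);
}

// Installs the handler for one signal; returns true on success.
// SA_RESETHAND restores the default action once the handler has run.
bool install_backtrace_handler(int sig) {
    struct sigaction action{};
    action.sa_handler = print_backtrace_and_exit;
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_RESETHAND;
    return sigaction(sig, &action, nullptr) == 0;
}
```

A job could call `install_backtrace_handler(SIGSEGV)` just before the `TFile::Open` call. The printed frames are addresses and mangled symbols, so linking with `-rdynamic` or post-processing with a symbolizer helps readability.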

I don't know how to add in watchers for people in ATLAS, or a mailing list. But I did find @krasznaa.

Thanks much, RD

krasznaa commented 3 years ago

Unfortunately this is the sort of issue that could have been easier to track/discuss on JIRA. But since ROOT doesn't use that anymore, here we go...

My suspicion is that the grid nodes in question put some locally installed XRootD version high up in the library search path of the jobs. I don't know how they would do that, but that's my educated guess.

ATLAS analysis releases using ROOT 6.18/04 (https://gitlab.cern.ch/atlas/atlasexternals/-/blob/1.0.65/External/ROOT/CMakeLists.txt) use XRootD 4.10.0 (https://gitlab.cern.ch/atlas/atlasexternals/-/blob/1.0.65/External/XRootD/CMakeLists.txt), while releases using ROOT 6.16/00 (https://gitlab.cern.ch/atlas/atlasexternals/-/blob/1.0.60/External/ROOT/CMakeLists.txt) used XRootD 4.8.4 (https://gitlab.cern.ch/atlas/atlasexternals/-/blob/1.0.60/External/XRootD/CMakeLists.txt). My educated guess is that the XRootD version force-fed into your jobs @rdschaffer is binary compatible with XRootD 4.8.4, but not with 4.10.0 (or newer).
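To test this kind of hypothesis from inside the job, one option is to ask the dynamic loader which shared objects are actually mapped into the process. The sketch below uses glibc's `dl_iterate_phdr()`; the helper names are hypothetical, and it assumes a glibc-based Linux node.

```cpp
#include <link.h>
#include <cstddef>
#include <string>
#include <vector>

namespace {

// State passed through dl_iterate_phdr's opaque data pointer.
struct MatchState {
    std::string needle;
    std::vector<std::string> hits;
};

// Callback invoked once per loaded object; records paths containing needle.
int collect_matching(dl_phdr_info* info, std::size_t /*size*/, void* data) {
    auto* state = static_cast<MatchState*>(data);
    const std::string path = info->dlpi_name ? info->dlpi_name : "";
    if (!path.empty() && path.find(state->needle) != std::string::npos) {
        state->hits.push_back(path);
    }
    return 0;  // zero means: keep iterating
}

}  // namespace

// Returns the paths of all currently loaded shared objects whose path
// contains the given substring.
std::vector<std::string> loaded_libraries_matching(const std::string& needle) {
    MatchState state{needle, {}};
    dl_iterate_phdr(collect_matching, &state);
    return state.hits;
}
```

Printing the result of `loaded_libraries_matching("libXrd")` would show whether the XRootD libraries come from the release area or from somewhere else on the node. ROOT may load its XRootD plugin only when a remote file is first opened, so the list may be empty until then.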

However we definitely need some follow up from our grid experts on this. @rodwalker would it be possible to look at the problematic jobs / grid nodes for this?

Cheers, Attila

rodwalker commented 3 years ago

Hi, the whole environment is dumped in, e.g.,

https://bigpanda.cern.ch//media/filebrowser/5e40cf5d-179e-4126-ad56-e0bb0173cbd5/panda/tarball_PandaJob_4911855304_CERN/payload.stdout

Does that give any clue?

LD_LIBRARY_PATH=/srv/workDir/usr/HZZAnalRun2Code/1.0.0/InstallArea/x86_64-centos7-gcc8-opt/lib:/cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib:/cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBase/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib:/cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib64:/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/lib:/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/lib64:/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/binutils/2.30-e5b21/x86_64-centos7/lib:/.singularity.d/libs

Cheers, Rod.

krasznaa commented 3 years ago

Hi Rod,

What does

LD_PRELOAD=/srv/workDir/96340ef3-75b1-46cf-8910-8a2f76b7068c/$LIB/wrapper.so

do? That would be my first suspect. Since $LD_LIBRARY_PATH lists our software directories in the correct order, based on just that XRootD should be found under:

[bash][thor]:~ > ls -l /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrd*
lrwxrwxrwx 1 cvmfs cvmfs      19 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdAppUtils.so -> libXrdAppUtils.so.1
lrwxrwxrwx 1 cvmfs cvmfs      23 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdAppUtils.so.1 -> libXrdAppUtils.so.1.0.0
-rwxr-xr-x 1 cvmfs cvmfs   74512 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdAppUtils.so.1.0.0
-rwxr-xr-x 1 cvmfs cvmfs   18432 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdBlacklistDecision-4.so
-rwxr-xr-x 1 cvmfs cvmfs   82136 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdBwm-4.so
-rwxr-xr-x 1 cvmfs cvmfs   13552 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdCksCalczcrc32-4.so
lrwxrwxrwx 1 cvmfs cvmfs      17 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdClient.so -> libXrdClient.so.2
lrwxrwxrwx 1 cvmfs cvmfs      21 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdClient.so.2 -> libXrdClient.so.2.0.0
-rwxr-xr-x 1 cvmfs cvmfs  663320 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdClient.so.2.0.0
-rwxr-xr-x 1 cvmfs cvmfs   42096 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdClProxyPlugin-4.so
lrwxrwxrwx 1 cvmfs cvmfs      13 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdCl.so -> libXrdCl.so.2
lrwxrwxrwx 1 cvmfs cvmfs      17 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdCl.so.2 -> libXrdCl.so.2.0.0
-rwxr-xr-x 1 cvmfs cvmfs 1416944 Sep 10 03:20 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdCl.so.2.0.0
lrwxrwxrwx 1 cvmfs cvmfs      21 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdCryptoLite.so -> libXrdCryptoLite.so.1
lrwxrwxrwx 1 cvmfs cvmfs      25 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdCryptoLite.so.1 -> libXrdCryptoLite.so.1.0.0
-rwxr-xr-x 1 cvmfs cvmfs   13632 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdCryptoLite.so.1.0.0
lrwxrwxrwx 1 cvmfs cvmfs      17 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdCrypto.so -> libXrdCrypto.so.1
lrwxrwxrwx 1 cvmfs cvmfs      21 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdCrypto.so.1 -> libXrdCrypto.so.1.0.0
-rwxr-xr-x 1 cvmfs cvmfs  129112 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdCrypto.so.1.0.0
-rwxr-xr-x 1 cvmfs cvmfs  222064 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdCryptossl-4.so
lrwxrwxrwx 1 cvmfs cvmfs      14 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdFfs.so -> libXrdFfs.so.2
lrwxrwxrwx 1 cvmfs cvmfs      18 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdFfs.so.2 -> libXrdFfs.so.2.0.0
-rwxr-xr-x 1 cvmfs cvmfs   65152 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdFfs.so.2.0.0
-rwxr-xr-x 1 cvmfs cvmfs  271416 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdFileCache-4.so
-rwxr-xr-x 1 cvmfs cvmfs   13104 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdHttp-4.so
-rwxr-xr-x 1 cvmfs cvmfs  115880 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdHttpTPC-4.so
lrwxrwxrwx 1 cvmfs cvmfs      20 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdHttpUtils.so -> libXrdHttpUtils.so.1
lrwxrwxrwx 1 cvmfs cvmfs      24 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdHttpUtils.so.1 -> libXrdHttpUtils.so.1.0.0
-rwxr-xr-x 1 cvmfs cvmfs  206640 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdHttpUtils.so.1.0.0
-rwxr-xr-x 1 cvmfs cvmfs   18824 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdN2No2p-4.so
-rwxr-xr-x 1 cvmfs cvmfs   13304 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdOssSIgpfsT-4.so
lrwxrwxrwx 1 cvmfs cvmfs      23 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdPosixPreload.so -> libXrdPosixPreload.so.1
lrwxrwxrwx 1 cvmfs cvmfs      27 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdPosixPreload.so.1 -> libXrdPosixPreload.so.1.0.0
-rwxr-xr-x 1 cvmfs cvmfs   87568 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdPosixPreload.so.1.0.0
lrwxrwxrwx 1 cvmfs cvmfs      16 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdPosix.so -> libXrdPosix.so.2
lrwxrwxrwx 1 cvmfs cvmfs      20 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdPosix.so.2 -> libXrdPosix.so.2.0.0
-rwxr-xr-x 1 cvmfs cvmfs  195944 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdPosix.so.2.0.0
-rwxr-xr-x 1 cvmfs cvmfs 1001552 Sep 10 03:26 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdProofd.so
-rwxr-xr-x 1 cvmfs cvmfs   83216 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdPss-4.so
-rwxr-xr-x 1 cvmfs cvmfs   70544 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSec-4.so
-rwxr-xr-x 1 cvmfs cvmfs  220600 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSecgsi-4.so
-rwxr-xr-x 1 cvmfs cvmfs   19480 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSecgsiAUTHZVO-4.so
-rwxr-xr-x 1 cvmfs cvmfs   23808 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSecgsiGMAPDN-4.so
-rwxr-xr-x 1 cvmfs cvmfs   53384 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSeckrb5-4.so
-rwxr-xr-x 1 cvmfs cvmfs   25152 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSecProt-4.so
-rwxr-xr-x 1 cvmfs cvmfs  142864 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSecpwd-4.so
-rwxr-xr-x 1 cvmfs cvmfs   45192 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSecsss-4.so
-rwxr-xr-x 1 cvmfs cvmfs   19320 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSecunix-4.so
lrwxrwxrwx 1 cvmfs cvmfs      17 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdServer.so -> libXrdServer.so.2
lrwxrwxrwx 1 cvmfs cvmfs      21 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdServer.so.2 -> libXrdServer.so.2.0.0
-rwxr-xr-x 1 cvmfs cvmfs 1040472 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdServer.so.2.0.0
-rwxr-xr-x 1 cvmfs cvmfs  134808 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSsi-4.so
lrwxrwxrwx 1 cvmfs cvmfs      17 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSsiLib.so -> libXrdSsiLib.so.1
lrwxrwxrwx 1 cvmfs cvmfs      21 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSsiLib.so.1 -> libXrdSsiLib.so.1.0.0
-rwxr-xr-x 1 cvmfs cvmfs  161352 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSsiLib.so.1.0.0
-rwxr-xr-x 1 cvmfs cvmfs   18544 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSsiLog-4.so
lrwxrwxrwx 1 cvmfs cvmfs      19 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSsiShMap.so -> libXrdSsiShMap.so.1
lrwxrwxrwx 1 cvmfs cvmfs      23 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSsiShMap.so.1 -> libXrdSsiShMap.so.1.0.0
-rwxr-xr-x 1 cvmfs cvmfs   39624 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdSsiShMap.so.1.0.0
-rwxr-xr-x 1 cvmfs cvmfs   76664 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdThrottle-4.so
lrwxrwxrwx 1 cvmfs cvmfs      16 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdUtils.so -> libXrdUtils.so.2
lrwxrwxrwx 1 cvmfs cvmfs      20 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdUtils.so.2 -> libXrdUtils.so.2.0.0
-rwxr-xr-x 1 cvmfs cvmfs  763032 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdUtils.so.2.0.0
lrwxrwxrwx 1 cvmfs cvmfs      14 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdXml.so -> libXrdXml.so.2
lrwxrwxrwx 1 cvmfs cvmfs      18 Sep 10 13:12 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdXml.so.2 -> libXrdXml.so.2.0.0
-rwxr-xr-x 1 cvmfs cvmfs  122928 Sep 10 03:19 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdXml.so.2.0.0
-rwxr-xr-x 1 cvmfs cvmfs   13104 Sep 10 03:21 /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdXrootd-4.so
[bash][thor]:~ >

Do you know what that preload is (supposed to be) doing exactly?

Cheers, Attila

rodwalker commented 3 years ago

Hi, it is overloading some network-related commands to provide a record of what users are accessing remotely. It creates https://bigpanda.cern.ch//media/filebrowser/5e40cf5d-179e-4126-ad56-e0bb0173cbd5/panda/tarball_PandaJob_4911855304_CERN/pandatracerlog.txt

2020-12-04 18:55:07.949713 : INFO connect: ::2001:1458:301:62:0:0:1094 cmd: runH4lAnalRun2

where IPv6 always rings alarm bells with me. This would depend on the node, site, and RSE.
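For readers unfamiliar with this kind of preload library, here is a minimal sketch of how an `LD_PRELOAD` wrapper can intercept `connect()` and log the peer address before forwarding to the real implementation. It is not the actual wrapper.so used in these jobs: the function `format_peer`, the log format, and the use of stderr are all assumptions made for illustration.

```cpp
#include <arpa/inet.h>
#include <dlfcn.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cstdio>
#include <string>

// Formats an IPv4 or IPv6 socket address as "host:port" (hypothetical format).
std::string format_peer(const sockaddr* addr) {
    char text[INET6_ADDRSTRLEN] = {0};
    if (addr == nullptr) return "(null)";
    if (addr->sa_family == AF_INET) {
        const auto* in4 = reinterpret_cast<const sockaddr_in*>(addr);
        inet_ntop(AF_INET, &in4->sin_addr, text, sizeof text);
        return std::string(text) + ":" + std::to_string(ntohs(in4->sin_port));
    }
    if (addr->sa_family == AF_INET6) {
        const auto* in6 = reinterpret_cast<const sockaddr_in6*>(addr);
        inet_ntop(AF_INET6, &in6->sin6_addr, text, sizeof text);
        return "[" + std::string(text) + "]:" + std::to_string(ntohs(in6->sin6_port));
    }
    return "(non-IP socket)";
}

// Interposed connect(): when this library is preloaded, the dynamic linker
// resolves calls to connect() here first. dlsym(RTLD_NEXT, ...) then looks up
// the next definition in search order, normally the C library's.
extern "C" int connect(int fd, const sockaddr* addr, socklen_t len) {
    using connect_fn = int (*)(int, const sockaddr*, socklen_t);
    static connect_fn real_connect =
        reinterpret_cast<connect_fn>(dlsym(RTLD_NEXT, "connect"));
    std::fprintf(stderr, "INFO connect: %s\n", format_peer(addr).c_str());
    if (real_connect == nullptr) return -1;  // real symbol not found
    return real_connect(fd, addr, len);
}
```

Built as a shared library (for example with `g++ -shared -fPIC`, adding `-ldl` on older glibc versions) and listed in `LD_PRELOAD`, such a wrapper sits between every network client in the process, including XRootD, and the C library, which is why a bug or an unexpected address family in the wrapper could plausibly affect file opens.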

Cheers,

Rod.


Do you know what that preload is (supposed to be) doing exactly?

Cheers, Attila

-- Tel. +49 89 289 14152

rdschaffer commented 3 years ago

Hey @Axel-Naumann,

Have you found a moment to have a look at this?

    see you, RD

Axel-Naumann commented 3 years ago

@simonmichal would you have a recommendation what to look at?

simonmichal commented 3 years ago

Well, I doubt there is any out-of-band data being sent or received. @rodwalker, @rdschaffer, would it be possible to reproduce the problem with xrootd client logs switched on (XRD_LOGLEVEL=Dump)?

Regarding ABI compatibility: we ensure ABI forward compatibility, meaning that it is safe to run an application built against an older version of xrootd with a newer version of the library (e.g. one can build an application with, say, 4.11.0 and then link with 4.12.0). The opposite is not possible. Of course this applies only to releases within the 4.x.x series; the ABI was broken when we moved to XRootD5.
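To make the rule concrete, here is a minimal sketch of the compatibility check as described above. The `Version` struct and both function names are hypothetical helpers for illustration, not part of the XRootD API; the rule encoded is only what is stated in this comment (same major series, runtime library not older than the build-time one).

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper type, not an XRootD class.
struct Version {
    int major = 0;
    int minor = 0;
    int patch = 0;
};

// Parse "4.11.0" or "v4.11.0". Returns false on malformed input.
bool parse_version(const std::string& text, Version& out) {
    const char* s = text.c_str();
    if (*s == 'v') ++s;  // tolerate a leading 'v'
    char trailing = 0;
    // Exactly three numbers must match; a fourth conversion means trailing junk.
    return std::sscanf(s, "%d.%d.%d%c", &out.major, &out.minor, &out.patch,
                       &trailing) == 3;
}

// True if an application built against `built` may run with library `runtime`,
// under the rule stated above: same major series, runtime not older than built.
bool runtime_is_compatible(const Version& built, const Version& runtime) {
    if (built.major != runtime.major) return false;  // ABI broken across majors
    if (built.minor != runtime.minor) return runtime.minor > built.minor;
    return runtime.patch >= built.patch;
}
```

With this, building against 4.11.0 and running with 4.12.0 passes, while the reverse, or any 4.x/5.x mix, is rejected.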

rdschaffer commented 3 years ago

OK, I ran with XRD_LOGLEVEL=Dump, and you can see the response after

=== stderr ===

saying:

Unable to process directory /alrb/.xrootd/client.plugins.d: [ERROR] OS Error: No such file or directory

Log file:

xrootd_error_on_grid.pdf

The file:

root://marsedpm.in2p3.fr:1094//dpm/in2p3.fr/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/ab/DAOD_HIGG2D1.23315577._000001.pool.root.1

of course opens correctly with a simple TFile::Open in any interactive ROOT session.

          see you, RD

rdschaffer commented 3 years ago

The above is running in Marseilles: IN2P3-CPPM.

Another for reading from eos from the CERN-T0 facility:

xrootd_error_on_grid_CERN_T0.pdf

simonmichal commented 3 years ago

Is it possible to determine the exact version of xrootd client that is being used? Unfortunately, the crash happens before the client logs in so I cannot see it from logs. The server reported protocol version 500 in the xrootd handshake.

rdschaffer commented 3 years ago

Does this help:

2020-12-16 12:22:18,612 | INFO | Thread-1 | gfal2 | connect | [gfal_module_load] plugin /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix1/usr/lib64/gfal2-plugins//libgfal_plugin_xrootd.so loaded with success

rdschaffer commented 3 years ago

Marseilles job logs are in:

marseilles

and Cern jobs logs are in: Cern

rodwalker commented 3 years ago

Hi, I submitted a job with compiled C++ that just opens the Marseille file (code at bottom):

https://bigpanda.cern.ch/job?pandaid=4923453571

It has the same release, and it works! I am not sure if anything else is different, but it points at the specific code rather than a pure TFile open problem.

Cheers, Rod.

$ cat main.C

#include <iostream>
#include "TFile.h"

using namespace std;

int main() {
    TFile* davixFile = TFile::Open("root://eosatlas.cern.ch:1094//eos/atlas/atlasdatadisk/rucio/mc16_13TeV/25/31/DAOD_HIGG2D1.23315648._000001.pool.root.1", "READ");
    cout << "coucou 5" << endl;
    davixFile->ls();
    davixFile->Close();

    return 0;
}

On Wed, 16 Dec 2020 at 15:50, rdschaffer notifications@github.com wrote:

Marseilles job logs are in:

marseilles https://bigpanda.cern.ch/filebrowser/?guid=00354dec-89f9-4687-bc9e-d0151ddff358&lfn=panda.um.group.phys-higgs.user.schaffer.mc16_13TeV.500995.H4lMinitree_nominal.0.16e..201216_01.log.23578674.000051.log.tgz&site=IN2P3-CPPM/SCORE&scope=panda&fileid=23156311480

and Cern jobs logs are in: Cern https://bigpanda.cern.ch/filebrowser/?guid=52428b18-b810-4194-be8a-fb11e92bc4f8&lfn=panda.um.group.phys-higgs.user.schaffer.mc16_13TeV.500995.H4lMinitree_nominal.0.16e..201216_01.log.23578674.000050.log.tgz&site=CERN-T0/SCORE&scope=panda&fileid=23156311459

krasznaa commented 3 years ago

Hi Rod,

😕 So, how did you compile that code exactly? Just g++ main.cxx, right?

In that case XRootD would be picked up from /usr, which doesn't tell us much about our problem, since RD's test job will pick up XRootD from:

/cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/

This is why I said at the beginning that I'm suspicious about the LD_PRELOAD setting. If that library wants to use XRootD, but it was compiled against a different version of XRootD than what the analysis release comes with, then we're in trouble. Note that all ATLAS releases come with their own version of XRootD, not just the analysis releases. So any grid node setup that wants to force one particular version of XRootD on the job will give us a really bad time...
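To see what is actually being preloaded on a worker node, the job could print the LD_PRELOAD entries at startup. A minimal sketch, assuming a Linux/glibc environment where LD_PRELOAD entries are separated by colons or spaces; the function names are hypothetical:

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Split an LD_PRELOAD value into its entries (separated by ':' or ' ').
std::vector<std::string> split_preload(const std::string& value) {
    std::vector<std::string> entries;
    std::string current;
    for (char c : value) {
        if (c == ':' || c == ' ') {
            if (!current.empty()) entries.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    if (!current.empty()) entries.push_back(current);
    return entries;
}

// Entries preloaded into the current process environment (empty if unset).
std::vector<std::string> current_preloads() {
    const char* value = std::getenv("LD_PRELOAD");
    return split_preload(value ? value : "");
}
```

Running `ldd` on each returned path would then show whether any preloaded library itself depends on an XRootD library.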

Best, Attila

rodwalker commented 3 years ago

Right, I forgot to mention that I don't know what I'm doing with C. A colleague gave me g++ $(root-config --cflags --libs) -o main main.C, which I'm running after the asetup of the same release. Does ldd answer the question?

$ ldd main.mars | grep -i root
    libROOTVecOps.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libROOTVecOps.so (0x00007f8b07f3d000)
    libROOTDataFrame.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libROOTDataFrame.so (0x00007f8b062af000)
    libROOTNTuple.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libROOTNTuple.so (0x00007f8b04041000)

I don't think the LD_PRELOAD has any xroot stuff - it is at a lower level to get network ops (hton or whatever). The code is from

wget http://pandaserver.cern.ch:25085/trf/user/runGen-00-00-02
chmod u+x runGen-00-00-02
./runGen-00-00-02
less pandawnutil/tracer/wrapper.c

Cheers, Rod.

$ ldd main.mars
    linux-vdso.so.1 => (0x00007fff34109000)
    libCore.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libCore.so (0x00007f09e6284000)
    libImt.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libImt.so (0x00007f09e6077000)
    libRIO.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libRIO.so (0x00007f09e5adb000)
    libNet.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libNet.so (0x00007f09e57fc000)
    libHist.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libHist.so (0x00007f09e5210000)
    libGraf.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libGraf.so (0x00007f09e4e22000)
    libGraf3d.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libGraf3d.so (0x00007f09e4b71000)
    libGpad.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libGpad.so (0x00007f09e488a000)
    libROOTVecOps.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libROOTVecOps.so (0x00007f09e45b2000)
    libTree.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libTree.so (0x00007f09e4233000)
    libTreePlayer.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libTreePlayer.so (0x00007f09e3eae000)
    libRint.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libRint.so (0x00007f09e3c85000)
    libPostscript.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libPostscript.so (0x00007f09e3a0d000)
    libMatrix.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libMatrix.so (0x00007f09e3695000)
    libPhysics.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libPhysics.so (0x00007f09e3448000)
    libMathCore.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libMathCore.so (0x00007f09e3037000)
    libThread.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libThread.so (0x00007f09e2de4000)
    libMultiProc.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libMultiProc.so (0x00007f09e2bd7000)
    libROOTDataFrame.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libROOTDataFrame.so (0x00007f09e2924000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007f09e2720000)
    libstdc++.so.6 => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/lib64/libstdc++.so.6 (0x00007f09e2397000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f09e2095000)
    libgcc_s.so.1 => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/lib64/libgcc_s.so.1 (0x00007f09e1e7d000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f09e1c61000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f09e1893000)
    libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f09e1631000)
    libz.so.1 => /lib64/libz.so.1 (0x00007f09e141b000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f09e6946000)
    libtbb.so.2 => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libtbb.so.2 (0x00007f09e11db000)
    libssl.so.10 => /lib64/libssl.so.10 (0x00007f09e0f69000)
    libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007f09e0b06000)
    libvdt.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libvdt.so (0x00007f09e08fe000)
    libROOTNTuple.so => /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libROOTNTuple.so (0x00007f09e06b6000)
    librt.so.1 => /lib64/librt.so.1 (0x00007f09e04ae000)
    libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007f09e0261000)
    libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007f09dff78000)
    libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007f09dfd74000)
    libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007f09dfb41000)
    libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007f09df931000)
    libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007f09df72d000)
    libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f09df513000)
    libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f09df2ec000)

krasznaa commented 3 years ago

Hmm... That in principle looks fine... So okay, your test job is relevant.

Unfortunately I'm running out of ideas. The XRootD build in AnalysisBaseExternals does depend on a couple of libraries from the OS. But these should only be things that are part of HEP_OSlibs. So the worker nodes should not really have different versions of them...

[bash][lxplus730]:~ > ldd -r /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrd*.so | grep " /lib" | sed "s/\(.*\) (0x.*)/\1/g" | sort | uniq 
    libc.so.6 => /lib64/libc.so.6
    libcom_err.so.2 => /lib64/libcom_err.so.2
    libcrypt.so.1 => /lib64/libcrypt.so.1
    libcrypto.so.10 => /lib64/libcrypto.so.10
    libcurl.so.4 => /lib64/libcurl.so.4
    libdl.so.2 => /lib64/libdl.so.2
    libfreebl3.so => /lib64/libfreebl3.so
    libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2
    libidn.so.11 => /lib64/libidn.so.11
    libk5crypto.so.3 => /lib64/libk5crypto.so.3
    libkeyutils.so.1 => /lib64/libkeyutils.so.1
    libkrb5.so.3 => /lib64/libkrb5.so.3
    libkrb5support.so.0 => /lib64/libkrb5support.so.0
    liblber-2.4.so.2 => /lib64/liblber-2.4.so.2
    libldap-2.4.so.2 => /lib64/libldap-2.4.so.2
    libm.so.6 => /lib64/libm.so.6
    libnspr4.so => /lib64/libnspr4.so
    libnss3.so => /lib64/libnss3.so
    libnssutil3.so => /lib64/libnssutil3.so
    libpcre.so.1 => /lib64/libpcre.so.1
    libplc4.so => /lib64/libplc4.so
    libplds4.so => /lib64/libplds4.so
    libpthread.so.0 => /lib64/libpthread.so.0
    libresolv.so.2 => /lib64/libresolv.so.2
    librt.so.1 => /lib64/librt.so.1
    libsasl2.so.3 => /lib64/libsasl2.so.3
    libselinux.so.1 => /lib64/libselinux.so.1
    libsmime3.so => /lib64/libsmime3.so
    libssh2.so.1 => /lib64/libssh2.so.1
    libssl.so.10 => /lib64/libssl.so.10
    libssl3.so => /lib64/libssl3.so
    libz.so.1 => /lib64/libz.so.1
[bash][lxplus730]:~ >

Could the version of some of these not be "well defined" on the grid nodes?
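If the grid experts want to compare nodes, the same filtering as the `grep`/`sed` pipeline above could be done on captured `ldd` output. A minimal sketch, assuming the usual `name => path (address)` line format; `LddEntry`, `parse_ldd` and `system_libs` are hypothetical names:

```cpp
#include <sstream>
#include <string>
#include <vector>

// One resolved dependency from ldd output (hypothetical helper type).
struct LddEntry {
    std::string soname;
    std::string path;
};

// Parse lines of the form "  libfoo.so.1 => /some/path/libfoo.so.1 (0x...)".
// Lines without a resolved absolute path (e.g. linux-vdso) are skipped.
std::vector<LddEntry> parse_ldd(const std::string& output) {
    std::vector<LddEntry> entries;
    std::istringstream in(output);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t arrow = line.find(" => ");
        const std::size_t start = line.find_first_not_of(" \t");
        if (arrow == std::string::npos || start >= arrow) continue;
        std::string path = line.substr(arrow + 4);
        const std::size_t addr = path.rfind(" (0x");
        if (addr != std::string::npos) path.erase(addr);  // drop load address
        if (path.empty() || path[0] != '/') continue;
        entries.push_back({line.substr(start, arrow - start), path});
    }
    return entries;
}

// Keep only dependencies resolved from the OS (paths starting with "/lib").
std::vector<LddEntry> system_libs(const std::vector<LddEntry>& entries) {
    std::vector<LddEntry> result;
    for (const LddEntry& e : entries)
        if (e.path.compare(0, 4, "/lib") == 0) result.push_back(e);
    return result;
}
```

Comparing the `system_libs` lists (plus the package versions behind them) between a working and a failing node would answer the question above.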

rodwalker commented 3 years ago

RD says it is line 1244 that causes a sigsegv

https://gitlab.cern.ch/HZZ/HZZSoftware/HZZAnalRun2Code/-/blob/changes-for-v25-fJVT/H4lAnalysisRun2/Root/H4lAnalRun2Init.cxx

At least nothing of the ATH_MSG_ERROR in the subsequent lines makes it into the log.

I'm not sure how close I can get to that in my test.

Cheers, Rod.

rdschaffer commented 3 years ago

Hi Rod,

Well, I added a 'print' before and after line 1244 in the current jobs (I didn't check it in). So this looks like:

        ATH_MSG_INFO( "processEvents: try to open file: " << file );

        std::unique_ptr< TFile > ifile( TFile::Open( file.c_str(), "READ" ) );

        ATH_MSG_INFO( "processEvents: called TFile Open " );

and in the log, one sees:

H4lAnalRun2 INFO processEvents: try to open file: root://eosatlas.cern.ch:1094//eos/atlas/atlasdatadisk/rucio/mc16_13TeV/25/31/DAOD_HIGG2D1.23315648._000001.pool.root.1

=== stderr ===
[2020-12-16 13:29:01.003032 +0100][Debug ][Utility ] Unable to process user config file: [ERROR] OS Error: No such file or directory
[2020-12-16 13:29:01.018152 +0100][Debug ][PlugInMgr ] Initializing plug-in manager...
[2020-12-16 13:29:01.018254 +0100][Debug ][PlugInMgr ] No default plug-in, loading plug-in configs...
[2020-12-16 13:29:01.018302 +0100][Debug ][PlugInMgr ] Processing plug-in definitions in /etc/xrootd/client.plugins.d...
[2020-12-16 13:29:01.020375 +0100][Debug ][PlugInMgr ] Processing plug-in definitions in /alrb/.xrootd/client.plugins.d...
[2020-12-16 13:29:01.020433 +0100][Debug ][PlugInMgr ] Unable to process directory /alrb/.xrootd/client.plugins.d: [ERROR] OS Error: No such file or directory
[2020-12-16 13:29:02.298776 +0100][Dump ][Utility ] URL: root://eosatlas.cern.ch//eos/atlas/atlasdatadisk/rucio/mc16_13TeV/25/31/DAOD_HIGG2D1.23315648._000001.pool.root.1

So one sees the 'try to open file', then there is the TFile::Open, and nothing else. So I conclude that this is coming from the Open.

     see you, RD

rdschaffer commented 3 years ago

Well, one thing that is clear is that this problem seems to be associated with specific sites. For my 'test' job:

test job

The sites that are successful either have local reading, or they use xrootd without problems. The latter are: SWT2_CPB, IN2P3-LPSC_LAKE, RAL.

The failures are all xrootd problems, at sites: IN2P3-CPPM, CERN-T0, TOKYO, BNL.

So I would suspect some difference in the xrootd installation between these two groups of sites. (I personally have no idea how to check this.)

simonmichal commented 3 years ago

@rdschaffer : could you add the following code to your job:

#include <link.h>
#include <stdlib.h>
#include <stdio.h>

static int
callback(struct dl_phdr_info *info, size_t size, void *data)
{
    int j;

    printf("name=%s (%d segments)\n", info->dlpi_name,
           info->dlpi_phnum);

    for (j = 0; j < info->dlpi_phnum; j++)
        printf("\t\t header %2d: address=%10p\n", j,
               (void *) (info->dlpi_addr + info->dlpi_phdr[j].p_vaddr));
    return 0;
}

and then at the beginning of your main:

dl_iterate_phdr(callback, NULL);

This will print paths of all the loaded shared libraries to stdout.
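If the full dump is too noisy, a C++ variant can collect the names and keep only those of interest. This is a sketch assuming Linux/glibc (`dl_iterate_phdr` from `<link.h>`); the function names are hypothetical:

```cpp
#include <link.h>

#include <string>
#include <vector>

// Callback for dl_iterate_phdr: append each loaded object's name to the vector
// passed through `data`. The main executable is reported with an empty name.
static int collect_name(dl_phdr_info* info, size_t /*size*/, void* data) {
    auto* names = static_cast<std::vector<std::string>*>(data);
    names->push_back(info->dlpi_name ? info->dlpi_name : "");
    return 0;  // returning non-zero would stop the iteration
}

// Names of all objects currently loaded into this process.
std::vector<std::string> loaded_objects() {
    std::vector<std::string> names;
    dl_iterate_phdr(collect_name, &names);
    return names;
}

// Only the loaded objects whose path contains `fragment`, e.g. "libXrd".
std::vector<std::string> loaded_objects_matching(const std::string& fragment) {
    std::vector<std::string> matches;
    for (const std::string& name : loaded_objects())
        if (name.find(fragment) != std::string::npos) matches.push_back(name);
    return matches;
}
```

Calling `loaded_objects_matching("libXrd")` at the point of interest lists only the XRootD libraries that are loaded at that moment; libraries pulled in later through a plugin mechanism will not show up until they are actually loaded.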

rdschaffer commented 3 years ago

Hi @simonmichal,

Jobs are running. For a "failed" job at our CERN T0 reading from eos, have a look here. Let me know if you don't have access, and I'll make a pdf file.

        see you, RD

rdschaffer commented 3 years ago

I don't see libXrxxx in the list. Would this appear later after a request to xrootd has been made?

I put the call at the very beginning, as suggested:

int main( int argc, char* argv[] ) {
    // setup callback to debugging xrootd problem - RDS 2020/12
    dl_iterate_phdr(callback, NULL);

 see you, RD

simonmichal commented 3 years ago

@rodwalker : hmm, let me dwell on this for a minute ...

rdschaffer commented 3 years ago

And here is a job output from RAL where xrootd seems to work.

More generally, here is a jobset with jobs that succeed at some sites (with xrootd or file staging) and jobs that fail with xrootd at other sites.

simonmichal commented 3 years ago

I'm a bit puzzled here: if I link a dummy main with libNetxNG.so and print out all used shared libs with dl_iterate_phdr, the output includes the xrootd libs. @rdschaffer : could you try moving the dl_iterate_phdr call to just before the open request gets issued? Maybe this will help (fingers crossed).

Axel-Naumann commented 3 years ago

@simonmichal maybe they use -Wl,--as-needed, and pull the xrd libraries in through their or ROOT's plugin mechanism?

rdschaffer commented 3 years ago

OK, thanks. Now I can try to add this just before TFile::Open, but as Axel says, it still might not work. Might there be a way to 'load' the xrootd lib, and then call this?

Axel-Naumann commented 3 years ago

TFile::Open("xroot://this-will-totally-fail"); right before the call to dl_iterate_phdr? But I cannot tell whether that still gives the results that @simonmichal is after...

rdschaffer commented 3 years ago

I expect that this will not get to the call to dl_iterate_phdr, and ROOT will just catch the segfault and return status code 139... But I can try. Perhaps there might be a way to explicitly 'load' xrootd?
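As a side note on the status code: under the common shell convention, an exit status above 128 means "terminated by signal (status - 128)", and on Linux signal 11 is SIGSEGV, so 139 = 128 + 11. A minimal sketch of that decoding follows; the function names are hypothetical, and whether ROOT's own handler follows exactly the same mapping is not something this sketch checks:

```cpp
#include <signal.h>
#include <string.h>

#include <string>

// Signal number encoded in a shell-style exit status, or 0 if none.
int signal_from_exit_status(int status) {
    return (status > 128 && status < 256) ? status - 128 : 0;
}

// Human-readable description, using strsignal() for the signal name.
std::string describe_exit_status(int status) {
    const int sig = signal_from_exit_status(status);
    if (sig == 0)
        return "exit status " + std::to_string(status) + " (no signal)";
    const char* name = strsignal(sig);
    return "exit status " + std::to_string(status) + " = 128 + signal " +
           std::to_string(sig) + " (" + (name ? name : "unknown") + ")";
}
```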

Axel-Naumann commented 3 years ago

My call to TFile::Open certainly shouldn't segfault.

rdschaffer commented 3 years ago

ok

simonmichal commented 3 years ago

It is silly that the version is not reported in the logs when the client lib is loaded (well, now I've added it: https://github.com/xrootd/xrootd/commit/07e6d5db3bd086e3ebd0576b7e73b6a6aa62b902). My fear is that xrootd5 libs are being loaded.

rdschaffer commented 3 years ago

OK, so the jobs are running here with several "finished/ok" and "failed".

Example log file: for ok at RAL - RAL

and failed at CERN T0 CERN T0

To my eye, the same libs are being used. But you should tell me.

   see you, RD

simonmichal commented 3 years ago

These are the libs from failed job:

name=/cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdUtils.so.2 (6 segments)
name=/cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdClient.so.2 (6 segments)
name=/cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdCl.so.2 (6 segments)
name=/cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdXml.so.2 (6 segments)

and these are the libs from a job that was successful:

name=/cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdUtils.so.2 (6 segments)
name=/cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdClient.so.2 (6 segments)
name=/cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdCl.so.2 (6 segments)
name=/cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib/libXrdXml.so.2 (6 segments)

yes, they are the same, so it's not a problem with loading a wrong version of xrootd client.

simonmichal commented 3 years ago

P. S. @rdschaffer : thanks for running the test!

rdschaffer commented 3 years ago

Sure. This is what I thought. Any ideas on how to further understand what is going on?

thanks, RD

simonmichal commented 3 years ago

@rdschaffer : have you tried comparing the full list of shared libs for a failed and successful job? Are there any differences at all? (unfortunately the links to the logs seem to be expired)

rdschaffer commented 3 years ago

Hi @simonmichal,

Doing a diff of the lines with "name=/" shows that the failed and succeeding jobs have the same libs.
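For anyone repeating this comparison, here is a minimal sketch of such a diff, assuming the log lines have the `name=/path (N segments)` form printed by the callback above. The function names are hypothetical:

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Collect the library paths from all "name=/..." lines of a job log.
std::set<std::string> library_names(const std::string& log) {
    std::set<std::string> names;
    std::istringstream in(log);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t pos = line.find("name=/");
        if (pos == std::string::npos) continue;
        const std::size_t start = pos + 5;  // index of the leading '/'
        const std::size_t end = line.find(" (", start);
        names.insert(line.substr(
            start, end == std::string::npos ? std::string::npos : end - start));
    }
    return names;
}

// Paths present in `a` but not in `b`.
std::vector<std::string> only_in_first(const std::set<std::string>& a,
                                       const std::set<std::string>& b) {
    std::vector<std::string> diff;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(diff));
    return diff;
}
```

Calling `only_in_first` in both directions on the failed and succeeded logs gives the libraries unique to each job; two empty results mean the loaded sets are identical by path (though not necessarily by content).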

I can still get to the following outputs:

failed succeeded

   see you, RD

rdschaffer commented 3 years ago

The 'succeeded' job seems to get to:

[2020-12-18 14:29:25.905501 +0000][Debug  ][XRootDTransport   ] [xrootd.echo.stfc.ac.uk:1094 #0.0] Sending out kXR_login request, username: tatls002, cgi: ?xrd.cc=uk&xrd.tz=0&xrd.appname=runH4lAnalRun2&xrd.info=&xrd.hostname=tatls002-2188754.0-lcg2453.gridpp.rl.ac.uk&xrd.rn=v4.10.0, dual-stack: false, private IPv4: true, private IPv6: false

but the failed job ends before this point.
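To pin down how far each job got, the client log lines can be split into their bracketed fields. A minimal sketch, assuming the `[timestamp][level][topic] message` layout seen in the logs in this thread; `LogLine` and the function names are hypothetical, not part of XRootD:

```cpp
#include <string>

// Fields of one xrootd client log line (hypothetical helper type).
struct LogLine {
    std::string timestamp;
    std::string level;
    std::string topic;
    std::string message;
};

// Strip leading and trailing spaces.
static std::string trim(const std::string& s) {
    const std::size_t first = s.find_first_not_of(' ');
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Parse "[timestamp][level][topic] message"; returns false if the line does
// not start with three bracketed fields.
bool parse_log_line(const std::string& line, LogLine& out) {
    std::string* fields[3] = {&out.timestamp, &out.level, &out.topic};
    std::size_t pos = 0;
    for (std::string* field : fields) {
        if (pos >= line.size() || line[pos] != '[') return false;
        const std::size_t close = line.find(']', pos);
        if (close == std::string::npos) return false;
        *field = trim(line.substr(pos + 1, close - pos - 1));
        pos = close + 1;
    }
    out.message = trim(line.substr(pos));
    return true;
}

// True if the line records the kXR_login request being sent.
bool is_login_request(const LogLine& line) {
    return line.message.find("kXR_login") != std::string::npos;
}
```

The last successfully parsed line of the failed log, and the first line of the succeeded log that follows the equivalent entry, bracket the step where the crash happens.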

I can upload the files of these two logs, if it would help.

Axel-Naumann commented 3 years ago

@simonmichal so I guess it's not the libraries. What else can we do to debug this?
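One generic option, since no stack trace is available from the grid jobs so far, is to install a small SIGSEGV handler that dumps a raw backtrace to stderr. This is a sketch assuming Linux/glibc (`backtrace` and `backtrace_symbols_fd` from `<execinfo.h>`); the function names are hypothetical, and since ROOT installs its own signal handlers, whether this one stays in place inside a ROOT application is an assumption that would need checking:

```cpp
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

// Signal handler: write the current call stack to stderr.
// backtrace_symbols_fd() writes directly to the file descriptor and does not
// call malloc, which makes it the safer choice inside a handler.
static void print_backtrace_on_signal(int /*signum*/) {
    void* frames[64];
    const int depth = backtrace(frames, 64);
    backtrace_symbols_fd(frames, depth, STDERR_FILENO);
}

// Install the handler for SIGSEGV. Returns true on success.
bool install_segv_backtrace() {
    // Call backtrace() once up front so that any lazy initialisation it needs
    // happens here rather than inside the signal handler.
    void* warmup[1];
    backtrace(warmup, 1);

    struct sigaction action = {};
    action.sa_handler = print_backtrace_on_signal;
    sigemptyset(&action.sa_mask);
    // SA_RESETHAND restores the default action once the handler has been
    // entered, so after the handler returns the repeated fault terminates the
    // process as usual.
    action.sa_flags = SA_RESETHAND;
    return sigaction(SIGSEGV, &action, nullptr) == 0;
}
```

Calling `install_segv_backtrace()` right before the `TFile::Open` would at least show in which library the crash happens; the printed addresses can be resolved offline with the same binaries.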

I can upload the files of these two logs, if it would help.

What's preventing this?

simonmichal commented 3 years ago

Hi guys, Happy New Year!

@rdschaffer : could we try reproducing this problem on OpenStack with a CernVM image?

rdschaffer commented 3 years ago

Hi @simonmichal,

Sure, would be happy to do so. Although, I am not quite sure how to proceed. Perhaps @krasznaa could suggest how I could do so for our sw?

rdschaffer commented 3 years ago

Since I am a bit ignorant here, would running on OpenStack mean running elsewhere than on our grid sites? I suspect that there is something connected to our grid sites which is causing this...

simonmichal commented 3 years ago

Well, it would be great if we could reproduce the problem somewhere where we could do interactive debugging, meaning not on a batch farm. I said CernVM on OpenStack hoping the environment will be somewhat similar (the job would still need to fetch remote data, right?).

Could you point us to your root analysis job and describe in a few words how you are starting it?

rdschaffer commented 3 years ago

OK, interactively, this is what I do:

cd /afs/cern.ch/work/s/schaffer/public/work-21.xAOD.RD_devRel21_prod25_2/build
setupATLAS
lsetup "asetup 21.2.139,AnalysisBase"
source ../build/x86_64-centos7-gcc8-opt/setup.sh
cd ../run   (or to any directory which you can write in)
runH4lAnalRun2 -i root://eosatlas.cern.ch:1094//eos/atlas/atlasdatadisk/rucio/mc16_13TeV/85/26/DAOD_HIGG2D1.21658940._000001.pool.root.1 -d mc16_13TeV.345706.Sherpa_222_NNPDF30NNLO_ggllll_130M4l.deriv.DAOD_HIGG2D1.e6213_s3126_r9364_p4191 -e 5000

-e 5000 is just to read 5k events. This is reading an input file, what we call a derived AOD, and writes out a root file with simple trees.

the setupATLAS is defined by:

export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
alias setupATLAS='source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh'

But you'll need 'atlas' access, I presume.

This is built with:

cmake ../source
make -jxx

Not sure if this helps...

rdschaffer commented 3 years ago

The TFile::Open is at:

https://gitlab.cern.ch/HZZ/HZZSoftware/HZZAnalRun2Code/-/blob/changes-for-v25-fJVT/H4lAnalysisRun2/Root/H4lAnalRun2Init.cxx#L1245

simonmichal commented 3 years ago

@rdschaffer : could you create a VM with an Atlas cert where I could run a test?

rdschaffer commented 3 years ago

OK, I will try to do this a bit later today.

krasznaa commented 3 years ago

Unfortunately, if we could reproduce this issue ourselves, on one of our private machines, we would have started the conversation itself very differently. 😦 So I'm afraid that setting anything up on OpenStack would just be a waste of time right now.

@simonmichal, could you give us a complete list of system libraries/packages that XRootD (or at least the version of XRootD we have under /cvmfs/atlas.cern.ch/repo/sw/software/21.2/AnalysisBaseExternals/21.2.139/InstallArea/x86_64-centos7-gcc8-opt/lib) makes use of? With any educated guesses necessary. Then we would give this list to our grid experts, to check if they can see any difference in what is installed on the nodes on which @rdschaffer's jobs are working correctly, and on ones on which they don't.

Cheers, Attila

krasznaa commented 3 years ago

As for the "VM": We have an image of AnalysisBase-21.2.139 here:

https://hub.docker.com/layers/atlas/analysisbase/21.2.139/images/sha256-cf69e10defa9cb564dcb60c9ca723f0de9e7a1813f588bdde1d1a06a944c1e3e?context=repo

But the issue does not show up in it. So it's not of much help...