Open oshadura opened 1 year ago
Hi Oksana, can you confirm that the plugin is being built from the xcache branch? Some of the strings suggest it's coming from master, at least for hub.opensciencegrid.org/coffea-casa/cc-ubuntu:2023.03.17.
Hi @oshadura and @jthiltges, I'm "the user" in Oksana's original post, and I thought it might be helpful to give a little context. The main functionality I'm looking for is to be able to list files on LPC EOS like I would with xrdfs root://cmseos.fnal.gov ls. Of course, fixing this so that xrootd works in general would be great, but if you know another good way to do this from coffea-casa, I'd happily do that instead :)
Now that we have deployed @jthiltges's plugin, the Segmentation fault (core dumped) is fixed, but some functionality, such as listing directories with xrdfs, is still missing:
# the following command works on lxplus
$ xrdfs root://cmseos.fnal.gov// ls /store/group/lpcmetx/SIDM/ffNtupleV4/2018/SIDM_XXTo2ATo2Mu2E_mXX-100_mA-1p2_ctau-9p6_TuneCP5_13TeV-madgraph-pythia8/RunIIAutumn18DRPremix-102X_upgrade2018_realistic_v15-v1/210326_161703/0000/
# but the equivalent command hangs on coffea-casa
$ xrdfs root://xcache// ls /store/group/lpcmetx/SIDM/ffNtupleV4/2018/SIDM_XXTo2ATo2Mu2E_mXX-100_mA-1p2_ctau-9p6_TuneCP5_13TeV-madgraph-pythia8/RunIIAutumn18DRPremix-102X_upgrade2018_realistic_v15-v1/210326_161703/0000/
# even though the same command works on coffea-casa if I specify one specific file
$ xrdfs root://xcache// ls /store/group/lpcmetx/SIDM/ffNtupleV4/2018/SIDM_XXTo2ATo2Mu2E_mXX-100_mA-1p2_ctau-9p6_TuneCP5_13TeV-madgraph-pythia8/RunIIAutumn18DRPremix-102X_upgrade2018_realistic_v15-v1/210326_161703/0000/ffNtuple_1.root
Interesting result. This appears to be partly an issue with our xcache (running in Docker).
The xcache tells the client to contact 172.23.0.2, which is a private IP of the xcache container. And as expected, the client cannot connect.
$ xrdfs red-xcache1.unl.edu:1094 locate '*'
[::172.23.0.2]:1094 Server ReadWrite
$ xrdfs xcache:1094 locate '*'
[::172.23.0.2]:1094 Server ReadWrite
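To spot this kind of misconfiguration quickly, one can check whether the address returned by locate falls in a private (RFC 1918) range. The sketch below is only an illustration: is_private_ipv4 and locate_ipv4 are hypothetical helper names, not part of xrootd, and the parsing assumes the locate output format shown above ([::a.b.c.d]:port ...).

```shell
# Hypothetical helper: succeed if the IPv4 address is in an RFC 1918 private range
# (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
is_private_ipv4() {
    case "$1" in
        *.*.*.*) ;;          # require a dotted quad
        *) return 1 ;;
    esac
    a=${1%%.*}               # first octet
    rest=${1#*.}
    b=${rest%%.*}            # second octet
    case "$a$b" in
        ''|*[!0-9]*) return 1 ;;   # reject non-numeric octets
    esac
    if [ "$a" -eq 10 ]; then return 0; fi
    if [ "$a" -eq 172 ] && [ "$b" -ge 16 ] && [ "$b" -le 31 ]; then return 0; fi
    if [ "$a" -eq 192 ] && [ "$b" -eq 168 ]; then return 0; fi
    return 1
}

# Hypothetical helper: extract the IPv4 address from a locate line such as
# "[::172.23.0.2]:1094 Server ReadWrite" (format assumed from the output above).
locate_ipv4() {
    printf '%s\n' "$1" | sed -n 's/^\[::\([0-9.]*\)]:[0-9]*.*/\1/p'
}

# Example with the line reported by the xcache.
addr=$(locate_ipv4 '[::172.23.0.2]:1094 Server ReadWrite')
if is_private_ipv4 "$addr"; then
    echo "$addr is a private address; clients outside that network cannot reach it"
fi
```

In practice the line passed to locate_ipv4 would come from the xrdfs locate command shown above rather than from a literal string.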
For now, I switched the red-xcache container over to host-mode networking (network_mode: host) and the ls proceeds to fail differently:
$ xrdfs red-xcache1.unl.edu ls /store/group/lpcmetx/SIDM/ffNtupleV4/2018/SIDM_XXTo2ATo2Mu2E_mXX-100_mA-1p2_ctau-9p6_TuneCP5_13TeV-madgraph-pythia8/RunIIAutumn18DRPremix-102X_upgrade2018_realistic_v15-v1/210326_161703/0000
[ERROR] Server responded with an error: [3005] Unable to open directory /store/group/lpcmetx/SIDM/ffNtupleV4/2018/SIDM_XXTo2ATo2Mu2E_mXX-100_mA-1p2_ctau-9p6_TuneCP5_13TeV-madgraph-pythia8/RunIIAutumn18DRPremix-102X_upgrade2018_realistic_v15-v1/210326_161703/0000; too many levels of symbolic links
On the xcache server side:
230411 17:53:14 543 scitokens_Access: Grant authorization based on scopes for operation=dir, path=/store/group/lpcmetx/SIDM/ffNtupleV4/2018/SIDM_XXTo2ATo2Mu2E_mXX-100_mA-1p2_ctau-9p6_TuneCP5_13TeV-madgraph-pythia8/RunIIAutumn18DRPremix-102X_upgrade2018_realistic_v15-v1/210326_161703/0000
[2023-04-11 17:53:18.077326 +0000][Warning][XRootD ] [u26@cms-xrd-global.cern.ch:1094] Redirect limit has been reached for message kXR_dirlist (path: /store/group/lpcmetx/SIDM/ffNtupleV4/2018/SIDM_XXTo2ATo2Mu2E_mXX-100_mA-1p2_ctau-9p6_TuneCP5_13TeV-madgraph-pythia8/RunIIAutumn18DRPremix-102X_upgrade2018_realistic_v15-v1/210326_161703/0000), the last known error is: [ERROR] Error response: no such file or directory
230411 17:53:18 543 ofs_opendir: cms-jovy.405:26@c2427.shor.hcc Unable to open directory /store/group/lpcmetx/SIDM/ffNtupleV4/2018/SIDM_XXTo2ATo2Mu2E_mXX-100_mA-1p2_ctau-9p6_TuneCP5_13TeV-madgraph-pythia8/RunIIAutumn18DRPremix-102X_upgrade2018_realistic_v15-v1/210326_161703/0000; too many levels of symbolic links
230411 17:53:18 543 cms-jovy.405:26@c2427.shor.hcc Xrootd_Response: sending err 3005: Unable to open directory /store/group/lpcmetx/SIDM/ffNtupleV4/2018/SIDM_XXTo2ATo2Mu2E_mXX-100_mA-1p2_ctau-9p6_TuneCP5_13TeV-madgraph-pythia8/RunIIAutumn18DRPremix-102X_upgrade2018_realistic_v15-v1/210326_161703/0000; too many levels of symbolic links
[2023-04-11 17:53:18.077682 +0000][Warning][XRootD ] Redirect trace-back:
[2023-04-11 17:53:18.077682 +0000][Warning][XRootD ] 0. Redirected from: root://cmsxrootd.fnal.gov:1094//store/group/lpcmetx/SIDM/ffNtupleV4/2018/SIDM_XXTo2ATo2Mu2E_mXX-100_mA-1p2_ctau-9p6_TuneCP5_13TeV-madgraph-pythia8/RunIIAutumn18DRPremix-102X_upgrade2018_realistic_v15-v1/210326_161703/0000 to: root://cms-xrd-global.cern.ch:1094/
[2023-04-11 17:53:18.077682 +0000][Warning][XRootD ] 1. Redirected from: root://cms-xrd-global.cern.ch:1094/ to: root://cms-xrd-transit.cern.ch:1094/
[2023-04-11 17:53:18.077682 +0000][Warning][XRootD ] 2. Retrying: root://cms-xrd-global.cern.ch:1094/
...
[2023-04-11 17:53:18.077682 +0000][Warning][XRootD ] 29. Redirected from: root://cms-xrd-global.cern.ch:1094/ to: root://cms-xrd-transit.cern.ch:1094/
[2023-04-11 17:53:18.077682 +0000][Warning][XRootD ] 30. Retrying: root://cms-xrd-global.cern.ch:1094/
230411 17:53:18 543 XrdTLS: cms-jovy.405:26@c2427.shor.hcc TLS error rc=0 ec=6 (zero_return) errno=0.
230411 17:53:18 543 XrootdXeq: cms-jovy.405:26@c2427.shor.hcc disc 0:00:04
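The trace-back above shows the request bouncing between cms-xrd-global and cms-xrd-transit until the redirect limit is reached. To summarize such a loop from a saved server log, a small sketch like the following counts redirects per source/target pair. The function name count_redirects and the file name xcache.log are hypothetical, and the line format is assumed from the excerpt above.

```shell
# Hypothetical helper: read log lines on stdin and print "count from -> to"
# for each redirect pair, most frequent first.
count_redirects() {
    # Keep only the scheme, host and port of each endpoint; drop the path.
    sed -n 's|.*Redirected from: \(root://[^/]*\)[^ ]* to: \(root://[^/]*\).*|\1 -> \2|p' |
        sort | uniq -c | sort -rn
}

# Usage (xcache.log is an assumed file name):
#   count_redirects < xcache.log
```

A pair that appears many times, such as global to transit here, points to a redirect loop rather than a single failed lookup.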
I suspect that listing directory contents will be painfully slow unless the request goes directly to the target server/cluster; otherwise, I'm guessing it will result in a search of the entire hierarchy.
Update: the second query now ends up showing a "too many levels of symbolic links" error:
cms-jovyan@jupyter-oksana-2eshadura-40cern-2ech:~$ xrdfs root://xcache// ls /store/group/lpcmetx/SIDM/ffNtupleV4/2018/SIDM_XXTo2ATo2Mu2E_mXX-100_mA-1p2_ctau-9p6_TuneCP5_13TeV-madgraph-pythia8/RunIIAutumn18DRPremix-102X_upgrade2018_realistic_v15-v1/210326_161703/0000/
[ERROR] Server responded with an error: [3005] Unable to open directory /store/group/lpcmetx/SIDM/ffNtupleV4/2018/SIDM_XXTo2ATo2Mu2E_mXX-100_mA-1p2_ctau-9p6_TuneCP5_13TeV-madgraph-pythia8/RunIIAutumn18DRPremix-102X_upgrade2018_realistic_v15-v1/210326_161703/0000/; too many levels of symbolic links
The user reported that at the CMS coffea-casa AF, using xrdcp to copy files produces an "Operation is not implemented" error. There is also a segfault when using the xrootd Python API.
Current repository with plugin: https://github.com/jthiltges/xrdcl-authz-plugin/tree/xcache
cc @jthiltges