ESCAPE-WP2 / Rucio-Client-Containers

1 stars 4 forks source link

Unable to load plugin libgfal_plugin_gridftp.so #34

Open tamasgal opened 3 years ago

tamasgal commented 3 years ago

I am using the rucio-client Docker image and I get the error below when uploading a file. It's complaining about a missing GRIDFTP PLUGIN entry in libgfal_plugin_gridftp.so.

░ tamasgal@silentbox:~ 🅒 base
â–‘ 11:24:23 > docker images|grep rucio-client
projectescape/rucio-client   latest    9c3477bbad93   2 weeks ago   1.3GB
[user@ca86714987ed tmp]$ rucio --version
rucio 1.25.6
[user@ca86714987ed ~]$ rucio -vvv upload --scope KM3NET_ECAP_TGAL --rse IN2P3-CC-DCACHE --lifetime=3600 foo
2021-06-30 09:16:33,561 DEBUG   uploadclient.py Num. of files that upload client is processing: 1
2021-06-30 09:16:34,010 DEBUG   uploadclient.py Input validation done.
2021-06-30 09:16:34,011 INFO    Preparing upload for file foo
2021-06-30 09:16:34,067 DEBUG   uploadclient.py wan domain is used for the upload
2021-06-30 09:16:34,067 DEBUG   uploadclient.py Registering file
2021-06-30 09:16:34,126 DEBUG   uploadclient.py Skipping dataset registration
2021-06-30 09:16:34,180 DEBUG   uploadclient.py File DID does not exist
2021-06-30 09:16:34,267 INFO    Successfully added replica in Rucio catalogue at IN2P3-CC-DCACHE
2021-06-30 09:16:34,409 INFO    Successfully added replication rule at IN2P3-CC-DCACHE
2021-06-30 09:16:34,490 DEBUG   logging.py  gfal.Default: connecting to storage
2021-06-30 09:16:34,576 DEBUG   rucio   Traceback (most recent call last):
  File "/usr/local/bin/rucio", line 131, in new_funct
    return function(*args, **kwargs)
  File "/usr/local/bin/rucio", line 956, in upload
    upload_client.upload(items, summary_file_path)
  File "/usr/local/lib/python3.6/site-packages/rucio/client/uploadclient.py", line 238, in upload
    if rsemgr.exists(rse_settings, pfn if pfn else file_did, domain=domain, auth_token=self.auth_token, logger=logger):
  File "/usr/local/lib/python3.6/site-packages/rucio/rse/rsemanager.py", line 279, in exists
    protocol.connect()
  File "/usr/local/lib/python3.6/site-packages/rucio/rse/protocols/gfal.py", line 200, in connect
    self.__ctx = gfal2.creat_context()  # pylint: disable=no-member
gfal2.GError: Unable to load plugin /usr/lib64/gfal2-plugins//libgfal_plugin_gridftp.so : Key file does not have group \u201cGRIDFTP PLUGIN\u201d

2021-06-30 09:16:34,582 ERROR   Unable to load plugin /usr/lib64/gfal2-plugins//libgfal_plugin_gridftp.so : Key file does not have group \u201cGRIDFTP PLUGIN\u201d
2021-06-30 09:16:34,583 ERROR
Rucio exited with an unexpected/unknown error.
Please rerun the last command with the "-v" option to gather more information.
If it's a problem concerning your experiment or if you're unsure what to do, please followup at: https://github.com/rucio/rucio/issues/
If you're sure there is a problem with Rucio itself, please followup at: https://github.com/rucio/rucio/issues/
Completed in 1.0455 sec.
robbarnsley commented 3 years ago

Hi Tamas,

I can't replicate this with my own setup.

May I suggest replicating the steps exactly below (substituting credentials where necessary)?

rbarnsley@escape-rucio-analysis:~$ :~$ docker run -u root --rm -e RUCIO_CFG_ACCOUNT=robbarnsley -v /home/rbarnsley/.globus/client.crt:/opt/rucio/etc/client.crt -v /home/rbarnsley/.globus/client.key:/opt/rucio/etc/client.key  --name rucio-client-test --entrypoint /bin/bash projectescape/rucio-client:latest
File rucio.cfg not found. It will generate one.
Enable shell completion on the rucio commands

[root@1fc3567899b7 user]# voms-proxy-init --cert /opt/rucio/etc/client.crt --key /opt/rucio/etc/client.key --voms escape
Contacting voms-escape.cloud.cnaf.infn.it:15000 [/DC=org/DC=terena/DC=tcs/C=IT/L=Frascati/O=Istituto Nazionale di Fisica Nucleare/OU=Istituto Nazionale di Fisica Nucleare/CN=voms-escape.cloud.cnaf.infn.it] "escape"...
Remote VOMS server contacted succesfully.

Created proxy in /tmp/x509up_u0.

Your proxy is valid until Wed Jun 30 23:25:34 UTC 2021

[root@1fc3567899b7 user]# export FILE=`uuidgen` && echo "test" > $FILE && rucio -vvv upload --rse IN2P3-CC-DCACHE --lifetime 3600 --scope SKA_SKAO_BARNSLEY-testing $FILE
2021-06-30 11:25:44,514 DEBUG   uploadclient.py Num. of files that upload client is processing: 1
2021-06-30 11:25:44,748 DEBUG   uploadclient.py Input validation done.
2021-06-30 11:25:44,749 INFO    Preparing upload for file 62e91efc-6de8-416e-8074-5f38bbcabb1b
2021-06-30 11:25:44,785 DEBUG   uploadclient.py wan domain is used for the upload
2021-06-30 11:25:44,785 DEBUG   uploadclient.py Registering file
2021-06-30 11:25:44,826 DEBUG   uploadclient.py Skipping dataset registration
2021-06-30 11:25:44,862 DEBUG   uploadclient.py File DID does not exist
2021-06-30 11:25:44,929 INFO    Successfully added replica in Rucio catalogue at IN2P3-CC-DCACHE
2021-06-30 11:25:45,001 INFO    Successfully added replication rule at IN2P3-CC-DCACHE
2021-06-30 11:25:45,014 DEBUG   logging.py  gfal.Default: connecting to storage
2021-06-30 11:25:45,042 DEBUG   logging.py  gfal.Default: checking if file exists None
2021-06-30 11:25:45,105 DEBUG   rsemanager.py   Checking if root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b exists
2021-06-30 11:25:45,105 DEBUG   logging.py  gfal.Default: checking if file exists root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b
2021-06-30 11:25:45,332 DEBUG   logging.py  gfal.Default: closing protocol connection
2021-06-30 11:25:45,333 DEBUG   uploadclient.py [{'hostname': 'ccdcalitest10.in2p3.fr', 'scheme': 'davs', 'port': 2880, 'prefix': '//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache', 'impl': 'rucio.rse.protocols.gfal.Default', 'domains': {'lan': {'read': 3, 'write': 3, 'delete': 3}, 'wan': {'read': 3, 'write': 3, 'delete': 3, 'third_party_copy': 3}}, 'extended_attributes': None}, {'hostname': 'ccdcalitest10.in2p3.fr', 'scheme': 'gsiftp', 'port': 2811, 'prefix': '//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache', 'impl': 'rucio.rse.protocols.gfal.Default', 'domains': {'lan': {'read': 2, 'write': 2, 'delete': 2}, 'wan': {'read': 2, 'write': 2, 'delete': 2, 'third_party_copy': 2}}, 'extended_attributes': None}, {'hostname': 'ccdcalitest10.in2p3.fr', 'scheme': 'root', 'port': 1094, 'prefix': '//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache', 'impl': 'rucio.rse.protocols.gfal.Default', 'domains': {'lan': {'read': 1, 'write': 1, 'delete': 1}, 'wan': {'read': 1, 'write': 1, 'delete': 1, 'third_party_copy': 1}}, 'extended_attributes': None}]
2021-06-30 11:25:45,334 INFO    Trying upload with root to IN2P3-CC-DCACHE
2021-06-30 11:25:45,334 DEBUG   uploadclient.py Processing upload with the domain: wan
2021-06-30 11:25:45,335 DEBUG   logging.py  gfal.Default: connecting to storage
2021-06-30 11:25:45,337 DEBUG   logging.py  gfal.Default: connecting to storage
2021-06-30 11:25:45,339 DEBUG   uploadclient.py The PFN created from the LFN: root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b
2021-06-30 11:25:45,339 DEBUG   logging.py  gfal.Default: checking if file exists root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b
2021-06-30 11:25:45,366 DEBUG   logging.py  gfal.Default: checking if file exists root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload
2021-06-30 11:25:45,394 DEBUG   utils.py    put: Attempt 1
2021-06-30 11:25:45,394 DEBUG   logging.py  gfal.Default: uploading file from 62e91efc-6de8-416e-8074-5f38bbcabb1b to root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload
2021-06-30 11:25:45,835 INFO    Successful upload of temporary file. root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload
2021-06-30 11:25:45,836 DEBUG   uploadclient.py skip_upload_stat=False
2021-06-30 11:25:45,837 DEBUG   uploadclient.py stat: pfn=root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload
2021-06-30 11:25:45,838 DEBUG   logging.py  gfal.Default: getting stats of file root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload
2021-06-30 11:25:45,927 DEBUG   uploadclient.py Filesize: Expected=5 Found=5
2021-06-30 11:25:45,927 DEBUG   uploadclient.py Checksum: Expected=062801cb Found=062801cb
2021-06-30 11:25:45,927 DEBUG   uploadclient.py Renaming file root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload to root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b
2021-06-30 11:25:45,927 DEBUG   logging.py  gfal.Default: renaming file from root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload to root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b
2021-06-30 11:25:45,990 DEBUG   logging.py  gfal.Default: closing protocol connection
2021-06-30 11:25:45,991 DEBUG   logging.py  gfal.Default: closing protocol connection
2021-06-30 11:25:45,991 DEBUG   uploadclient.py Upload done.
2021-06-30 11:25:45,992 INFO    Successfully uploaded file 62e91efc-6de8-416e-8074-5f38bbcabb1b
Completed in 1.8044 sec.
tamasgal commented 3 years ago

Thanks for the quick response!

I realised that the voms-proxy-init command did not work properly (overlooked it, probably due to copy&paste):

[user@057df3bb3326 ~]$ voms-proxy-init --cert /opt/rucio/etc/client.crt --key /opt/rucio/etc/client.key --voms escape
Credentials couldn't be loaded [/opt/rucio/etc/client.key, /opt/rucio/etc/client.crt]: The PEM input does not contain a private key, it was parsed as org.bouncycastle.cert.X509CertificateHolder
No credentials found!

So I was using the wrong key/crt-pair, however I still don't understand how this leads to that library error.

Anyways... I now pointed the Docker binds to the correct ones but the voms-proxy-init command seems to freeze quite a lot. I needed a few tries to be able to create a proxy and now I get another error when I try to upload. I am a bit confused, I use this key/crt pair also for the CVMFS management and there the proxy and the connections work fine.

[root@cda439daf894 user]# voms-proxy-init --cert /opt/rucio/etc/client.crt --key /opt/rucio/etc/client.key --voms escape
Enter GRID pass phrase for this identity:
Contacting voms-escape.cloud.cnaf.infn.it:15000 [/DC=org/DC=terena/DC=tcs/C=IT/L=Frascati/O=Istituto Nazionale di Fisica Nucleare/OU=Istituto Nazionale di Fisica Nucleare/CN=voms-escape.cloud.cnaf.infn.it] "escape"...
Remote VOMS server contacted succesfully.

Created proxy in /tmp/x509up_u0.

Your proxy is valid until Wed Jun 30 23:50:26 GMT 2021
[root@cda439daf894 user]# echo narf > foo
[root@cda439daf894 user]# rucio -vvv upload --scope KM3NET_ECAP_TGAL --rse IN2P3-CC-DCACHE --lifetime=3600 foo
2021-06-30 11:51:34,760 ERROR   ConnectionError: HTTPSConnectionPool(host='escape-rucio-auth.cern.ch', port=443): Max retries exceeded with url: /auth/x509 (Caused by SSLError('Client private key is encrypted, password is required',))
2021-06-30 11:51:34,810 ERROR   ConnectionError: HTTPSConnectionPool(host='escape-rucio-auth.cern.ch', port=443): Max retries exceeded with url: /auth/x509 (Caused by SSLError('Client private key is encrypted, password is required',))
2021-06-30 11:51:34,858 ERROR   ConnectionError: HTTPSConnectionPool(host='escape-rucio-auth.cern.ch', port=443): Max retries exceeded with url: /auth/x509 (Caused by SSLError('Client private key is encrypted, password is required',))
2021-06-30 11:51:34,859 ERROR   Internal error: Request for authentication token returned no result!
2021-06-30 11:51:34,860 ERROR   Cannot authenticate.
Details: x509 authentication failed for account=tgal with identity={'client_cert': '/opt/rucio/etc/client.crt', 'client_key': '/opt/rucio/etc/client.key'}
ygrange commented 3 years ago

My guess would be that you are not using the configuration option for authenticating with the proxy, and it tries to authenticate with your cert/key which are encrypted. That is a slightly different configuration which I will have to need to look up.

tamasgal commented 3 years ago

Yes, Riccardo spotted the same issue, we just had a quick chat in the ESCAPE #datalake_status_and_report room.

I will try to create a password-less key/cert pair, once I figured out how to do it 😉

ygrange commented 3 years ago

Ah cool. So what you could do is use -e RUCIO_CFG_AUTH_TYPE=x509_proxy -e RUCIO_CFG_CLIENT_X509_PROXY=/tmp/x509up_u0 -v /tmp/x509up_u0:/tmp/x509up_u0 as flags for the docker run (basically setting auth type to x509_proxy, linking the proxy into the container and pointing Rucio to it.

ygrange commented 3 years ago

(so basically $ docker run -e RUCIO_CFG_ACCOUNT=<myrucioaccount> -e RUCIO_CFG_AUTH_TYPE=x509_proxy -e RUCIO_CFG_CLIENT_X509_PROXY=/tmp/x509up_u0 -v /tmp/x509up_u0:/tmp/x509up_u0 -it --name=rucio-client rucio-client (I haven't actually tested this actually works right now, but it should be close enough :o )

tamasgal commented 3 years ago

Thanks! I just tried but now it's hanging at the voms-proxy-init command. I tried several times but cannot get past the Remote VOMS server contacted successfully message, so no proxy at this moment... 😕 I'll continue to retry.

[root@a7300c88575a user]# voms-proxy-init --cert /opt/rucio/etc/client.crt --key /opt/rucio/etc/client.key --voms escape
Enter GRID pass phrase for this identity:
Contacting voms-escape.cloud.cnaf.infn.it:15000 [/DC=org/DC=terena/DC=tcs/C=IT/L=Frascati/O=Istituto Nazionale di Fisica Nucleare/OU=Istituto Nazionale di Fisica Nucleare/CN=voms-escape.cloud.cnaf.infn.it] "escape"...
Remote VOMS server contacted succesfully.

(hangs)
ygrange commented 3 years ago

Oh wait! You are running the voms-proxy-init inside the container too of course! My bad. So the command line for Docker should actually have the key and crt linked and the tmp file not... So $ docker run -e RUCIO_CFG_ACCOUNT=<myrucioaccount> -e RUCIO_CFG_AUTH_TYPE=x509_proxy -e RUCIO_CFG_CLIENT_X509_PROXY=/tmp/x509up_u0 -v /path/to/client.crt:/opt/rucio/etc/client.crt -v /path/to/client.key:/opt/rucio/etc/client.key -it --name=rucio-client rucio-client (and then voms-proxy-init inside the container, and then Rucio inside the container). (also note that if you run as a non-root user in the container, the x509 proxy name contains your userid. So you could replace x509up_u0 by x509up_u$(id -u) if you want to cover all cases.

tamasgal commented 3 years ago

Alright, I tried again but still no luck 🙈 I get the original error message with the missing key in lib now:

(EDIT: I made sure that the user IDs match!)

[root@bc22b30584c3 user]# voms-proxy-init --cert /opt/rucio/etc/client.crt --key /opt/rucio/etc/client.key --voms escape
Enter GRID pass phrase for this identity:
Contacting voms-escape.cloud.cnaf.infn.it:15000 [/DC=org/DC=terena/DC=tcs/C=IT/L=Frascati/O=Istituto Nazionale di Fisica Nucleare/OU=Istituto Nazionale di Fisica Nucleare/CN=voms-escape.cloud.cnaf.infn.it] "escape"...
Remote VOMS server contacted succesfully.

Created proxy in /tmp/x509up_u0.

Your proxy is valid until Thu Jul 01 02:56:09 GMT 2021
[root@bc22b30584c3 user]# export FILE=`uuidgen` && echo "test" > $FILE && rucio -vvv upload --rse IN2P3-CC-DCACHE --lifetime 3600 --scope KM3NET_ECAP_TGAL $FILE
2021-06-30 14:56:51,623 DEBUG   uploadclient.py Num. of files that upload client is processing: 1
2021-06-30 14:56:52,084 DEBUG   uploadclient.py Input validation done.
2021-06-30 14:56:52,084 INFO    Preparing upload for file 843ca80a-7d2b-484a-8cad-660ec1212fca
2021-06-30 14:56:52,149 DEBUG   uploadclient.py wan domain is used for the upload
2021-06-30 14:56:52,150 DEBUG   uploadclient.py Registering file
2021-06-30 14:56:52,217 DEBUG   uploadclient.py Skipping dataset registration
2021-06-30 14:56:52,293 DEBUG   uploadclient.py File DID does not exist
2021-06-30 14:56:52,378 INFO    Successfully added replica in Rucio catalogue at IN2P3-CC-DCACHE
2021-06-30 14:56:52,491 INFO    Successfully added replication rule at IN2P3-CC-DCACHE
2021-06-30 14:56:52,556 DEBUG   logging.py  gfal.Default: connecting to storage
2021-06-30 14:56:52,734 DEBUG   rucio   Traceback (most recent call last):
  File "/usr/local/bin/rucio", line 131, in new_funct
    return function(*args, **kwargs)
  File "/usr/local/bin/rucio", line 956, in upload
    upload_client.upload(items, summary_file_path)
  File "/usr/local/lib/python3.6/site-packages/rucio/client/uploadclient.py", line 238, in upload
    if rsemgr.exists(rse_settings, pfn if pfn else file_did, domain=domain, auth_token=self.auth_token, logger=logger):
  File "/usr/local/lib/python3.6/site-packages/rucio/rse/rsemanager.py", line 279, in exists
    protocol.connect()
  File "/usr/local/lib/python3.6/site-packages/rucio/rse/protocols/gfal.py", line 200, in connect
    self.__ctx = gfal2.creat_context()  # pylint: disable=no-member
gfal2.GError: Unable to load plugin /usr/lib64/gfal2-plugins//libgfal_plugin_gridftp.so : Key file does not have group \u201cGRIDFTP PLUGIN\u201d

2021-06-30 14:56:52,746 ERROR   Unable to load plugin /usr/lib64/gfal2-plugins//libgfal_plugin_gridftp.so : Key file does not have group \u201cGRIDFTP PLUGIN\u201d
2021-06-30 14:56:52,749 ERROR
Rucio exited with an unexpected/unknown error.
Please rerun the last command with the "-v" option to gather more information.
If it's a problem concerning your experiment or if you're unsure what to do, please followup at: https://github.com/rucio/rucio/issues/
If you're sure there is a problem with Rucio itself, please followup at: https://github.com/rucio/rucio/issues/
Completed in 1.6769 sec.
ygrange commented 3 years ago

Oh joyous joy :). I need to check something in Rucio so let me actually try to do it this way

ygrange commented 3 years ago

So I followed what you did and for me it seems to work. Few things I can imagine being an issue here:

  1. could it be that the .so is built for a different CPU architecture (this can be a pain in containers)? I think my CPU is intel (standard macbook CPU)
  2. When I run the container, it is running as user user (with id 1000). I would not be surprised if some things horribly fail when executed as root. I am not sure if you tried doing this all as user user inside the container...
tamasgal commented 3 years ago

Yes, it is probably an incompatible library. I have an M1 (ARM) processor. I made sure that the user IDs match, so that is not the problem.

I'll see if I can recompile the whole Docker image with the M1.

tamasgal commented 3 years ago

Sorry for the spam, I realised that it's a CentOS 7 base image, which indeed provides a good set of aarch64 packages but fails when looking up ius-release.

Not sure how to solve this, I guess I'll talk to the maintainers of the base image and then see how we can build an aarch64 image. This will definitely be needed since the new Apple M1 machines are getting more and more popular.

tamasgal commented 3 years ago

Ok, I finally had time to try it on a different (Intel) machine and it works. Although I get some errors during upload, it eventually said "successfully uploaded...". I was not able to find the files afterwards, but that's most probably just my lack of understanding how to access it, I'll dig deeper 😉

Here is the part where the error showed, just for the sake of completeness:

2021-07-02 07:18:47,136 DEBUG   logging.py  gfal.Default: getting stats of file root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/KM3NET_ECAP_TGAL/a5/85/f62a38c9-f624-43f2-a997-db3c45d4eace.rucio.upload
2021-07-02 07:18:47,159 DEBUG   uploadclient.py stat: unexpected error=The requested service is not available at the moment.
Details: Error while processing gfal stat call. Error: Failed to stat file (No such file or directory)
2021-07-02 07:18:47,159 WARNING Upload attempt failed
2021-07-02 07:18:47,159 INFO    Exception: The requested service is not available at the moment.
Details: Error while processing gfal stat call. Error: Failed to stat file (No such file or directory)
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/rucio/rse/protocols/gfal.py", line 367, in stat
    stat_str = str(ctx.stat(path))
gfal2.GError: Failed to stat file (No such file or directory)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/rucio/client/uploadclient.py", line 281, in upload
    sign_service=sign_service)
  File "/usr/local/lib/python3.6/site-packages/rucio/client/uploadclient.py", line 655, in _upload_item
    raise error
  File "/usr/local/lib/python3.6/site-packages/rucio/client/uploadclient.py", line 639, in _upload_item
    stats = self._retry_protocol_stat(protocol_write, pfn_tmp)
  File "/usr/local/lib/python3.6/site-packages/rucio/client/uploadclient.py", line 693, in _retry_protocol_stat
    raise error
  File "/usr/local/lib/python3.6/site-packages/rucio/client/uploadclient.py", line 680, in _retry_protocol_stat
    stats = protocol.stat(pfn)
  File "/usr/local/lib/python3.6/site-packages/rucio/rse/protocols/gfal.py", line 370, in stat
    raise exception.ServiceUnavailable(msg % str(error))
rucio.common.exception.ServiceUnavailable: The requested service is not available at the moment.
Details: Error while processing gfal stat call. Error: Failed to stat file (No such file or directory)
2021-07-02 07:18:47,160 INFO    Trying upload with gsiftp to IN2P3-CC-DCACHE
2021-07-02 07:18:47,160 DEBUG   uploadclient.py Processing upload with the domain: wan
2021-07-02 07:18:47,160 DEBUG   logging.py  gfal.Default: connecting to storage
2021-07-02 07:18:47,161 DEBUG   logging.py  gfal.Default: connecting to storage
2021-07-02 07:18:47,162 DEBUG   uploadclient.py The PFN created from the LFN: gsiftp://ccdcalitest10.in2p3.fr:2811//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/KM3NET_ECAP_TGAL/a5/85/f62a38c9-f624-43f2-a997-db3c45d4eace
ygrange commented 3 years ago

Looks to me like the error is "2021-07-02 07:18:47,159 DEBUG uploadclient.py stat: unexpected error=The requested service is not available at the moment.". Could you try another RSE? (I think EULAKE-1, sara and desy should be more or less always online :).

On the so issue: I think we should raise this upstream. I asked whether it is already known in the Rucio slack. Let's see whether they have a fix :) (e.g. a specific docker container for ARM).

ygrange commented 3 years ago

Conclusion: this was already reported upstream to the GFAL who need to provide a new .so file :)

tamasgal commented 3 years ago

Alright, thanks ;) I also created an issue yesterday: https://github.com/rucio/containers/issues/143

tamasgal commented 3 years ago

We now have ARM packages for GFAL on EPEL, so technically it should now be possible (https://github.com/rucio/containers/issues/143#issuecomment-927712221)