Open tamasgal opened 3 years ago
Hi Tamas,
I can't replicate this with my own setup.
May I suggest replicating the steps exactly below (substituting credentials where necessary)?
rbarnsley@escape-rucio-analysis:~$ :~$ docker run -u root --rm -e RUCIO_CFG_ACCOUNT=robbarnsley -v /home/rbarnsley/.globus/client.crt:/opt/rucio/etc/client.crt -v /home/rbarnsley/.globus/client.key:/opt/rucio/etc/client.key --name rucio-client-test --entrypoint /bin/bash projectescape/rucio-client:latest
File rucio.cfg not found. It will generate one.
Enable shell completion on the rucio commands
[root@1fc3567899b7 user]# voms-proxy-init --cert /opt/rucio/etc/client.crt --key /opt/rucio/etc/client.key --voms escape
Contacting voms-escape.cloud.cnaf.infn.it:15000 [/DC=org/DC=terena/DC=tcs/C=IT/L=Frascati/O=Istituto Nazionale di Fisica Nucleare/OU=Istituto Nazionale di Fisica Nucleare/CN=voms-escape.cloud.cnaf.infn.it] "escape"...
Remote VOMS server contacted succesfully.
Created proxy in /tmp/x509up_u0.
Your proxy is valid until Wed Jun 30 23:25:34 UTC 2021
[root@1fc3567899b7 user]# export FILE=`uuidgen` && echo "test" > $FILE && rucio -vvv upload --rse IN2P3-CC-DCACHE --lifetime 3600 --scope SKA_SKAO_BARNSLEY-testing $FILE
2021-06-30 11:25:44,514 DEBUG uploadclient.py Num. of files that upload client is processing: 1
2021-06-30 11:25:44,748 DEBUG uploadclient.py Input validation done.
2021-06-30 11:25:44,749 INFO Preparing upload for file 62e91efc-6de8-416e-8074-5f38bbcabb1b
2021-06-30 11:25:44,785 DEBUG uploadclient.py wan domain is used for the upload
2021-06-30 11:25:44,785 DEBUG uploadclient.py Registering file
2021-06-30 11:25:44,826 DEBUG uploadclient.py Skipping dataset registration
2021-06-30 11:25:44,862 DEBUG uploadclient.py File DID does not exist
2021-06-30 11:25:44,929 INFO Successfully added replica in Rucio catalogue at IN2P3-CC-DCACHE
2021-06-30 11:25:45,001 INFO Successfully added replication rule at IN2P3-CC-DCACHE
2021-06-30 11:25:45,014 DEBUG logging.py gfal.Default: connecting to storage
2021-06-30 11:25:45,042 DEBUG logging.py gfal.Default: checking if file exists None
2021-06-30 11:25:45,105 DEBUG rsemanager.py Checking if root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b exists
2021-06-30 11:25:45,105 DEBUG logging.py gfal.Default: checking if file exists root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b
2021-06-30 11:25:45,332 DEBUG logging.py gfal.Default: closing protocol connection
2021-06-30 11:25:45,333 DEBUG uploadclient.py [{'hostname': 'ccdcalitest10.in2p3.fr', 'scheme': 'davs', 'port': 2880, 'prefix': '//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache', 'impl': 'rucio.rse.protocols.gfal.Default', 'domains': {'lan': {'read': 3, 'write': 3, 'delete': 3}, 'wan': {'read': 3, 'write': 3, 'delete': 3, 'third_party_copy': 3}}, 'extended_attributes': None}, {'hostname': 'ccdcalitest10.in2p3.fr', 'scheme': 'gsiftp', 'port': 2811, 'prefix': '//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache', 'impl': 'rucio.rse.protocols.gfal.Default', 'domains': {'lan': {'read': 2, 'write': 2, 'delete': 2}, 'wan': {'read': 2, 'write': 2, 'delete': 2, 'third_party_copy': 2}}, 'extended_attributes': None}, {'hostname': 'ccdcalitest10.in2p3.fr', 'scheme': 'root', 'port': 1094, 'prefix': '//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache', 'impl': 'rucio.rse.protocols.gfal.Default', 'domains': {'lan': {'read': 1, 'write': 1, 'delete': 1}, 'wan': {'read': 1, 'write': 1, 'delete': 1, 'third_party_copy': 1}}, 'extended_attributes': None}]
2021-06-30 11:25:45,334 INFO Trying upload with root to IN2P3-CC-DCACHE
2021-06-30 11:25:45,334 DEBUG uploadclient.py Processing upload with the domain: wan
2021-06-30 11:25:45,335 DEBUG logging.py gfal.Default: connecting to storage
2021-06-30 11:25:45,337 DEBUG logging.py gfal.Default: connecting to storage
2021-06-30 11:25:45,339 DEBUG uploadclient.py The PFN created from the LFN: root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b
2021-06-30 11:25:45,339 DEBUG logging.py gfal.Default: checking if file exists root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b
2021-06-30 11:25:45,366 DEBUG logging.py gfal.Default: checking if file exists root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload
2021-06-30 11:25:45,394 DEBUG utils.py put: Attempt 1
2021-06-30 11:25:45,394 DEBUG logging.py gfal.Default: uploading file from 62e91efc-6de8-416e-8074-5f38bbcabb1b to root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload
2021-06-30 11:25:45,835 INFO Successful upload of temporary file. root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload
2021-06-30 11:25:45,836 DEBUG uploadclient.py skip_upload_stat=False
2021-06-30 11:25:45,837 DEBUG uploadclient.py stat: pfn=root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload
2021-06-30 11:25:45,838 DEBUG logging.py gfal.Default: getting stats of file root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload
2021-06-30 11:25:45,927 DEBUG uploadclient.py Filesize: Expected=5 Found=5
2021-06-30 11:25:45,927 DEBUG uploadclient.py Checksum: Expected=062801cb Found=062801cb
2021-06-30 11:25:45,927 DEBUG uploadclient.py Renaming file root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload to root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b
2021-06-30 11:25:45,927 DEBUG logging.py gfal.Default: renaming file from root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b.rucio.upload to root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/SKA_SKAO_BARNSLEY-testing/09/6a/62e91efc-6de8-416e-8074-5f38bbcabb1b
2021-06-30 11:25:45,990 DEBUG logging.py gfal.Default: closing protocol connection
2021-06-30 11:25:45,991 DEBUG logging.py gfal.Default: closing protocol connection
2021-06-30 11:25:45,991 DEBUG uploadclient.py Upload done.
2021-06-30 11:25:45,992 INFO Successfully uploaded file 62e91efc-6de8-416e-8074-5f38bbcabb1b
Completed in 1.8044 sec.
Thanks for the quick response!
I realised that the voms-proxy-init
command did not work properly (overlooked it, probably due to copy&paste):
[user@057df3bb3326 ~]$ voms-proxy-init --cert /opt/rucio/etc/client.crt --key /opt/rucio/etc/client.key --voms escape
Credentials couldn't be loaded [/opt/rucio/etc/client.key, /opt/rucio/etc/client.crt]: The PEM input does not contain a private key, it was parsed as org.bouncycastle.cert.X509CertificateHolder
No credentials found!
So I was using the wrong key/crt-pair, however I still don't understand how this leads to that library error.
Anyways... I now pointed the Docker binds to the correct ones but the voms-proxy-init
command seems to freeze quite a lot. I needed a few tries to be able to create a proxy and now I get another error when I try to upload. I am a bit confused, I use this key/crt pair also for the CVMFS management and there the proxy and the connections work fine.
[root@cda439daf894 user]# voms-proxy-init --cert /opt/rucio/etc/client.crt --key /opt/rucio/etc/client.key --voms escape
Enter GRID pass phrase for this identity:
Contacting voms-escape.cloud.cnaf.infn.it:15000 [/DC=org/DC=terena/DC=tcs/C=IT/L=Frascati/O=Istituto Nazionale di Fisica Nucleare/OU=Istituto Nazionale di Fisica Nucleare/CN=voms-escape.cloud.cnaf.infn.it] "escape"...
Remote VOMS server contacted succesfully.
Created proxy in /tmp/x509up_u0.
Your proxy is valid until Wed Jun 30 23:50:26 GMT 2021
[root@cda439daf894 user]# echo narf > foo
[root@cda439daf894 user]# rucio -vvv upload --scope KM3NET_ECAP_TGAL --rse IN2P3-CC-DCACHE --lifetime=3600 foo
2021-06-30 11:51:34,760 ERROR ConnectionError: HTTPSConnectionPool(host='escape-rucio-auth.cern.ch', port=443): Max retries exceeded with url: /auth/x509 (Caused by SSLError('Client private key is encrypted, password is required',))
2021-06-30 11:51:34,810 ERROR ConnectionError: HTTPSConnectionPool(host='escape-rucio-auth.cern.ch', port=443): Max retries exceeded with url: /auth/x509 (Caused by SSLError('Client private key is encrypted, password is required',))
2021-06-30 11:51:34,858 ERROR ConnectionError: HTTPSConnectionPool(host='escape-rucio-auth.cern.ch', port=443): Max retries exceeded with url: /auth/x509 (Caused by SSLError('Client private key is encrypted, password is required',))
2021-06-30 11:51:34,859 ERROR Internal error: Request for authentication token returned no result!
2021-06-30 11:51:34,860 ERROR Cannot authenticate.
Details: x509 authentication failed for account=tgal with identity={'client_cert': '/opt/rucio/etc/client.crt', 'client_key': '/opt/rucio/etc/client.key'}
My guess would be that you are not using the configuration option for authenticating with the proxy, and it tries to authenticate with your cert/key which are encrypted. That is a slightly different configuration which I will have to need to look up.
Yes, Riccardo spotted the same issue, we just had a quick chat in the ESCAPE #datalake_status_and_report room.
I will try to create a password-less key/cert pair, once I figured out how to do it 😉
Ah cool. So what you could do is use -e RUCIO_CFG_AUTH_TYPE=x509_proxy -e RUCIO_CFG_CLIENT_X509_PROXY=/tmp/x509up_u0 -v /tmp/x509up_u0:/tmp/x509up_u0
as flags for the docker run (basically setting auth type to x509_proxy, linking the proxy into the container and pointing Rucio to it.
(so basically $ docker run -e RUCIO_CFG_ACCOUNT=<myrucioaccount> -e RUCIO_CFG_AUTH_TYPE=x509_proxy -e RUCIO_CFG_CLIENT_X509_PROXY=/tmp/x509up_u0 -v /tmp/x509up_u0:/tmp/x509up_u0 -it --name=rucio-client rucio-client
(I haven't actually tested this actually works right now, but it should be close enough :o )
Thanks! I just tried but now it's hanging at the voms-proxy-init
command. I tried several times but cannot get past the Remote VOMS server contacted successfully
message, so no proxy at this moment... 😕 I'll continue to retry.
[root@a7300c88575a user]# voms-proxy-init --cert /opt/rucio/etc/client.crt --key /opt/rucio/etc/client.key --voms escape
Enter GRID pass phrase for this identity:
Contacting voms-escape.cloud.cnaf.infn.it:15000 [/DC=org/DC=terena/DC=tcs/C=IT/L=Frascati/O=Istituto Nazionale di Fisica Nucleare/OU=Istituto Nazionale di Fisica Nucleare/CN=voms-escape.cloud.cnaf.infn.it] "escape"...
Remote VOMS server contacted succesfully.
(hangs)
Oh wait! You are running the voms-proxy-init inside the container too of course! My bad. So the command line for Docker should actually have the key and crt linked and the tmp file not...
So $ docker run -e RUCIO_CFG_ACCOUNT=<myrucioaccount> -e RUCIO_CFG_AUTH_TYPE=x509_proxy -e RUCIO_CFG_CLIENT_X509_PROXY=/tmp/x509up_u0 -v /path/to/client.crt:/opt/rucio/etc/client.crt -v /path/to/client.key:/opt/rucio/etc/client.key -it --name=rucio-client rucio-client
(and then voms-proxy-init inside the container, and then Rucio inside the container).
(also note that if you run as a non-root user in the container, the x509 proxy name contains your userid. So you could replace x509up_u0
by x509up_u$(id -u)
if you want to cover all cases.
Alright, I tried again but still no luck 🙈 I get the original error message with the missing key in lib now:
(EDIT: I made sure that the user IDs match!)
[root@bc22b30584c3 user]# voms-proxy-init --cert /opt/rucio/etc/client.crt --key /opt/rucio/etc/client.key --voms escape
Enter GRID pass phrase for this identity:
Contacting voms-escape.cloud.cnaf.infn.it:15000 [/DC=org/DC=terena/DC=tcs/C=IT/L=Frascati/O=Istituto Nazionale di Fisica Nucleare/OU=Istituto Nazionale di Fisica Nucleare/CN=voms-escape.cloud.cnaf.infn.it] "escape"...
Remote VOMS server contacted succesfully.
Created proxy in /tmp/x509up_u0.
Your proxy is valid until Thu Jul 01 02:56:09 GMT 2021
[root@bc22b30584c3 user]# export FILE=`uuidgen` && echo "test" > $FILE && rucio -vvv upload --rse IN2P3-CC-DCACHE --lifetime 3600 --scope KM3NET_ECAP_TGAL $FILE
2021-06-30 14:56:51,623 DEBUG uploadclient.py Num. of files that upload client is processing: 1
2021-06-30 14:56:52,084 DEBUG uploadclient.py Input validation done.
2021-06-30 14:56:52,084 INFO Preparing upload for file 843ca80a-7d2b-484a-8cad-660ec1212fca
2021-06-30 14:56:52,149 DEBUG uploadclient.py wan domain is used for the upload
2021-06-30 14:56:52,150 DEBUG uploadclient.py Registering file
2021-06-30 14:56:52,217 DEBUG uploadclient.py Skipping dataset registration
2021-06-30 14:56:52,293 DEBUG uploadclient.py File DID does not exist
2021-06-30 14:56:52,378 INFO Successfully added replica in Rucio catalogue at IN2P3-CC-DCACHE
2021-06-30 14:56:52,491 INFO Successfully added replication rule at IN2P3-CC-DCACHE
2021-06-30 14:56:52,556 DEBUG logging.py gfal.Default: connecting to storage
2021-06-30 14:56:52,734 DEBUG rucio Traceback (most recent call last):
File "/usr/local/bin/rucio", line 131, in new_funct
return function(*args, **kwargs)
File "/usr/local/bin/rucio", line 956, in upload
upload_client.upload(items, summary_file_path)
File "/usr/local/lib/python3.6/site-packages/rucio/client/uploadclient.py", line 238, in upload
if rsemgr.exists(rse_settings, pfn if pfn else file_did, domain=domain, auth_token=self.auth_token, logger=logger):
File "/usr/local/lib/python3.6/site-packages/rucio/rse/rsemanager.py", line 279, in exists
protocol.connect()
File "/usr/local/lib/python3.6/site-packages/rucio/rse/protocols/gfal.py", line 200, in connect
self.__ctx = gfal2.creat_context() # pylint: disable=no-member
gfal2.GError: Unable to load plugin /usr/lib64/gfal2-plugins//libgfal_plugin_gridftp.so : Key file does not have group \u201cGRIDFTP PLUGIN\u201d
2021-06-30 14:56:52,746 ERROR Unable to load plugin /usr/lib64/gfal2-plugins//libgfal_plugin_gridftp.so : Key file does not have group \u201cGRIDFTP PLUGIN\u201d
2021-06-30 14:56:52,749 ERROR
Rucio exited with an unexpected/unknown error.
Please rerun the last command with the "-v" option to gather more information.
If it's a problem concerning your experiment or if you're unsure what to do, please followup at: https://github.com/rucio/rucio/issues/
If you're sure there is a problem with Rucio itself, please followup at: https://github.com/rucio/rucio/issues/
Completed in 1.6769 sec.
Oh joyous joy :). I need to check something in Rucio so let me actually try to do it this way
So I followed what you did and for me it seems to work. Few things I can imagine being an issue here:
Yes, it is probably an incompatible library. I have an M1 (ARM) processor. I made sure that the user IDs match, so that is not the problem.
I'll see if I can recompile the whole Docker image with the M1.
Sorry for the spam, I realised that it's a CentOS 7 base image, which indeed provides a good set of aarch64 packages but fails when looking up ius-release
.
Not sure how to solve this, I guess I'll talk to the maintainers of the base image and then see how we can build an aarch64 image. This will definitely be needed since the new Apple M1 machines are getting more and more popular.
Ok, I finally had time to try it on a different (Intel) machine and it works. Although I get some errors during upload, it eventually said "successfully uploaded...". I was not able to find the files afterwards, but that's most probably just my lack of understanding how to access it, I'll dig deeper 😉
Here is the part where the error showed, just for the sake of completeness:
2021-07-02 07:18:47,136 DEBUG logging.py gfal.Default: getting stats of file root://ccdcalitest10.in2p3.fr:1094//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/KM3NET_ECAP_TGAL/a5/85/f62a38c9-f624-43f2-a997-db3c45d4eace.rucio.upload
2021-07-02 07:18:47,159 DEBUG uploadclient.py stat: unexpected error=The requested service is not available at the moment.
Details: Error while processing gfal stat call. Error: Failed to stat file (No such file or directory)
2021-07-02 07:18:47,159 WARNING Upload attempt failed
2021-07-02 07:18:47,159 INFO Exception: The requested service is not available at the moment.
Details: Error while processing gfal stat call. Error: Failed to stat file (No such file or directory)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/rucio/rse/protocols/gfal.py", line 367, in stat
stat_str = str(ctx.stat(path))
gfal2.GError: Failed to stat file (No such file or directory)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/rucio/client/uploadclient.py", line 281, in upload
sign_service=sign_service)
File "/usr/local/lib/python3.6/site-packages/rucio/client/uploadclient.py", line 655, in _upload_item
raise error
File "/usr/local/lib/python3.6/site-packages/rucio/client/uploadclient.py", line 639, in _upload_item
stats = self._retry_protocol_stat(protocol_write, pfn_tmp)
File "/usr/local/lib/python3.6/site-packages/rucio/client/uploadclient.py", line 693, in _retry_protocol_stat
raise error
File "/usr/local/lib/python3.6/site-packages/rucio/client/uploadclient.py", line 680, in _retry_protocol_stat
stats = protocol.stat(pfn)
File "/usr/local/lib/python3.6/site-packages/rucio/rse/protocols/gfal.py", line 370, in stat
raise exception.ServiceUnavailable(msg % str(error))
rucio.common.exception.ServiceUnavailable: The requested service is not available at the moment.
Details: Error while processing gfal stat call. Error: Failed to stat file (No such file or directory)
2021-07-02 07:18:47,160 INFO Trying upload with gsiftp to IN2P3-CC-DCACHE
2021-07-02 07:18:47,160 DEBUG uploadclient.py Processing upload with the domain: wan
2021-07-02 07:18:47,160 DEBUG logging.py gfal.Default: connecting to storage
2021-07-02 07:18:47,161 DEBUG logging.py gfal.Default: connecting to storage
2021-07-02 07:18:47,162 DEBUG uploadclient.py The PFN created from the LFN: gsiftp://ccdcalitest10.in2p3.fr:2811//pnfs/in2p3.fr/data/escape/cc_in2p3_dcache/KM3NET_ECAP_TGAL/a5/85/f62a38c9-f624-43f2-a997-db3c45d4eace
Looks to me like the error is "2021-07-02 07:18:47,159 DEBUG uploadclient.py stat: unexpected error=The requested service is not available at the moment.". Could you try another RSE? (I think EULAKE-1, sara and desy should be more or less always online :).
On the so issue: I think we should raise this upstream. I asked whether it is already known in the Rucio slack. Let's see whether they have a fix :) (e.g. a specific docker container for ARM).
Conclusion: this was already reported upstream to the GFAL who need to provide a new .so file :)
Alright, thanks ;) I also created an issue yesterday: https://github.com/rucio/containers/issues/143
We now have ARM packages for GFAL on EPEL, so technically it should now be possible (https://github.com/rucio/containers/issues/143#issuecomment-927712221)
I am using the
rucio-client
Docker image and I get the error below when uploading a file. It's complaining about a missingGRIDFTP PLUGIN
entry inlibgfal_plugin_gridftp.so
.