DIRACGrid / DIRAC

DIRAC Grid
http://diracgrid.org
GNU General Public License v3.0
113 stars 174 forks source link

[v8.0.48] Which proxy are we talking about here ? #7665

Closed marianne013 closed 3 months ago

marianne013 commented 3 months ago

Hi,

I'm trying to work out why the dirac-dms-replicate-and-register command seems to avoid using our FTS server. I am sure it used to work. Hrmpf. In the process of doing this I fished the following error out of DataManagement/FTS3Agent. Any idea how I can convince it to tell me which proxy it is talking about ? This is on the pre-prod server and the request (there should only be one) was submitted with the same proxy it managed to submit a bunch of jobs with, so as far as I cam concerned it should be happy.

2024-06-10 13:38:45 UTC DataManagement/FTS3Agent ERROR: Error generating context ClientError('Client error: Bad request: Could not process the proxy: Failed to verify the proxy, maybe signed with the wrong private key?')
Traceback (most recent call last):
  File "/opt/dirac/diracos/lib/python3.11/site-packages/fts3/rest/client/delegator.py", line 291, in delegate
    self._put_proxy(delegation_id, x509_proxy_pem)
  File "/opt/dirac/diracos/lib/python3.11/site-packages/fts3/rest/client/delegator.py", line 242, in _put_proxy
    self.context.put("/delegation/" + delegation_id + "/credential", x509_proxy)
  File "/opt/dirac/diracos/lib/python3.11/site-packages/fts3/rest/client/context.py", line 220, in put
    return self._requester.method(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/dirac/diracos/lib/python3.11/site-packages/fts3/rest/client/request.py", line 128, in method
    self._handle_error(url, response.status_code, response.text)
  File "/opt/dirac/diracos/lib/python3.11/site-packages/fts3/rest/client/request.py", line 75, in _handle_error
    raise ClientError("Bad request: " + message)
fts3.rest.client.exceptions.ClientError: Client error: Bad request: Could not process the proxy: Failed to verify the proxy, maybe signed with the wrong private key?

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/dirac/DIRAC/src/DIRAC/DataManagementSystem/Client/FTS3Job.py", line 787, in generateContext
    fts3.delegate(context, lifetime=td_lifetime, delegate_when_lifetime_lt=td_lifetime // 3)
  File "/opt/dirac/diracos/lib/python3.11/site-packages/fts3/rest/client/easy/delegate.py", line 40, in delegate
    return delegator.delegate(lifetime, force, delegate_when_lifetime_lt)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/dirac/diracos/lib/python3.11/site-packages/fts3/rest/client/delegator.py", line 296, in delegate
    raise ClientError(str(e)).with_traceback(sys.exc_info()[2])
  File "/opt/dirac/diracos/lib/python3.11/site-packages/fts3/rest/client/delegator.py", line 291, in delegate
    self._put_proxy(delegation_id, x509_proxy_pem)
  File "/opt/dirac/diracos/lib/python3.11/site-packages/fts3/rest/client/delegator.py", line 242, in _put_proxy
    self.context.put("/delegation/" + delegation_id + "/credential", x509_proxy)
  File "/opt/dirac/diracos/lib/python3.11/site-packages/fts3/rest/client/context.py", line 220, in put
    return self._requester.method(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/dirac/diracos/lib/python3.11/site-packages/fts3/rest/client/request.py", line 128, in method
    self._handle_error(url, response.status_code, response.text)
  File "/opt/dirac/diracos/lib/python3.11/site-packages/fts3/rest/client/request.py", line 75, in _handle_error
    raise ClientError("Bad request: " + message)
fts3.rest.client.exceptions.ClientError: Client error: Client error: Bad request: Could not process the proxy: Failed to verify the proxy, maybe signed with the wrong private key?
2024-06-10 13:38:45 UTC DataManagement/FTS3Agent/treatOperation/121 ERROR: Could not get context {'OK': False, 'Errno': 0, 'Message': "ClientError('Client error: Bad request: Could not process the proxy: Failed to verify the proxy, maybe signed with the wrong private key?')", 'CallStack': ['  File "/opt/dirac/diracos/lib/python3.11/threading.py", line 1002, in _bootstrap\n    self._bootstrap_inner()\n', '  File "/opt/dirac/diracos/lib/python3.11/threading.py", line 1045, in _bootstrap_inner\n    self.run()\n', '  File "/opt/dirac/diracos/lib/python3.11/threading.py", line 982, in run\n    self._target(*self._args, **self._kwargs)\n', '  File "/opt/dirac/diracos/lib/python3.11/multiprocessing/pool.py", line 125, in worker\n    result = (True, func(*args, **kwds))\n', '  File "/opt/dirac/DIRAC/src/DIRAC/DataManagementSystem/Agent/FTS3Agent.py", line 484, in _treatOperation\n    res = self.getFTS3Context(ftsJob.username, ftsJob.userGroup, ftsServer, threadID=threadID)\n', '  File "/opt/dirac/DIRAC/src/DIRAC/DataManagementSystem/Agent/FTS3Agent.py", line 206, in getFTS3Context\n    res = FTS3Job.generateContext(ftsServer, proxyFile, lifetime=self.proxyLifetime)\n', '  File "/opt/dirac/DIRAC/src/DIRAC/DataManagementSystem/Client/FTS3Job.py", line 792, in generateContext\n    return S_ERROR(repr(e))\n']}

Possibly one for @chaen to comment ?

chaen commented 3 months ago

This would be the proxy used to connect to the FTS3 server. Printouts in FTS3Job.generateContext would tell you more.

marianne013 commented 3 months ago

So that gets me:

https://fts00.grid.hep.ph.ic.ac.uk:8446
/opt/dirac/work/DataManagement/FTS3Agent/1718028385_daniela.bauer_lz_user_fts00.grid.hep.ph.ic.ac.uk_Thread-18 (worker).pem
43200

Looking at the name, it's an LZ proxy, and it's in my name, which is as expected, the lifetime is 43200 s. (I just printed the input to FTS3Job.generateContext). I ran: openssl x509 -in "/opt/dirac/work/DataManagement/FTS3Agent/1718028385_daniela.bauer_lz_user_fts00.grid.hep.ph.ic.ac.uk_Thread-18 (worker).pem" -noout -text and got no complaint. Any idea what else I can check ? @sfayer: FYI

chaen commented 3 months ago

The error would be on the server side, do you have access to the logs ? Is it a recent release of FTS? It could be that the proxy is too short, I think the minimum is now 2048

marianne013 commented 3 months ago

The proxy is 2048. I have access to the server logs, but I can't find any mention in there, unless I am looking in the wrong place. It is a very recent version of FTS, in fact it has been very recently been upgraded to Rocky9 :-S

chaen commented 3 months ago

Ah ! is there Sha1 anywhere in the chain ?

chaen commented 3 months ago

https://twiki.cern.ch/twiki/bin/view/LCG/EL9vsSHA1CAs

marianne013 commented 3 months ago

No, it's an honest to goodness UK CA based proxy which is not on the naughty list.

marianne013 commented 3 months ago

So Simon just found a SHA1 somewhere (at the very end of the chain). At what point (version) did diracos stop generating those ?

marianne013 commented 3 months ago

We were running diracos2.38 which had 'sha1' at /opt/dirac/diracos/lib/python3.1/site-packages/fts3/rest/client/delegator.py (~ line 237). Upgrading to diracos2.42 fixed the issue. Having said this 2.38 is from February, why was this still around in February this year ?!