dCache / dcache

dCache - a system for storing and retrieving huge amounts of data, distributed among a large number of heterogeneous server nodes, under a single virtual filesystem tree with a variety of standard access methods
https://dcache.org

Srmmanager credentials corruption verification #7484

Open cfgamboa opened 9 months ago

cfgamboa commented 9 months ago

Dear all

We have been observing staging transfers failing with this type of error:

Error reason: STAGING [70] srm-ifce err: Communication error on send, err: [SE][BringOnline][] httpg://dcfrontend.usatlas.bnl.gov:8443/srm/managerv2: CGSI-gSOAP running on fts-atlas-008.cern.ch reports Error initializing context GSS Major Status: Authentication Failed GSS Minor Status Error Chain: globus_gsi_gssapi: SSL handshake problems OpenSSL Error: s3_clnt.c:1264: in library: SSL routines, function ssl3_get_server_certificate: certificate verify failed globus_gsi_callback_module: Could not verify credential globus_gsi_callback

While we can confirm that the service certificates are valid, the state of the SRM credentials and their related entries in the database is not clear to us.

We started to observe this after a site-wide network intervention. The SE was not brought down during the network intervention.

Is there a procedure to check the state of the credentials, or to clean or reset them?

All the best, Carlos
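
The error above is raised while the client verifies the server certificate during the SSL handshake, so one cross-check that does not depend on dCache is to repeat a verified handshake from a client host and see what the endpoint presents. The following is a minimal sketch, not a dCache procedure: it assumes Python 3.7+ and its standard `ssl` module, and it assumes the grid CA certificates are installed in `/etc/grid-security/certificates` (the conventional location, not stated in this thread). The helper names `days_until_expiry` and `check_endpoint` are hypothetical. It performs a plain TLS handshake, which may behave differently from the GSI (`httpg`) handshake that FTS uses, and it does not inspect the credential entries in the SRM database.

```python
import socket
import ssl
import time

# Assumption: conventional grid CA directory; adjust for your host.
GRID_CA_DIR = "/etc/grid-security/certificates"


def days_until_expiry(not_after, now=None):
    """Days left before a certificate's notAfter time (negative if expired).

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400.0


def check_endpoint(host, port, ca_dir=GRID_CA_DIR, timeout=10.0):
    """Attempt a verified TLS handshake and return a one-line summary."""
    # PROTOCOL_TLS_CLIENT enables certificate and hostname verification.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.load_verify_locations(capath=ca_dir)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
    except ssl.SSLCertVerificationError as exc:
        # Same class of failure as "certificate verify failed" in the report.
        return "verify failed: " + exc.verify_message
    except OSError as exc:
        # Covers connection errors and other SSL errors.
        return "connection or handshake error: " + str(exc)
    remaining = days_until_expiry(cert["notAfter"])
    return "handshake ok, certificate expires in %.1f days" % remaining
```

For the endpoint in the error message, this could be called as `print(check_endpoint("dcfrontend.usatlas.bnl.gov", 8443))`. If several hosts sit behind that name, running it repeatedly or against each host individually may show whether only some of them present a chain that fails verification.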

lemora commented 9 months ago

Hi Carlos.

Do you see errors in dCache's srmmanager logs?

Kind regards Lea

cfgamboa commented 9 months ago

Hey Lea,

No, we were not able to observe any errors there.

All the best, Carlos

On Dec 21, 2023, at 11:25 AM, Lea @.***> wrote:

Hi Carlos.

Do you see errors in dCache's srmmanager logs?

Kind regards Lea


cfgamboa commented 9 months ago

For reference, here is the FTS transfer job: https://fts3-atlas.cern.ch:8449/fts3/ftsmon/#/job/204dccd0-9fc9-11ee-8147-fa163eb580d6
