dmwm / PHEDEX

CMS data-placement suite

Agent not finding "self" #1038

Closed by nikmagini 7 years ago

nikmagini commented 7 years ago

On 2016-07-26, from 03:16 to 14:32, the central BlockAllocator agent stopped doing any work and instead printed the following alert on every cycle:

2016-07-26 14:31:33: BlockAllocator[13191]: alert: database error: Can't locate object method "Fatal" via package "self" (perhaps you forgot to load "self"?) at /data/ProdNodes/PHEDEX/perl_lib/PHEDEX/Core/Agent/DB.pm line 139.

The underlying cause seems to be the same as in https://github.com/dmwm/PHEDEX/issues/1012. In this case, however, the agent could not even terminate itself, because the error-handling call failed to find "self".

Restarting the agent to check if this is cleared automatically.

nikmagini commented 7 years ago

My fault: I introduced a syntax error when I committed the original fix for https://github.com/dmwm/PHEDEX/issues/1012 to GitHub. I have now added a proper fix in https://github.com/dmwm/PHEDEX/commit/e0593fc461345cb09cfa2c5d593cc2e3b518f863. I am also manually patching the central agents on vocms0214/0211.
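For anyone hitting the same alert: in Perl, this exact message typically appears when a method is called on the bareword self instead of the variable $self, because Perl then treats "self" as a class name. The sketch below reproduces that general pattern under that assumption only. Demo::Agent, broken_handler and fixed_handler are hypothetical names made up for illustration and are not PhEDEx code; the real change is in the commit linked above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in class for illustration; not a PhEDEx module.
package Demo::Agent;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Stand-in for an agent's fatal-error method: it just dies with the message.
sub Fatal {
    my ($self, @msg) = @_;
    die "fatal: @msg\n";
}

sub broken_handler {
    my ($self) = @_;
    # Missing sigil: the bareword "self" is parsed as a class name, so Perl
    # looks for a package called "self" and fails at run time, even under strict.
    self->Fatal("database error");
}

sub fixed_handler {
    my ($self) = @_;
    # With the sigil, the method is resolved on the object as intended.
    $self->Fatal("database error");
}

package main;

my $agent = Demo::Agent->new;

eval { $agent->broken_handler };
my $broken_err = $@;    # Perl's "Can't locate object method" error

eval { $agent->fixed_handler };
my $fixed_err = $@;     # the message raised by Fatal itself

print "broken: $broken_err";
print "fixed:  $fixed_err";
```

Note that "use strict" does not catch the bareword form at compile time, which is why a mistake of this kind only shows up when the error path is actually exercised.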

nikmagini commented 7 years ago

Fix released in 4.2.0. I also added more detailed logging for this error: https://github.com/dmwm/PHEDEX/commit/995b55e90dbf959b53c5ebe60346dc5a0c712ab1