DUNE / dist-comp

Action items for DUNE distributed computing, and common scripts that are used.

All AWT writes to MANCHESTER and QMUL failing with error code 99 #130

Closed by StevenCTimm 4 months ago

StevenCTimm commented 4 months ago

Normally we would look at the wrapper job log to see where these are failing, but none of the logs are available for some reason. See

https://justin.dune.hep.ac.uk/dashboard/?method=show-job&jobsub_id=174550.0@justin-prod-sched01.dune.hep.ac.uk

The stderr that would normally be there is missing.

StevenCTimm commented 4 months ago

In the case of Manchester, at least, it is possible that this is a disk-full error. That cannot be the cause at QMUL, where there is plenty of space.

wyuan-uoe commented 4 months ago

Writes to MANCHESTER work when I run xrdcp or gfal-copy by hand.

I'll issue a ticket to QMUL.

Andrew-McNab-UK commented 4 months ago

Looks like this was resolved by fixing #131.

StevenCTimm commented 4 months ago

Yes, QMUL is working again too; they answered the ticket.