DUNE / dist-comp

Action items for DUNE distributed computing, and common scripts that are used.

Why is the schedd on Justin-prod-sched01 crashing and going in and out of the pool? #128

Closed by StevenCTimm 4 months ago

StevenCTimm commented 4 months ago

It appears to be stable now with MAX_JOBS_RUNNING = 16000 and 48 GB of memory. A setting of 22000 was too much for that amount of memory.

Chris is going to work on putting up a second one.
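
For a rough sense of the numbers above, here is a back-of-envelope sketch in Python. It assumes memory use on the schedd host grows roughly linearly with the number of running jobs, which was not measured in this issue; the function names are made up for illustration, and the only inputs taken from the thread are 48 GB, 16000, and 22000.

```python
# Back-of-envelope estimate only. ASSUMPTION: memory on the schedd host
# scales linearly with the number of running jobs. Real usage depends on
# the per-job processes and the schedd itself, so treat this as a rough guide.

def mem_per_job_mb(total_mem_gb: float, max_jobs_running: int) -> float:
    """Average memory budget per running job in MB (1 GB = 1024 MB)."""
    return total_mem_gb * 1024 / max_jobs_running


def mem_needed_gb(per_job_mb: float, max_jobs_running: int) -> float:
    """Total memory in GB needed to keep the same per-job budget."""
    return per_job_mb * max_jobs_running / 1024


# Per-job budget at the setting reported as stable (16000 jobs, 48 GB).
stable_budget_mb = mem_per_job_mb(48, 16000)

# Memory that the same per-job budget would imply at 22000 jobs.
needed_at_22000_gb = mem_needed_gb(stable_budget_mb, 22000)

print(f"per-job budget at 16000 jobs: {stable_budget_mb:.2f} MB")
print(f"memory implied at 22000 jobs: {needed_at_22000_gb:.1f} GB")
```

Under that linear assumption, 22000 running jobs would need noticeably more than the 48 GB available, which is consistent with the instability seen at that setting and with the plan to add a second schedd rather than raise the limit.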