Closed StevenCTimm closed 5 months ago
It's not clear whether this is an HTCondor issue or a JustIN issue, but investigation shows that 90-95% of the jobs that JustIN marks as "stalled" are in fact running to completion. More investigation is needed to see why.
This was a transitory problem: the HTCondor schedds were overloaded, and this was compounded by faulty checking of HTCondor return codes, which was fixed in 01.01.