TheRoddyWMS / BatchEuphoria

A library to access different kinds of cluster backends
MIT License

PbsJobManager failed to read runtime statistics #55

Closed askask closed 7 years ago

askask commented 7 years ago

This exception occurred today in OTP:

2017-09-15 09:34:35,717 [taskScheduler-8] INFO  scheduler.ClusterJobMonitoringService  - 15227597 finished on Realm Realm 28110774 DKFZ_13.1 DATA_MANAGEMENT production
2017-09-15 09:34:35,972 [taskScheduler-8] WARN  scheduler.ClusterJobMonitoringService  - Failed to fill in runtime statistics for Cluster job 15227597 on Realm 28110774 DKFZ_13.1 DATA_MANAGEMENT production with user otp
java.time.format.DateTimeParseException: Text cannot be parsed to a Duration
        at java.time.Duration.parse(Duration.java:412)
        at de.dkfz.roddy.execution.jobs.cluster.pbs.PBSJobManager$_processQstatOutput_closure15.doCall(PBSJobManager.groovy:776)
        at de.dkfz.roddy.execution.jobs.cluster.pbs.PBSJobManager.processQstatOutput(PBSJobManager.groovy:735)
        at de.dkfz.roddy.execution.jobs.cluster.pbs.PBSJobManager.queryExtendedJobStateById(PBSJobManager.groovy:591)
        at de.dkfz.tbi.otp.job.processing.ClusterJobSchedulerService.retrieveAndSaveJobStatistics(ClusterJobSchedulerService.groovy:153)
        at de.dkfz.tbi.otp.job.scheduler.ClusterJobMonitoringService$_check_closure5$_closure9.doCall(ClusterJobMonitoringService.groovy:132)
        at de.dkfz.tbi.otp.job.scheduler.ClusterJobMonitoringService$_check_closure5.doCall(ClusterJobMonitoringService.groovy:121)
        at de.dkfz.tbi.otp.job.scheduler.ClusterJobMonitoringService.check(ClusterJobMonitoringService.groovy:116)
        at de.dkfz.tbi.otp.job.scheduler.SchedulerService.clusterJobCheck(SchedulerService.groovy:301)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)

askask commented 7 years ago

The duration string was "119:02:42"
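
For context, `java.time.Duration.parse` only accepts ISO-8601 durations such as `PT119H2M42S`, so a colon-separated value like the one above is rejected with exactly this exception. The following is a minimal sketch of one way to handle such a string, assuming the value is always `HH:MM:SS` with an hour field that may exceed 23. The class `PbsDurationSketch` and the method `parseColonSeparated` are hypothetical names used for illustration; they are not part of BatchEuphoria and do not necessarily reflect how the bug was actually fixed.

```java
import java.time.Duration;
import java.time.format.DateTimeParseException;

public class PbsDurationSketch {

    // Hypothetical helper (not BatchEuphoria API): converts "HH:MM:SS" to a Duration.
    // Assumes exactly three colon-separated numeric fields; hours may be larger than 23.
    public static Duration parseColonSeparated(String text) {
        String[] parts = text.trim().split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected HH:MM:SS but got: " + text);
        }
        long hours = Long.parseLong(parts[0]);
        long minutes = Long.parseLong(parts[1]);
        long seconds = Long.parseLong(parts[2]);
        return Duration.ofHours(hours).plusMinutes(minutes).plusSeconds(seconds);
    }

    public static void main(String[] args) {
        String raw = "119:02:42";

        // Duration.parse expects ISO-8601 text, so the raw value fails here.
        try {
            Duration.parse(raw);
        } catch (DateTimeParseException e) {
            System.out.println("Duration.parse rejected: " + raw);
        }

        // The manual conversion yields 119 hours, 2 minutes, 42 seconds.
        Duration parsed = parseColonSeparated(raw);
        System.out.println(parsed);              // PT119H2M42S
        System.out.println(parsed.getSeconds()); // 428562
    }
}
```

If the scheduler can also emit other layouts (for example with a leading day field), the length check would need to be extended accordingly; that case is not covered by this sketch.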

vinjana commented 7 years ago

Bug should be fixed with commit 4309b77.