aiidateam / aiida-hyperqueue

AiiDA plugin for the HyperQueue metascheduler.
http://aiida-hyperqueue.readthedocs.io/
MIT License

Use both time-limit and time-request #9

Closed giovannipizzi closed 2 years ago

giovannipizzi commented 2 years ago

The code comment linked here justifies why only time-request was used: https://github.com/aiidateam/aiida-hyperqueue/blob/e33376c04b456d4e7f440fff0356f66fbd7c60da/aiida_hyperqueue/scheduler.py#L125-L130

However, I think both should be used, and set to the same value. Users expect a scheduler to kill a job that runs past its requested time. This is even more important when jobs share a node: I just had a case in which, for some reason, all jobs on a node got stuck and stopped producing output, even though they were still using 100% of the CPU. They blocked the worker until the end of its wall time. This means that if, e.g., the worker has 24h of wall time, 24h are wasted even if the job should have finished within 10 minutes. It's better to kill the stuck job and let other jobs run.
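
To make the proposal concrete, here is a minimal sketch of how both options could be derived from a single wall-time value. It is not the plugin's actual code: the helper name `build_time_flags` and its argument are hypothetical, and I'm assuming the options are passed to `hq submit` as `--time-request` and `--time-limit` with a duration written as a number of seconds followed by `s`. Please check the duration format against the HyperQueue docs.

```python
def build_time_flags(max_wallclock_seconds):
    """Return submit flags that set time-request and time-limit to the same value.

    Hypothetical helper for illustration only; the flag spelling and the
    "<N>s" duration format are assumptions, not taken from the plugin code.
    """
    if max_wallclock_seconds is None:
        # No wall time requested: do not emit either flag.
        return []

    seconds = int(max_wallclock_seconds)
    if seconds <= 0:
        raise ValueError("max_wallclock_seconds must be a positive integer")

    duration = f"{seconds}s"
    return [
        # time-request: only schedule the job on a worker with enough time left.
        f"--time-request={duration}",
        # time-limit: kill the job if it is still running after this long.
        f"--time-limit={duration}",
    ]


if __name__ == "__main__":
    # A job expected to finish within 10 minutes.
    print(" ".join(["hq", "submit"] + build_time_flags(600) + ["./run.sh"]))
```

With both flags set to the same value, a job that is expected to take 10 minutes but gets stuck would be killed after 10 minutes instead of holding the worker until the end of its 24h allocation.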