From our experience so far with iterative workflows, it is clear that a large number of jobs must sometimes be submitted (namely, the number of iterated tasks multiplied by the number of iterations specified, plus the number of non-iterated tasks). This could be problematic if the scheduler configuration limits the maximum number of jobs a user may schedule. Moreover, there is currently no way to continue iterating until some criterion is met. To solve both problems, we should implement a server process that can submit further iterations as existing iterations complete.
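To make the scale concrete, here is the job count for a hypothetical workflow (the numbers are illustrative only, not taken from any real configuration):

```python
# Hypothetical workflow shape, for illustration only.
num_iterated_tasks = 10
num_iterations = 50
num_non_iterated_tasks = 5

# Total jobs = iterated tasks x iterations + non-iterated tasks.
total_jobs = num_iterated_tasks * num_iterations + num_non_iterated_tasks
print(total_jobs)  # 505
```

Even this modest workflow exceeds a typical per-user queued-job limit of a few hundred, which is why submitting everything up front does not scale.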
The server should be started when a workflow is submitted (if not already started by an existing workflow) and should be killed when no workflows remain running. On HPC, it would run on the login node, since it must be able to submit jobs.
For now, this would more aptly be called a "helper process" than a server, since there won't be any clients connecting with requests to serve. Instead, we can use a polling procedure: in each polling step, a number of items can be checked. For example: of the known running workflows, do any need more jobs submitted to the scheduler?
(A "proper" implementation of this would be a daemon on Unix and a service on Windows, but a simple polling implementation should be sufficient for now.)
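A minimal sketch of the polling loop described above. The `workflows` registry and `scheduler` interface (with its `num_queued`/`submit` methods) are hypothetical stand-ins, as is the queued-job cap; the real names and limits would come from the actual workflow and scheduler code:

```python
import time

# Illustrative values only; a real helper would read these from the
# scheduler configuration.
MAX_QUEUED_JOBS = 100
POLL_INTERVAL_SECONDS = 30


def poll_loop(workflows, scheduler, interval=POLL_INTERVAL_SECONDS):
    """Run one polling cycle per interval until no workflows remain running.

    `workflows` is a hypothetical collection of workflow objects and
    `scheduler` a hypothetical scheduler interface; both are stand-ins
    for whatever the real implementation provides.
    """
    while True:
        running = [wf for wf in workflows if wf.is_running()]
        if not running:
            # No running workflows: the helper process can exit.
            break

        for wf in running:
            # Submit further iterations only while the user's queued-job
            # count stays under the scheduler's limit.
            while (
                wf.has_pending_iterations()
                and scheduler.num_queued() < MAX_QUEUED_JOBS
            ):
                scheduler.submit(wf.next_iteration())

        time.sleep(interval)
```

Because the loop exits as soon as no workflows are running, the "killed when no workflows remain" behaviour falls out naturally; a later daemon/service implementation could reuse the same per-cycle logic.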