Closed by vyasr 4 years ago
There is one specific case that will require some extra work to enable, namely the offset-based bundling of multiple MPI jobs onto one node on Stampede. Currently we enable this by looping and calling `python project.py exec` in the template script. @b-butler and I discussed that to enable this properly we will probably need to remove all such looping from the template and instead handle it directly in `run`. However, this means that we may also need to generalize the way we generate the commands executed by `python project.py run`, so that environments can naturally apply their own modifications to the run command, such as adding the offset for Stampede. One possibility is to create a `ComputeEnvironment.run`, analogous to `ComputeEnvironment.submit`, that would allow the command to be overridden by individual environments. I haven't spent much time thinking about this yet but wanted to log my thoughts before forgetting; I'm open to other ideas on how to implement this as well.
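To make the `ComputeEnvironment.run` idea concrete, here is a rough sketch of an overridable run hook. It is only an illustration under assumptions: the classes below are stand-ins and not the actual flow classes, the `run` signature, the `offset` keyword, and the `bundle` helper are all hypothetical, and the `ibrun -n ... -o ...` form is assumed from how the Stampede launcher is normally invoked.

```python
class ComputeEnvironment:
    """Stand-in for an environment base class (hypothetical, not the flow API)."""

    @classmethod
    def run(cls, command, directives, **kwargs):
        """Return the shell command for one operation; subclasses may override."""
        nranks = directives.get("nranks", 1)
        if nranks > 1:
            # Generic MPI launch; the real prefix would come from the environment.
            return f"mpiexec -n {nranks} {command}"
        return command


class OffsetBundlingEnvironment(ComputeEnvironment):
    """Hypothetical environment that packs several MPI jobs onto one node."""

    @classmethod
    def run(cls, command, directives, offset=0, **kwargs):
        nranks = directives.get("nranks", 1)
        # Assumed launcher syntax: -n is the rank count, -o the task offset.
        return f"ibrun -n {nranks} -o {offset} {command}"


def bundle(env, operations):
    """Build one command per operation, advancing the offset by each rank count."""
    commands, offset = [], 0
    for command, directives in operations:
        commands.append(env.run(command, directives, offset=offset))
        offset += directives.get("nranks", 1)
    return commands
```

With something like this, the looping would live in `bundle` (or in `run` itself) rather than in the template, and only the environment would need to know about offsets.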
Partially resolved by #208. The question I raised in my previous comment remains to be addressed, but that probably will be done as part of #114.
The part of this that remains unresolved is currently very specific to Stampede2 and is separately documented in #250, so this issue can be closed.
Feature description
Currently, `FlowProject.run` ignores execution directives, i.e. operations marked with `@flow.directives(nranks=8)` will run serially unless they are actually submitted to a scheduler. Instead, `FlowProject.run` should properly handle these directives.
Proposed solution
This change depends on #174, which will provide the required MPI commands (and potentially other execution directive-specific commands, e.g. for OpenMP). Once that issue is resolved, we can change `run` to call the necessary function to modify the run command. The other important change is that `FlowProject.run` will need to decide whether or not to fork based on an additional set of criteria that accounts for these directives.
Additional context
Making this change is critical to enabling groups (#114).
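As a starting point for the fork decision mentioned in the proposed solution, the check could be as small as the sketch below. The function name `needs_fork` and the `omp_num_threads` key are assumptions made for illustration; only `nranks` appears in this issue, and the real criteria may well be broader.

```python
def needs_fork(directives):
    """Return True if an operation should run in a separate process.

    Hypothetical helper: an operation that asks for more than one MPI rank
    or more than one OpenMP thread cannot simply be called in-process.
    """
    nranks = directives.get("nranks", 1)
    nthreads = directives.get("omp_num_threads", 1)  # assumed directive name
    return nranks > 1 or nthreads > 1
```

An operation with no directives would then keep running in-process as it does today, while one marked with `nranks=8` would be forked and launched through whatever command #174 ends up providing.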