Closed ofaaland closed 1 year ago
@adammoody I'm not sure where we should have the discussion about the python version dependency - maybe at our next team meeting? But I wanted to call it out.
@adammoody I'm not sure where we should have the discussion about the python version dependency - maybe at our next team meeting? But I wanted to call it out.
We have python3.8 on toss systems, and it looks from https://forums.centos.org/viewtopic.php?t=74796 like python 3.8 is available in any rhel 8/ centos 8 system, which presumably means available nearly everywhere since rhel tracks so far behind on such things. But it would mean changing all the python scripts from
#! /usr/bin/env python3
...
to
#! /usr/bin/env python3.8
...
I could also investigate "getopt" and "optparse" python modules, which are both older than argparse and might have some other way to accomplish this without requiring a newer python.
... change from /usr/bin/env python3 to /usr/bin/env python3.8
On my current branch for testing I just made that change, to confirm everything works, but presumably that would be better handled via cmake at build time.
Added commit https://github.com/LLNL/scr/pull/520/commits/c8a08401b9c05fb5063db79e7ba5aa296514df52 to detect the python interpreter at build time using cmake and then use that. We can discuss other options.
Updated commit fixed incorrect indenting of return statement in launchers/flux.py:parsefluxargs().
@adammoody I think the root of the issue is that for flux, scr is trying to use the flux python interface to launch a job rather than just running "flux mini run ..." as specified by the user. This saves creating one process, but then requires we parse the command line, in the same way "flux mini" would parse it. This seems fragile, and therefore a mistake, to me.
I suspect the (python) launchers for slurm, etc. in scr do not try to do this, and that's why I ran across this problem when I started testing flux. I'll confirm that.
@adammoody OK, I confirmed that the scr flux launcher calls a flux-provided python interface to launch the job, and that interface requires we provide # of nodes, # of tasks per node, etc. explicitly. Since the scr launcher launchruncmd() takes a command line type string or a list of strings (ie argv) as its input, this means that by using the flux python interface to launch, we have to parse the user's args to extract nodes, tasks, etc.
The other launchers (ie slurm) just call subprocess.Popen() and pass the user's argv list in, requiring no parsing.
I think it would be better for the flux launcher's implementation of launchruncmd() to use subprocess.Popen() like the other launchers. We could provide an additional interface that allows the user to specify nodes, tasks-per-node, etc. that then calls the flux interface.
Thanks, @ofaaland . Yes, I agree. I'm also in favor of avoiding parsing if we can do it. Just passing the user's string to the underlying command will be more reliable and future-proof.
Superseded by #523
The handwritten argument parsing code handled the form --option=value
but not equivalent forms such as -o value -ovalue --option value
Use argparse instead. Depends on python >= 3.7 to handle flux options before user script argument and user script options (if any).
Update list of arguments handled to include "exclusive" and "gpus-per-task" which are required by the flux interface.