Closed wihobbs closed 2 years ago
I'm setting the base branch as python3
to isolate the changes that I made. When I have the okay from @dawson6 I'll work out the kinks between my branch and main
.
Thanks, @jameshcorbett, and sorry for the delay in pushing this -- still finding my way around in git. I think I addressed all of your comments (hopefully more to come ;))
If you wanted to, you could then read the jobspec that Flux constructed, and verify that the numbers check out.
Upon further thought I think that's a good idea but I would actually use Flux's jobspec class to read in the jobspec that was constructed. Then you don't need to hard-code any assumptions about jobspec layout. And I had forgotten that you won't be able to import the FluxScheduled
class unless you have the flux
package, so the check of whether Flux is installed is somewhat redundant.
Thanks @jameshcorbett. Using the jobspec class to test seems like an excellent idea and takes it one step further than I thought. I will work on this tomorrow and see if I can get it running in one of the Flux Docker containers. I will test both valid and invalid jobspecs.
And, thinking more deeply, I wonder if validating a jobspec should be part of calculateCommandList
, so that invalid jobs are automatically skipped by ATS and never submitted to the queue. I will add this to our discussion for Thursday and we can see what @davidbloss and @dawson6 would prefer.
Update: the jobspec verification wouldn't provide any additional functionality. On reflection, it would stop jobs from being submitted to the queue but this wouldn't provide a justifiable speedup compared to the amount of implementation work/code.
Well groan. I'm sure I did something wrong to make the working tree this out of whack, but this should be aligned with the main
branch.
Making a note to myself that this is what I did and in the future I should not do this again (it seems like rebase
should do this better but I must be using it wrong)
(fluke_ats) [hobbs17@corona173:ATS]$ git config pull.rebase true
(fluke_ats) [hobbs17@corona173:ATS]$ git pull -X theirs upstream main
Tomorrow I'll go through machines.py
and the commit history and make sure none of the commits from @dawson6 and @davidbloss change any of my implementation for Flux (but at first glance they shouldn't). And find a way to make my branch's commit history...less yucky.
All right, I believe I fixed the commit so that it will not have merge conflicts and can be merged into main once WIP is dropped. Making a note here that these are the pending actions for dropping WIP:
atsflux
binding with additional functionality I believe at this point I have addressed all of the must-haves for ATS/Flux but will run more tests on Monday to be sure.
I removed work-in-progress because I believe we are good for final review here. Thanks @dawson6, @jameshcorbett, and @davidbloss -- I believe I have addressed all of your comments from here and our meetings.
Per our discussion today with @dawson6, @davidbloss, and @jameshcorbett, this PR is opened to test the compatibility of a work-in-progress Flux executor and monitor the progress of expanding ATS compatibility for Flux.
Currently, the Flux executor has option compatibility for:
To test this on Ruby (or another slurm-runnning LC cluster), install ATS in a virtual environment and request an allocation from slurm:
salloc -N 3 -p pdebug -A cbronze -t 60
And then start a flux instance using
srun
:srun -N 3 --pty flux start