lanl / BEE

Need a test for when "beeflow core stop" is run while a workflow is running for slurm. #849

Open pagrubel opened 3 weeks ago

pagrubel commented 3 weeks ago

We need some type of automatic test for when "beeflow core stop" is run while a workflow is running. This is the manual test:

To test this for CLI (on darwin) you may want two screens both in the poetry env for this branch:

1. Make sure use_commands in the slurm portion of bee.conf is True.
2. Start beeflow: "beeflow core start".
3. Submit a workflow. The clamr example works well; the checkpoint workflow is also good.
4. Run "watch -n 2 query " to make sure the task is pending or running. It is helpful to do this in a separate screen and keep it up even when you stop beeflow or restart it.
5. Verify that the clamr step is running and get the job id via "squeue -u " or the task manager log.
6. Issue "beeflow core stop" while the clamr step is running.
7. Run "watch -n 5 show job " and wait until it gives a no-job type error. This may take too long on some systems.
8. Start beeflow up again: "beeflow core start". The query screen will show that clamr has completed.
9. The workflow has been paused, so to finish it just submit the "beeflow resume " command.
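An automated version of this test would probably start by checking which Slurm mode bee.conf selects, since that decides which of the two procedures applies. The following is only a sketch: it assumes bee.conf is INI-style and that "the slurm portion" is a section literally named `slurm`. The helper name `slurm_use_commands` is hypothetical and not part of beeflow.

```python
import configparser


def slurm_use_commands(conf_text: str) -> bool:
    """Return the use_commands value from the [slurm] section of bee.conf text.

    Assumption: bee.conf is INI-formatted and the section is named "slurm".
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # getboolean accepts true/false, yes/no, on/off and 1/0, case-insensitively.
    return parser.getboolean("slurm", "use_commands")


# Illustrative config fragment, not copied from a real bee.conf.
sample = """
[slurm]
use_commands = True
"""
print(slurm_use_commands(sample))
```

A test could read the real file, pass its text to this helper, and skip the CLI variant when the result is False (and the slurmrestd variant when it is True).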

To test for slurmrestd:

1. Make sure use_commands in the slurm portion of bee.conf is False.
2. Start beeflow: "beeflow core start".
3. Submit a workflow. The clamr example works well; the checkpoint workflow is also good.
4. Run "watch -n 2 query " to make sure the task is pending or running. It is helpful to do this in a separate screen and keep it up even when you stop beeflow or restart it.
5. Verify that the clamr step is running and get the job id via "squeue -u " or the task manager log.
6. Issue "beeflow core stop" while the clamr step is running.
7. Run "watch -n 5 squeue -u " and wait until the clamr job is off the screen.
8. Start beeflow up again: "beeflow core start". The query screen will show that clamr has completed.
9. The workflow has been paused, so to finish it just submit the "beeflow resume " command.
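The stop, wait, and restart part of the manual procedure could be automated roughly as follows. This is a sketch under stated assumptions, not a finished test: it polls `squeue -h -j <job id>` instead of watching `squeue -u `, it assumes "beeflow core stop" and "beeflow core start" can run non-interactively (if the stop command asks for confirmation, input would have to be supplied), and the function names are hypothetical. The command runner and sleep function are injectable so the logic can be exercised without a Slurm cluster.

```python
import subprocess
import time


def job_in_queue(job_id, run=subprocess.run):
    """Return True while squeue still lists the job."""
    result = run(["squeue", "-h", "-j", str(job_id)],
                 capture_output=True, text=True)
    # With -h (no header), empty output means the job is no longer listed.
    # A non-zero exit code is also treated as "job is gone" (assumption:
    # squeue fails for job ids Slurm no longer knows about).
    return result.returncode == 0 and bool(result.stdout.strip())


def wait_for_job_exit(job_id, attempts=120, interval=5,
                      run=subprocess.run, sleep=time.sleep):
    """Poll until the job leaves the queue; return False if attempts run out."""
    for _ in range(attempts):
        if not job_in_queue(job_id, run=run):
            return True
        sleep(interval)
    return False


def stop_wait_restart(job_id, run=subprocess.run, sleep=time.sleep):
    """Stop beeflow mid-workflow, wait for the Slurm job, then restart."""
    run(["beeflow", "core", "stop"], check=True)
    finished = wait_for_job_exit(job_id, run=run, sleep=sleep)
    run(["beeflow", "core", "start"], check=True)
    return finished
```

After `stop_wait_restart` returns True, a test would still need to confirm that the workflow is paused with the clamr step completed, and then resume it with the "beeflow resume " command as in the manual steps.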