LLNL / merlin

Machine Learning for HPC Workflows
MIT License

feature/vlauncher #447

Closed bgunnar5 closed 11 months ago

bgunnar5 commented 11 months ago

Added a new feature requested by @koning for the ICECap team: the $(VLAUNCHER) command. Instead of reading the allocation configuration from the step.run block, it reads specific shell variables. The variables that it reads are:

Currently, the GPU option only works with Flux. I'll be modifying the script adapters soon, which will fix this.

Also, I fixed a minor naming error in iterative workflows. The bug caused the names of the .out, .partial, and .expanded files to have .out, .partial, or .expanded (whichever was used in the iteration step) appended again on every iteration.
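To illustrate the naming fix, here is a minimal sketch of an idempotent suffix helper. It is not the code in this PR; the function name `with_suffix_once` and the `KNOWN_SUFFIXES` tuple are hypothetical, and the suffix list is simply the three file types named above.

```python
# Hypothetical list, taken from the file types named in this comment.
KNOWN_SUFFIXES = (".out", ".partial", ".expanded")


def with_suffix_once(filename: str, suffix: str) -> str:
    """Return `filename` ending in `suffix`, without stacking known suffixes."""
    base = filename
    stripped = True
    # Keep stripping until no known suffix is left at the end of the name,
    # so names already damaged by earlier iterations are repaired too.
    while stripped:
        stripped = False
        for known in KNOWN_SUFFIXES:
            if base.endswith(known):
                base = base[: -len(known)]
                stripped = True
    return base + suffix
```

Calling the helper on its own output returns the same name, so repeated iterations no longer grow the filename.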

koning commented 11 months ago

@bgunnar5 Thanks, this is a good start. I added a new branch, PR_447, that uses the variable names instead of the variable values. The values may not be set until the script runs, so they cannot be substituted when the task is defined.
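A rough sketch of the difference between the two approaches, not the code on either branch. The variable names `MERLIN_NODES` and `MERLIN_PROCS` and both function names are assumptions for illustration (the thread only confirms the `MERLIN_*` prefix), and `srun` with `-N`/`-n` stands in for whatever launcher the script adapter actually emits.

```python
# Hypothetical variable-to-flag mapping; only the MERLIN_* prefix is confirmed.
VLAUNCHER_VARS = {"MERLIN_NODES": "-N", "MERLIN_PROCS": "-n"}


def expand_by_value(env: dict) -> str:
    """Substitute values at task-definition time.

    Variables that are not set yet are silently dropped.
    """
    parts = ["srun"]
    for name, flag in VLAUNCHER_VARS.items():
        if name in env:
            parts += [flag, str(env[name])]
    return " ".join(parts)


def expand_by_name() -> str:
    """Emit shell references so the shell resolves them when the script runs."""
    parts = ["srun"]
    for name, flag in VLAUNCHER_VARS.items():
        parts += [flag, "${" + name + "}"]
    return " ".join(parts)
```

With an empty environment at definition time, `expand_by_value({})` yields a bare `srun`, while `expand_by_name()` yields `srun -N ${MERLIN_NODES} -n ${MERLIN_PROCS}`, which still picks up values set later in the script.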

bgunnar5 commented 11 months ago

> @bgunnar5 Thanks, this is a good start. I added a new branch, PR_447, that will use the variable names instead of the variable values. The values may not be set until the script is run, so they cannot be replaced when the task is defined.

This is a good point. When I did this last Thursday, I just set it up to use the most recently defined MERLIN_* variables, but that might not always be the case. Did you have a chance to test your changes on the PR_447 branch, and do they accomplish what you need?

koning commented 11 months ago

Yes, the PR_447 branch substitutes the variable names rather than their values, and it also checks for csh when setting the variables.

koning commented 11 months ago

Would you please merge the PR_447 branch into this PR and add the step defaults as the MERLIN defaults?
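One possible reading of this request, combined with the csh check mentioned earlier, is sketched below. The function name `default_lines`, the variable name in the usage note, and the idea of emitting "set only if unset" lines are all assumptions, not the PR_447 implementation.

```python
def default_lines(defaults: dict, shell: str) -> list:
    """Build shell lines that give each variable a default value.

    `shell` is a shell name or path such as "/bin/bash" or "/bin/tcsh".
    """
    is_csh = shell.rsplit("/", 1)[-1] in ("csh", "tcsh")
    lines = []
    for name, value in defaults.items():
        if is_csh:
            # csh: $?NAME is true when NAME is set, so only set it otherwise.
            lines.append(f"if (! $?{name}) setenv {name} {value}")
        else:
            # POSIX sh: ':=' assigns when the variable is unset or empty.
            lines.append(f': "${{{name}:={value}}}"')
    return lines
```

For example, `default_lines({"MERLIN_NODES": 1}, "/bin/bash")` gives `: "${MERLIN_NODES:=1}"`, and the same call with `"/bin/csh"` gives `if (! $?MERLIN_NODES) setenv MERLIN_NODES 1`. Either line leaves a value the user has already set untouched.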

bgunnar5 commented 11 months ago

@koning Yes, I will. I think it would also be a good idea to check whether a line that defines a MERLIN_* variable is commented out and, if it is, ignore it.
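Something along these lines could work as a starting point. This is only a sketch: `find_merlin_vars` and the regular expression are hypothetical, the variable names in the usage note are illustrative, and quoted values are returned with their quotes.

```python
import re

# Matches sh-style `[export] MERLIN_X=value`, csh-style `set MERLIN_X = value`,
# and csh-style `setenv MERLIN_X value`.
DEFINITION = re.compile(
    r"^\s*(?:(?:export\s+|set\s+)?(MERLIN_\w+)\s*=\s*|setenv\s+(MERLIN_\w+)\s+)(\S+)"
)


def find_merlin_vars(script: str) -> dict:
    """Collect MERLIN_* definitions from a script, skipping commented-out lines."""
    found = {}
    for line in script.splitlines():
        if line.lstrip().startswith("#"):
            continue  # a commented-out definition must not count
        match = DEFINITION.match(line)
        if match:
            name = match.group(1) or match.group(2)
            found[name] = match.group(3)  # later definitions override earlier ones
    return found
```

Given a script containing `export MERLIN_NODES=2` followed by `# MERLIN_NODES=4`, the result keeps `MERLIN_NODES` as `"2"` because the commented line is ignored.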