Closed marangiop closed 3 years ago
The names of the error and output files are controlled by the #SBATCH -e and #SBATCH -o options in the launch script. Those options are set in Croupier here
As you can see, it uses the job_name by default if the stderr_file or stdout_file options are not present in job_options.
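The fallback described above can be sketched roughly as follows. This is only an illustration, not Croupier's actual code: the helper name build_sbatch_headers and the example job name are hypothetical, and only the option names stderr_file and stdout_file come from the discussion.

```python
def build_sbatch_headers(job_name, job_options):
    """Hypothetical helper: build the #SBATCH -e / -o lines for a job script.

    If stderr_file or stdout_file is missing from job_options,
    fall back to a file name derived from job_name.
    """
    stderr_file = job_options.get("stderr_file", job_name + ".err")
    stdout_file = job_options.get("stdout_file", job_name + ".out")
    return [
        "#SBATCH -e " + stderr_file,
        "#SBATCH -o " + stdout_file,
    ]


# No explicit file names: the (hypothetical) generated job name is used.
default_headers = build_sbatch_headers("atos1a2b3c", {})

# Explicit file names in job_options take precedence over the job name.
custom_headers = build_sbatch_headers(
    "atos1a2b3c",
    {"stderr_file": "job_1.err", "stdout_file": "job_1.out"},
)
```

With this kind of fallback, setting stderr_file and stdout_file in the blueprint's job_options would be one way to get readable log names without touching how the job name is generated.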
There are 2 possible solutions:
To implement option1, change this line
Thank you for the suggestions! @jramosrivas
I did a quick debugging session in PyCharm and I can see that the second part of the job name (instance_components[-1]) is based on a variable instance_components that collects the output of the command instance.id.split('_'). But as you can see, the content of instance.id is set somewhere else, before we reach this line inside workflows.py
But yes, you are right. We need a way to inject in line 75 that self.name (i.e. the job name) should be the same as the name stated in the blueprint yaml file.
Solved.
self.name = '_'.join(instance_components[:-1]) gives the same job name as written in the blueprint yaml file.
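A small standalone sketch of why this works, assuming instance.id has the form "<node name>_<random suffix>" (the example id below is made up for illustration):

```python
# Hypothetical instance.id: blueprint node name plus a random suffix.
instance_id = "job_1_x7k2qp"

instance_components = instance_id.split('_')

# The last component is the random suffix appended to the node name.
random_suffix = instance_components[-1]

# Re-joining everything except the suffix recovers the blueprint name,
# even when the name itself contains underscores.
job_name = '_'.join(instance_components[:-1])
```

Here job_name ends up as "job_1", so the .err and .out files would carry the blueprint name rather than the random suffix.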
This seems to work in a local tox test, but not from the Cloudify GUI (after uploading the new croupier .wgn file containing the change in line 75 of workflows.py).
Solved by introducing the change in the permedcoe branch.
Is your feature request related to a problem? Please describe.
Croupier/Cloudify automatically assigns a random ID to each job once that job has started executing through the run_jobs workflow. This is visible under the "Deployment Outputs/Capabilities" tab when you click on a Deployment. The ID is made up of the string "atos" followed by a combination of 6 random letters and numbers.
This becomes problematic when you want to inspect the .err and .out files associated with each job of the workflow. As we know, the install workflow creates a directory for a given workflow inside the target HPC cluster. When the run_jobs workflow is started, that directory is populated with a .script file for each job.
As each job is executed using the respective .script file created by Croupier inside the HPC system, the information generated during the execution of a given job is logged into two files, a .err and a .out file (see the SBATCH -e and -o flags inside the .script file). These files are automatically named based on the ID that Croupier assigned to the job, as explained above. If your workflow contains several jobs, then inspecting the log files becomes quite painful, because it is not clear from the names of the .err and .out files which job in the blueprint they belong to.
Describe the solution you'd like
The solution is for Croupier to assign a sensible name to the .err and .out files. This could be the name of the individual job as written inside the blueprint.
For example, if the job inside the blueprint is named "job_1", then the .err and .out files created inside HPC should be job_1.err and job_1.out.