broadinstitute / cromwell

Scientific workflow engine designed for simplicity & scalability. Trivially transition between one off use cases to massive scale production environments
http://cromwell.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

PBSPro Backend Configuration #4967

Open hp2048 opened 5 years ago

hp2048 commented 5 years ago

Hi, we are trying to set up Cromwell + WDL for genomic analyses at the National Computational Infrastructure HPC facility in Australia. This HPC runs a bespoke configuration of PBSPro. I have managed to run the "hello world" example workflow using the following backend configuration. However, I am unable to modify certain parameters without errors being thrown.

My current configuration is as follows:

   runtime-attributes = """
   Int cpu = 1
   Int memory = 1
   String raijin_queue = "express"
   String walltime = "01:00:00"
   String jobfs = "1GB"
   String raijin_project_id = "myproject"
   """
   #Submit string when there is no "docker" runtime attribute.
   submit = """
   qsub \
   -V \
   -N ${job_name} \
   -o ${out}.qsub \
   -e ${err}.qsub \
   -l ncpus=${cpu} \
   -l mem=${memory}"GB" \
   -l walltime=${walltime} \
   -l jobfs=${jobfs} \
   ${"-q " + raijin_queue} \
   -P ${raijin_project_id} \
   ${script}
   """

My specific questions:

  1. I have tried Float memory_gb = 1.0 as the runtime attribute and ${"-l mem=" + memory_gb + "GB"} as the submit string, but this fails with the error qsub: Illegal attribute or resource value Resource_List.mem. Could you please help me with the correct formatting of this attribute? I copied the structure from SGE.conf.
  2. I would like to use the $PROJECT environment variable as the default value for the raijin_project_id runtime attribute, so that each user can run the same workflow without modification within their allocated project. Is there a way to use an environment variable in the config file? I tried ${?PROJECT} and ${PROJECT} as per the recommendations for HOCON, but to no avail. I do not yet understand HOCON syntax well enough to solve this myself, so your help would be much appreciated.
  3. jobfs is a parameter used to control scratch space local to the execution node. Currently it is being passed as a string. Is there a way to convert it into GB in the same way as memory, but without using the keyword memory?

Thank you so much for your efforts.
Hardip

kshakir commented 5 years ago

I have tried Float memory_gb = 1.0 as the runtime attribute and ${"-l mem=" + memory_gb + "GB"} as the submit string but this fails with qsub: Illegal attribute or resource value Resource_List.mem error.

Int memory = 1 is the equivalent of Int memory_b = 1 and is generating values in bytes. A WDL specifying gigs of memory will therefore generate very large values, with 4GB generating the string -l mem=4294967296"GB". If you navigate within the cromwell-executions directory and find the submit* files that contain the generated qsub command, you should see something like that.
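
To see where that number comes from, here is a quick shell check. It is only a sketch and assumes binary gigabytes (1 GB = 1024^3 bytes), which matches the 4294967296 figure above:

```shell
# Bytes in 4 GB, assuming 1 GB = 1024 * 1024 * 1024 bytes.
memory_bytes=$((4 * 1024 * 1024 * 1024))

# Roughly what the template  -l mem=${memory}"GB"  turns into once the byte
# value is substituted; the shell removes the quotes around GB when it
# parses the command line.
mem_arg="-l mem=${memory_bytes}GB"
echo "$mem_arg"
```

So the scheduler is being asked for about four billion gigabytes, which is a plausible reason for it to reject the resource value.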

cd to the directory, take the generated qsub command and try it on your cluster. Hopefully you get the same "Illegal attribute" error. Play around with the command until you get the correct syntax.

From there we can get your Cromwell config set up such that it transforms the memory attribute into a valid syntax.

Some possible examples:

| Example qsub usage | Runtime attribute | Description |
|---|---|---|
| qsub -l mem=4.0GB … | Float memory_gb = 1 | Decimal values allowed; units are two uppercase characters |
| qsub -l mem=4g … | Int memory_gb = 1 | Integer values only, no decimals; units must be one lowercase character |
| qsub -l mem=4000mb … | Int memory_mb = 1000 | Integer values only, and it turns out gigabytes aren't even allowed as a unit, so use megabytes |

I would like to use $PROJECT environment variable as the default value for raijin_project_id

Environment variables won't work within HOCON, but they can be passed down into the generated submit files. It will take a bit of escaping to get past WDL-draft2, as POSIX and WDL-draft2 both use ${...} for variable names.

To escape past WDL-draft2, create two new runtime attributes and then use them in your submit.

Example:

runtime-attributes = """
String env_start="${"
String env_end="}"
# other variables here
"""

submit = """
qsub \
  -P ${env_start}PROJECT:-raijin_project_id${env_end} \
  ...
"""
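
Once env_start and env_end have been substituted, the generated submit script should contain an ordinary POSIX default-value expansion, ${PROJECT:-raijin_project_id}. Here is a small sketch of how the shell resolves it. The function name resolve_project and the project code ab12 are made up for illustration. Note that, as written, the fallback is the literal text raijin_project_id, not the value of the runtime attribute.

```shell
# The expansion the generated submit script is expected to contain.
resolve_project() {
  echo "${PROJECT:-raijin_project_id}"
}

# PROJECT unset: the expansion falls back to the literal word after ":-".
unset PROJECT
unset_result=$(resolve_project)

# PROJECT set (hypothetical project code): its value is used instead.
PROJECT=ab12
set_result=$(resolve_project)

echo "unset: $unset_result"
echo "set:   $set_result"
```

Checking the submit* file under cromwell-executions will show whether the expansion really arrived in this form.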

jobfs is a parameter used to control scratch space local to the execution node. Currently it is being passed as a string

Unfortunately you cannot define a parameter as rich as memory. For custom attributes one only has the choice of Float, Int, or String. If you don't like String, you could use a Float and have the WDL use runtime { jobfs_gb: 4.0 }, or just runtime { jobfs: 4.0 } and tell everyone to always use gigabytes.
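
If the scheduler turns out to reject decimal values such as 4.0GB (an assumption; try qsub by hand on your cluster first), the Float could be rounded up to whole megabytes before it reaches qsub. Below is a minimal sketch using awk. The helper name jobfs_gb_to_mb is hypothetical, 1 GB = 1024 MB is assumed, and whether your PBSPro accepts the MB suffix for jobfs needs checking locally.

```shell
# Convert a (possibly fractional) number of gigabytes to whole megabytes,
# rounding up so the job never gets less scratch space than requested.
jobfs_gb_to_mb() {
  awk -v gb="$1" 'BEGIN { mb = gb * 1024; r = int(mb); if (r < mb) r++; print r }'
}

# Build the resource argument from a Float-style value such as 4.0.
jobfs_arg="-l jobfs=$(jobfs_gb_to_mb 4.0)MB"
echo "$jobfs_arg"
```

Because this uses $(...) rather than ${...}, it would not collide with WDL-draft2 placeholders if a similar call were placed in the submit string.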

clhappyjiang commented 5 months ago

Hi! I am glad to see this issue, and I have also tried using PBS as the backend to run it. But I'm not very good at it. Can you show me how the configuration file for cromwell is defined when using PBS as the backend? Thank you.