broadinstitute / cromwell

Scientific workflow engine designed for simplicity & scalability. Trivially transition between one off use cases to massive scale production environments
http://cromwell.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Cromwell server: qsub: command not found #5334

Open 1137498302 opened 4 years ago

1137498302 commented 4 years ago

Version: cromwell-47

Architecture: Docker, SGE, docker-mysql

The Cromwell server intermittently fails with "qsub: command not found", and the server then has to be brought back online.
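One rough way to narrow this down is to check whether the command resolves on the host (or inside the container) where Cromwell runs the submit script. The sketch below is only an illustration: the helper name check_cmd is hypothetical, and nothing here is confirmed as the cause of the error.

```shell
# Hypothetical helper: report whether a command name or absolute path
# resolves to something runnable in the current environment.
# Prints "found: <resolved path>" and returns 0, or prints
# "missing: <argument>" and returns 1.
check_cmd() {
    if resolved=$(command -v "$1" 2>/dev/null); then
        echo "found: $resolved"
    else
        echo "missing: $1"
        return 1
    fi
}
```

Running `check_cmd qsub` tests the PATH lookup, while `check_cmd /opt/gridengine/bin/lx-amd64/qsub` tests the absolute path used in the submit-docker command. If the two disagree, that points to a PATH difference between the environments rather than a missing install.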

The relevant portion of cromwell.conf:

default = "SGE"
providers {

            # Configure the SGE backend

            SGE {

                    # Use the config backend factory
                    actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
                    config {
                            #root =  "/GeneCloud001/server/cr_server/cromwell-exe"
                            #dockerRoot = "/GeneCloud001/server/cr_server/cromwell-exe"
                            #script-epilogue = "chmod -R a+rw * && chmod -R a+rw * && sync"
                            # Limits the number of concurrent jobs
                            #concurrent-job-limit = 500
                            # Define runtime attributes for the SGE backend.
                            # memory_gb is a special runtime attribute. See the cromwell README for more info.
                            runtime-attributes = """
                            Int cpu = 1
                            Float? memory_gb
                            String? sge_queue
                            String? sge_project
                            String docker = "jycloud/base:latest"
                            String? mnt_db_dir              #### database mount directory
                            String? mnt_input_dir           #### input BAM mount directory
                            String? mnt_out_dir             #### output mount directory
                            String docker_user = "$EUID"
                            String num_proc = 1
                            String? task_queue
                            String?  mount
                            """
                            submit-docker = """
                                    /opt/gridengine/bin/lx-amd64/qsub \
                                        -terse \
                                        -V \
                                        -N ${job_name} \
                                        -wd ${cwd} \
                                        -o ${out}.qsub \
                                        -e ${err}.qsub \
                                        -l ${"vf=" + memory_gb + "g"},p=${num_proc} \
                                        -b y docker run --rm \
                                            -v ${cwd}:${docker_cwd} \
                                            --user 1002 \
                                            -m ${(memory_gb + (memory_gb / 2 )) + "G"} \
                                            --cpus ${num_proc} \
                                            -v ${mnt_db_dir}:${mnt_db_dir}:ro \
                                            -v ${mnt_out_dir}:${mnt_out_dir} \
                                            -v /mnt/cache/sentieon:/mnt/cache/sentieon \
                                            ${mount} \
                                            ${docker} /bin/bash ${docker_script}

illusional commented 4 years ago

Hey, not part of the Cromwell team but thought I'd try to help out. To clarify, you've:

If this is correct, I'm struggling to understand the motivations behind it, but a few pointers: