getwilds / proof

MIT License

Glob not working on Cromwell #8

Closed Sanaz01 closed 1 month ago

Sanaz01 commented 1 month ago

The glob function, which should capture all files matching a pattern, is not working on PROOF/Cromwell. Here are the inputs and the resulting error:

Input WDL: glob_test.wdl


workflow GlobTest{
    input{
        String name_list
    }

    call make_files{
        input:
            names=name_list
    }

    output {
        Array[File] = outfiles = make_files.output_files
    }

}

task make_files{
    input {
        String names
    }

    command <<<
        set -eo pipefail
        IFS=','
        for name in ${names}
        do
            echo ${name} >> file_${name}.txt
        done;
        wait
    >>>

    runtime {
        docker: "ubuntu:noble-20240114"
        cpu: "1"
        memory: "1 G"
    }
    output{
        Array[File] output_files = glob("file_*.txt")
    }
}

Input Json: glob_test.json

{
    "GlobTest.name_list": "Harry,Sally,Tammy"
}

Error in Troubleshoot tab in PROOF

[1] "Could not process output, file not found: /hpc/temp/paguirigan_a/user/sagarwa2/cromwell-scratch/GlobTest/736d876b-3673-4f9d-a9ce-81c7693b3dd5/call-make_files/execution/glob-d667704679d03197544d1107735ba61b/file_*.txt"
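One detail in the error above is that the missing path ends in the unexpanded pattern `file_*.txt`. In bash, a pattern that matches nothing is left as literal text unless `nullglob` is set, so a path like this is what you would expect to see if nothing matched inside the glob directory. Whether Cromwell builds its path this way is not established in this thread; the sketch below only shows the shell behavior, and `glob-demo` is a hypothetical stand-in for the `glob-<hash>` directory.

```shell
#!/usr/bin/env bash
set -eo pipefail

# Scratch directory for the demo (hypothetical; not the real execution directory).
workdir="$(mktemp -d)"
cd "$workdir"
mkdir glob-demo   # hypothetical stand-in for the glob-<hash> directory in the error

# No match: bash leaves the pattern as literal text (nullglob is off by default).
unmatched=(glob-demo/file_*.txt)
echo "no match: ${unmatched[0]}"

# One match: the same pattern now expands to a real file name.
touch glob-demo/file_Harry.txt
matched=(glob-demo/file_*.txt)
echo "match:    ${matched[0]}"
```

If the first line printed still contains `*`, the pattern found nothing to expand to, which is a quick check to run by hand inside a task's `execution` directory.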

sckott commented 1 month ago

Thanks @Sanaz01 Does it work when you don't use PROOF?

Sanaz01 commented 1 month ago

@sckott I have not tested this WDL workflow via command line.

sckott commented 1 month ago

okay, thanks. I'll see what I can find out

sckott commented 1 month ago

I was able to reproduce it, but had to fix a few things to get it to run. I don't know what the issue is, but I'm pinging Sita to see if she might know.

version 1.0

workflow GlobTest{
    input{
        String name_list
    }

    call make_files{
        input:
            names=name_list
    }

    output {
        Array[File] outfiles = make_files.output_files
    }

}

task make_files{
    input {
        String names
    }

    command <<<
        set -eo pipefail
        IFS=','
        for name in ${names}
        do
            echo ${name} >> file_${name}.txt
        done;
        wait
    >>>

    runtime {
        docker: "ubuntu:noble-20240114"
        cpu: 1
        memory: "1 G"
    }
    output{
        Array[File] output_files = glob("file_*.txt")
    }
}

sckott commented 1 month ago

Asking in Slack, on a related issue from May 2024.

dtenenba commented 1 month ago

I think I can reproduce this without PROOF. I used Scott's WDL and Sanaz's JSON file. I commented out the docker: line because I am running without a config, and the config is what sets up Apptainer.

I did it like this:

ml cromwell
java -jar /app/software/cromwell/87/cromwell.jar run -i globtest2.json globtest2.wdl

The job succeeded but did not produce any output files called file_Harry.txt etc. And this was just running in my home directory.

Also, it printed this out (among other things):

{
  "outputs": {
    "GlobTest.outfiles": []
  },
  "id": "102a5597-cb25-4eca-ad56-da9088f69774"
}

Not sure if that is expected/correct output.

Anyway, the string Harry does not appear anywhere in the running directory or underneath it (except in the input json file).
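A possible explanation for the empty `GlobTest.outfiles`, offered as a sketch rather than a confirmed diagnosis: under `version 1.0`, a `command <<< ... >>>` block only substitutes `~{...}` placeholders, so `${names}` reaches bash as an ordinary shell variable that was never set. The loop then runs zero times and no `file_*.txt` is written. The script below imitates that in plain bash. `run_loop` and `count_files` are hypothetical helper names, and the scratch directory stands in for the task's execution directory.

```shell
#!/usr/bin/env bash
set -eo pipefail

# Scratch directory for the demo (hypothetical; stands in for the execution directory).
workdir="$(mktemp -d)"
cd "$workdir"

# Hypothetical helper: the loop from the task, fed the text the shell actually sees.
run_loop() {
    local list="$1" name
    local IFS=','
    for name in ${list}; do
        echo "${name}" >> "file_${name}.txt"
    done
}

# Hypothetical helper: count the files that glob("file_*.txt") could match.
count_files() {
    find . -maxdepth 1 -name 'file_*.txt' | wc -l | tr -d ' '
}

# Case 1: ${names} is left to bash, where it is unset, so the loop never runs.
unset names
run_loop "${names}"
unset_count="$(count_files)"
echo "with \${names} unset: ${unset_count} file(s)"

# Case 2: the text that ~{names} would render to for the JSON input.
run_loop "Harry,Sally,Tammy"
rendered_count="$(count_files)"
echo "with the list rendered: ${rendered_count} file(s)"
```

If this reading is right, writing `~{names}` in the task's loop would be enough to make the first version of the workflow create its files. The job "succeeding" with an empty array fits as well, since an empty loop exits with status 0.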

Sanaz01 commented 1 month ago

Update: The earlier code did not reproduce the exact error I encountered with glob. The following code does.

glob_test.wdl

version 1.0

workflow GlobTest{
    input{
        Array[String] name_list
    }

    call make_files{
        input:
            names=name_list
    }

    output {
        Array[File] outfiles = make_files.output_files
    }

}

task make_files{
    input {
        Array[String] names
    }

    command <<<
        set -eo pipefail
        names=(~{sep=' ' names})
        for name in "${names[@]}"
        do
            echo ${name} >> file_${name}.txt
        done;
        wait
    >>>

    runtime {
        docker: "ubuntu:noble-20240114"
        cpu: 1
        memory: "1 G"
    }
    output{
        Array[File] output_files = glob("file_*.txt")
    }
}

glob_test.json

{
    "GlobTest.name_list": ["Harry", "Sally", "Tammy"]
}
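For reference, here is a sketch of the shell script this command block should render to for the JSON above, assuming `~{sep=' ' names}` expands to `Harry Sally Tammy`. Run on its own in a scratch directory (hypothetical, not the real `execution` directory), it creates the three files and the pattern matches all of them. That suggests the task body is fine and the failure lies in how the outputs are collected afterwards, though the thread does not identify the exact cause.

```shell
#!/usr/bin/env bash
set -eo pipefail

# Scratch directory for the demo (hypothetical; stands in for the execution directory).
workdir="$(mktemp -d)"
cd "$workdir"

# What `names=(~{sep=' ' names})` is assumed to render to for the JSON input.
names=(Harry Sally Tammy)
for name in "${names[@]}"
do
    echo "${name}" >> "file_${name}.txt"
done

# Expand the same pattern the task passes to glob().
matched=(file_*.txt)
printf '%s\n' "${matched[@]}"
```

Comparing this listing with the contents of the real `call-make_files/execution` directory is one way to tell whether the files were never written or were written but not picked up.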

[Screenshot of call-make_files directory]

[Screenshot of Troubleshoot tab in PROOF]

sckott commented 1 month ago

@Sanaz01 So does this glob example work now with the merging of that PR above?

Sanaz01 commented 1 month ago

@sckott testing it in the proof-api-dev

Sanaz01 commented 1 month ago

@sckott testing complete. glob works! We can push it to the main PROOF

sckott commented 1 month ago

great

tefirman commented 1 month ago

🎉🎉🎉 Awesome news! Thanks @dtenenba @sckott @Sanaz01!!!