grailbio / reflow

A language and runtime for distributed, incremental data processing in the cloud
Apache License 2.0

Fresh docker images not pulled #91

Closed: olgabot closed this issue 5 years ago

olgabot commented 5 years ago

Hello, I've re-created docker images that contain a file located at /usr/local/etc/openmpi-hostfile-aws, but Reflow doesn't seem to be pulling the latest one, which definitely has that file copied into it.

```
 ✘  Tue 27 Nov - 05:52  ~/tick-genome/reflow   origin ☊ olgabot/abyss2 ✔ 
  make tick1_assembly
sudo reflow run -local \
    -localdir /mnt/data/reflow-tmp/ \
    assemble.rf \
    -read1=s3://tick-genome/dna/2018-06-28/pre-assembly_v3/tick_1_S1_R1_post-trimming.fastq.gz \
    -read2=s3://tick-genome/dna/2018-06-28/pre-assembly_v3/tick_1_S1_R2_post-trimming.fastq.gz \
    -output=s3://tick-genome/dna/2018-06-28/assembly_k75/ \
    -id=tick_1_S1 \
    -ksize=81
sudo: unable to resolve host olgabot-assemble.aegea
reflow: run ID: 287b2609
abyss.rf:143:15(abyss.AssemblePairedEnd.d): dir("tick_1_S1_R1.fastq.gz": file(sha256=sha256:5e8fe2ac24ab6612551f581e6480dd50c6d2156b4e9c1f3bcb1bf169baf6c251, size=118169286374), "tick_1_S1_R2.fastq.gz": file(sha256=sha256:755eb071a87d80327560c51f84247bc93907c7e5b6acff525f633f0ac2e2663d, size=122586077922))
reflow: <- abyss.AssemblePairedEnd db228585 err exec 0s 0B
    error exec sha256:db228585e3e26c4e94ccabb7bdefe55e45cd75608d21a27d9e1f72db336c821d: exited with code 2
    abyss.rf:144:9
    /db228585e3e26c4e94ccabb7bdefe55e45cd75608d21a27d9e1f72db336c821d
    czbiohub/abyss2
    command:
        cd {{outdir}}
        abyss-pe np=60 j=60 name=tick_1_S1 k=81 \
            mpirun='/usr/local/bin/mpirun --allow-run-as-root --hostfile /usr/local/etc/openmpi-hostfile-aws' \
            in='{{d}}/tick_1_S1_R1.fastq.gz {{d}}/tick_1_S1_R2.fastq.gz'
    where:
        {{d}} =
            tick_1_S1_R1.fastq.gz sha256:5e8fe2ac24ab6612551f581e6480dd50c6d2156b4e9c1f3bcb1bf169baf6c251 110.1GiB
            tick_1_S1_R2.fastq.gz sha256:755eb071a87d80327560c51f84247bc93907c7e5b6acff525f633f0ac2e2663d 114.2GiB
        {{d}} =
            tick_1_S1_R1.fastq.gz sha256:5e8fe2ac24ab6612551f581e6480dd50c6d2156b4e9c1f3bcb1bf169baf6c251 110.1GiB
            tick_1_S1_R2.fastq.gz sha256:755eb071a87d80327560c51f84247bc93907c7e5b6acff525f633f0ac2e2663d 114.2GiB
    stdout:
        /usr/local/bin/mpirun --allow-run-as-root --hostfile /usr/local/etc/openmpi-hostfile-aws -np 60 ABYSS-P -k81 -q3 --coverage-hist=coverage.hist -s tick_1_S1-bubbles.fa -o tick_1_S1-1.fa /arg/1/0/tick_1_S1_R1.fastq.gz /arg/2/0/tick_1_S1_R2.fastq.gz
    stderr:
        dirname: missing operand
        Try 'dirname --help' for more information.
        --------------------------------------------------------------------------
        Open RTE was unable to open the hostfile:
            /usr/local/etc/openmpi-hostfile-aws
        Check to make sure the path and filename are correct.
        --------------------------------------------------------------------------
        --------------------------------------------------------------------------
        An internal error has occurred in ORTE:

        [[54841,0],0] FORCE-TERMINATE AT (null):1 - error base/ras_base_allocate.c(302)

        This is something that should be reported to the developers.
        --------------------------------------------------------------------------
        make: *** [/usr/local/bin/abyss-pe.Makefile:548: tick_1_S1-1.fa] Error 1
    profile:
        cpu mean=0.0 max=0.0
        mem mean=0B max=0B
        disk mean=0B max=0B
        tmp mean=0B max=0B
exec sha256:db228585e3e26c4e94ccabb7bdefe55e45cd75608d21a27d9e1f72db336c821d: exited with code 2
Makefile:66: recipe for target 'tick1_assembly' failed
make: *** [tick1_assembly] Error 11
```

Relatedly, when I list the images for that workflow, I get nothing!

 ✘  Tue 27 Nov - 05:52  ~/tick-genome/reflow   origin ☊ olgabot/abyss2 ✔ 
  reflow images assemble.rf                                                   
usage of assemble.rf:
  -id string
        name of the sample (required)
  -ksize int
        K-mer size for assembly (required)
  -output string
        S3 folder to put all the fastqs and reports (required)
  -read1 string
        S3 path to a single fastq(.gz) file (required)
  -read2 string
        S3 path to a single fastq(.gz) file (required)

 ✘  Tue 27 Nov - 05:53  ~/tick-genome/reflow   origin ☊ olgabot/abyss2 ✔ 
  reflow images abyss.rf   
(no output)

The two files have the contents below.

assemble.rf

// Assemble paired-end Illumina sequencing data

param (
    // S3 path to a single fastq(.gz) file
    read1 string

    // S3 path to a single fastq(.gz) file
    read2 string

    // name of the sample
    id string

    // S3 folder to put all the fastqs and reports
    output string

    // K-mer size for assembly
    ksize int
)

// System modules
val dirs = make("$/dirs")

val abyss = make("./abyss.rf")

r1 := file(read1)
r2 := file(read2)

val abyss_output = abyss.AssemblePairedEnd(
    r1, r2, id, ksize)

val Main = dirs.Copy(abyss_output, output)

abyss.rf

val dirs = make("$/dirs")

threads := 60

val abyss = "czbiohub/abyss2"

func AssemblePairedEnd(read1, read2 file, id string, ksize int) = {
    d := dirs.Make([id+"_R1.fastq.gz": read1, id+"_R2.fastq.gz": read2])

    d := trace(d)
    exec(image := abyss, cpu := threads, mem := 230*GiB) (outdir dir) {"
        cd {{outdir}}
        abyss-pe np={{threads}} j={{threads}} name={{id}} k={{ksize}} \
                mpirun='/usr/local/bin/mpirun --allow-run-as-root --hostfile /usr/local/etc/openmpi-hostfile-aws' \
                in='{{d}}/{{id}}_R1.fastq.gz {{d}}/{{id}}_R2.fastq.gz'

    "} 
}

Do you know what may be happening? Thank you! Warmest, Olga

mariusae commented 5 years ago

Looks like the image name did not change. When this is the case, Reflow assumes the image represents a stable name. We will likely change this default behavior soon to use the fully resolved image instead, but this is not without problems because image resolution isn't always cheap.

olgabot commented 5 years ago

Ah okay I see. I can add the tag as well.
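
A minimal sketch of that change in abyss.rf, assuming the rebuilt image was pushed under a new tag. The tag `2018-11-27` is hypothetical and does not come from this thread; substitute whatever tag was actually pushed. Since the image name then differs from the one used before, Reflow should no longer treat it as the same stable name.

```
// Hypothetical tag: replace 2018-11-27 with the tag actually pushed for the rebuilt image.
val abyss = "czbiohub/abyss2:2018-11-27"
```

The rest of abyss.rf stays as it is, because the exec already refers to the image through `abyss`.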