grailbio / reflow

A language and runtime for distributed, incremental data processing in the cloud
Apache License 2.0
965 stars 52 forks source link

reflow does not reject images that do not exist #111

Open hgbrian opened 5 years ago

hgbrian commented 5 years ago

If I run this script (reflow run hello)

val Main = exec(image := "NOTEXISTS", mem := GiB) (out file) {"
        echo hello world again >>{{out}}
"}

I get this output:

$ reflow run hello.rf
reflow: run ID: 1bd9f5a8
ec2cluster: 1 instances: m3.large:1 (<=$0.1/hr), total{mem:7.0GiB cpu:2 disk:2.9TiB intel_avx:2}, waiting{}, pending{}
1bd9f5a8: elapsed: 1m10s, running:1, completed: 0/1
  hello.Main:  exec notubuntu echo hello world again >>{{out}}  1m9s

and it just appears to just keep churning forever without throwing an error.

This is especially error-prone with AWS's regions.

mariusae commented 5 years ago

Oh, I think I know what's going on. Also due to AWS, we are perhaps a little too aggressive with retrying image resolution failures. I'll report back.