grailbio / reflow

A language and runtime for distributed, incremental data processing in the cloud
Apache License 2.0

resolve exec file and dir deps #127

Closed prasadgopal closed 3 years ago

prasadgopal commented 3 years ago

We've noticed that in some programs, exec nodes of different computations can get the same logical digest, even though their SHAs are unique once the exec deps are fully evaluated. I am sending this out to get feedback; let me know what you think.

In the program below, one of the execs gets the same logical digest for alternate=true and alternate=false, even though its inputs are different.

param (
  alternate bool
)

val dirs = make("$/dirs")
val strings = make("$/strings")

val f1 = file("/tmp/f1") // f1 contains "alice"
val f2 = file("/tmp/f2") // f2 contains "bob"

val vars = if !alternate {
  ["file": f1]
} else {
  ["file": f2]
}

func fn(f file, d dir) (out file) = {
  val names = strings.Join([n | (n, _) <- list(d)], ",")
  exec(image := "ubuntu") (out file) {"
    # {{names}}
    cp {{f}} {{out}}
  "}
}

val a = [(s, fn(f, dirs.Make(["file": f1]))) | (s, f) <- vars]

val b = [(s, fn(f, dirs.Make([s: f]))) | (s, f) <- a]

@requires(cpu := 1)
val Main = a + b
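
To make the kind of collision described above concrete, here is a toy Go sketch. It is not Reflow's digest code: the `arg` type and the `shallowDigest` and `resolvedDigest` functions are hypothetical names invented for illustration, and the assumption that a shallow digest covers only the command template and the kind of each dependency is mine, not something taken from the Reflow source. It only shows why a digest that skips the resolved contents of file and dir deps cannot tell two execs apart, while one that includes them can.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// arg is a hypothetical stand-in for one exec dependency (not a Reflow type).
type arg struct {
	kind    string // "file" or "dir"
	content string // contents after the dependency is fully evaluated
}

// shallowDigest hashes the command template and only the kind of each
// argument. This is an assumed, simplified model of an unresolved digest.
func shallowDigest(template string, args []arg) string {
	h := sha256.New()
	h.Write([]byte(template))
	for _, a := range args {
		h.Write([]byte{0}) // separator between fields
		h.Write([]byte(a.kind))
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

// resolvedDigest additionally mixes in the content hash of every argument,
// so execs with different inputs get different digests.
func resolvedDigest(template string, args []arg) string {
	h := sha256.New()
	h.Write([]byte(template))
	for _, a := range args {
		sum := sha256.Sum256([]byte(a.content))
		h.Write([]byte{0})
		h.Write([]byte(a.kind))
		h.Write(sum[:])
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	const template = "cp {{f}} {{out}}"
	// Mirrors the example program: f1 contains "alice", f2 contains "bob".
	alice := []arg{{"file", "alice"}, {"dir", "file=alice"}}
	bob := []arg{{"file", "bob"}, {"dir", "file=bob"}}

	fmt.Println("shallow digests equal: ",
		shallowDigest(template, alice) == shallowDigest(template, bob))
	fmt.Println("resolved digests equal:",
		resolvedDigest(template, alice) == resolvedDigest(template, bob))
}
```

In this toy model the first comparison is true and the second is false, which matches the behavior reported above: the execs only become distinguishable once their file and dir deps are resolved.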