hail-is / hail

Cloud-native genomic dataframes and batch computing
https://hail.is
MIT License
977 stars, 244 forks

[hailtop.batch] Batch.run() does not work if executed more than once with temporary files #14177

Open jigold opened 9 months ago

jigold commented 9 months ago

What happened?

Lindo tried to use JobResourceFiles a second time after updating the original batch and running it again, but got a FileNotFoundError. This happens because b.run() deletes temporary files by default, so the files written by the first run no longer exist when the second run needs them. We should add this use case to our documentation, but it might also be a good idea to catch these errors eagerly, if possible, and provide a better error message. I don't think we can change the default value at this point.
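To make the failure mode and the proposed eager check concrete, here is a minimal standard-library sketch. It is a toy model, not Hail's implementation: ToyBatch, ScratchDeletedError, and new_job are hypothetical names, and the delete_scratch_on_exit flag is assumed to mirror the option on Batch.run() that controls this default.

```python
import os
import shutil
import tempfile


class ScratchDeletedError(RuntimeError):
    """Hypothetical error: a job needs a file whose scratch copy was deleted."""


class ToyBatch:
    """Toy stand-in for a batch with a scratch directory (not Hail's code)."""

    def __init__(self):
        self.scratch_dir = tempfile.mkdtemp(prefix="toy-batch-")
        self._produced = {}  # output name -> path written by an earlier run
        self._pending = []   # (output name, input names) for jobs not yet run

    def new_job(self, output, inputs=()):
        self._pending.append((output, tuple(inputs)))
        return output

    def run(self, delete_scratch_on_exit=True):
        # Eager check: before doing any work, verify that every input that
        # came from an earlier run still exists on disk.
        for _, inputs in self._pending:
            for name in inputs:
                path = self._produced.get(name)
                if path is not None and not os.path.exists(path):
                    raise ScratchDeletedError(
                        f"input {name!r} was produced by an earlier run, but "
                        f"its scratch file {path} has been deleted; pass "
                        "delete_scratch_on_exit=False to the earlier run() "
                        "to keep it"
                    )
        os.makedirs(self.scratch_dir, exist_ok=True)
        for output, inputs in self._pending:
            path = os.path.join(self.scratch_dir, output)
            with open(path, "w") as out:
                # Each toy job copies its inputs, then appends its own name.
                for name in inputs:
                    with open(self._produced[name]) as src:
                        out.write(src.read())
                out.write(output + "\n")
            self._produced[output] = path
        self._pending.clear()
        if delete_scratch_on_exit:
            shutil.rmtree(self.scratch_dir)


# Default behaviour: the second run fails, but with a descriptive message.
b = ToyBatch()
first = b.new_job("first.txt")
b.run()
b.new_job("second.txt", inputs=[first])
try:
    b.run()
    message = None
except ScratchDeletedError as err:
    message = str(err)

# Keeping the scratch directory lets a later run reuse the intermediate.
kept = ToyBatch()
first = kept.new_job("first.txt")
kept.run(delete_scratch_on_exit=False)
kept.new_job("second.txt", inputs=[first])
kept.run(delete_scratch_on_exit=False)
with open(os.path.join(kept.scratch_dir, "second.txt")) as f:
    second_contents = f.read()
shutil.rmtree(kept.scratch_dir)  # clean up manually once finished
```

In this sketch the check runs at the top of run(), so the user sees which input is missing and how to keep it, rather than a bare FileNotFoundError partway through execution. Whether an equivalent check is feasible in the real backends is the open question in this issue.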

https://hail.zulipchat.com/#narrow/stream/223457-Hail-Batch-support/topic/File.20dependency.20error/near/416647170

Version

0.2.127

Relevant log output

No response

ehigham commented 9 months ago

triage - @danking to follow up. 2 issues perhaps:

  1. Why did we get this error?
  2. Can intermediates not be shared across run invocations?

danking commented 7 months ago

Sayonara!