twitter / scalding

A Scala API for Cascading
http://twitter.com/scalding
Apache License 2.0
3.49k stars 703 forks source link

Simplify AsyncFlowDefRunner #1843

Closed johnynek closed 6 years ago

johnynek commented 6 years ago

AsyncFlowDefRunner was factored out of Execution to support multiple backends (such as the new scala-only MemoryBackend as well as the nascent spark backend).

When doing this, we overlooked that the Mode of an execution is constant for the entire Execution. It is very hard to imagine ever changing that since when you forceToDiskExecution if you were to subsequently switch modes, how can you guarantee you can read the data in the new Mode? In any case, we don't have an API or any support for changing modes as an Execution runs currently.

Using this fact, we can simplify AsyncFlowDefRunner. Along the way I found two opportunities for bugs, although I doubt they were manifest anywhere.

Lastly, I split CascadingMode, the subtype of Mode which AsyncFlowDefRunner uses, into a separate file to make it easier to locate.

johnynek commented 6 years ago

@ianoc can you take a look?

johnynek commented 6 years ago

cc @fwbrasil @dieu @ttim @pankajroark

Note, I expect this to be source compatible with previous releases since users should never know or have any way to even access the AsyncFlowDefRunner.

fwbrasil commented 6 years ago

Could you give some context on the two potential bugs you've fixed?

fwbrasil commented 6 years ago

btw, I couldn't find references to AsyncFlowDefRunner in our code base

johnynek commented 6 years ago

AsyncFlowDefRunner was introduced in 0.18