Open rmetzger opened 10 years ago
I think this might be most valuable in a mode where we have a One-Job-Yarn-Session, i.e., where you do not start a YARN session first and then submit a job, but directly submit a job using a YARN executor.
In that case, the hosts could be taken directly from the generated input splits. Everything would be transparent.
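
To make the idea concrete, here is a rough sketch of how preferred hosts might be derived from the generated input splits. It is only an illustration: `InputSplit`, its `hostnames` field, and `preferred_hosts` are hypothetical names invented for this example and are not the actual Stratosphere or YARN API. The assumption is that each split knows which hosts store its data, and that the hosts holding the most splits are the ones worth requesting containers on.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class InputSplit:
    """Hypothetical stand-in for a generated input split."""
    split_number: int
    hostnames: Tuple[str, ...]  # hosts that store this split's data locally


def preferred_hosts(splits: List[InputSplit], max_hosts: int) -> List[str]:
    """Return up to max_hosts hostnames, ordered by how many splits they hold."""
    counts = Counter()
    for split in splits:
        # Count each host at most once per split, even if it is listed twice.
        for host in set(split.hostnames):
            counts[host] += 1
    # Most splits first; ties are broken by hostname so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [host for host, _ in ranked[:max_hosts]]
```

The resulting list could then be passed along as the locality preference when requesting containers. How that request is actually made is left open here.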
There is currently no way to use YARN to submit a job without having a running session. But I agree that we should automatically detect the split locations in that case.
See here: https://groups.google.com/forum/#!topic/stratosphere-dev/k7_NTi71O0c