NICTA / scoobi

A Scala productivity framework for Hadoop.
http://nicta.github.com/scoobi/

MR job numbering broken in HadoopMode for job plans with more than 1 layer #308

Closed by ivmaykov 10 years ago

ivmaykov commented 10 years ago

If a computation has more than 1 layer, then the numbers assigned to MR jobs are broken. The total number of steps is correct, but the numbering starts at "1" for each layer.

For example, consider a plan with 4 steps (x -> y means y can't run until x completes):

Mscr A -> Mscr B
Mscr A -> Mscr C
Mscr B -> Mscr D

Scoobi will generate a plan with 3 layers:
Layer 1: Mscr A
Layer 2: Mscr B, Mscr C
Layer 3: Mscr D
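To make the layering concrete, here is a minimal sketch of one way the 3 layers could be derived from the dependencies. It is not Scoobi's actual planner: the `Layering` object, the `layers` function, and the rule "layer = 1 + deepest parent's layer" are assumptions made for illustration, and the sketch assumes the dependency graph has no cycles.

```scala
// Hypothetical sketch, not Scoobi's planner code.
object Layering {
  // An edge (x, y) means y can't run until x completes.
  def layers(nodes: Seq[String], edges: Seq[(String, String)]): Seq[Seq[String]] = {
    // Map each node to the nodes it depends on.
    val parents: Map[String, Seq[String]] =
      edges.groupBy(_._2).map { case (node, es) => node -> es.map(_._1) }

    // A node with no parents is in layer 1; otherwise it sits one layer
    // below its deepest parent. Assumes an acyclic graph.
    def depth(node: String): Int =
      parents.get(node).map(ps => ps.map(depth).max + 1).getOrElse(1)

    // Group nodes by layer number and return the layers in order.
    nodes.groupBy(depth).toSeq.sortBy(_._1).map(_._2)
  }

  val exampleNodes: Seq[String] = Seq("Mscr A", "Mscr B", "Mscr C", "Mscr D")
  val exampleEdges: Seq[(String, String)] =
    Seq("Mscr A" -> "Mscr B", "Mscr A" -> "Mscr C", "Mscr B" -> "Mscr D")
}
```

Calling `Layering.layers(Layering.exampleNodes, Layering.exampleEdges)` groups the four Mscrs into the same three layers listed above.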

Scoobi will then label Mscrs A, B, and D "Step 1 of 4" and Mscr C "Step 2 of 4".

The desired behavior is to call A "Step 1 of 4", B "Step 2 of 4", C "Step 3 of 4", and D "Step 4 of 4".
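The difference between the observed and the desired numbering can be sketched as follows. This is an illustrative model only: `StepNumbering`, `perLayerNumbering`, `globalNumbering`, and `labels` are hypothetical names, not Scoobi's internals, and the actual cause in Scoobi may differ. The sketch assumes the symptom comes from a step counter that restarts for each layer.

```scala
// Hypothetical model of the numbering problem, not Scoobi's code.
object StepNumbering {
  type Layer = Seq[String]

  // Observed behavior: the counter restarts at 1 in every layer.
  def perLayerNumbering(layers: Seq[Layer]): Seq[(String, Int)] =
    layers.flatMap(layer => layer.zipWithIndex.map { case (mscr, i) => (mscr, i + 1) })

  // Desired behavior: a single counter runs across all layers.
  def globalNumbering(layers: Seq[Layer]): Seq[(String, Int)] =
    layers.flatten.zipWithIndex.map { case (mscr, i) => (mscr, i + 1) }

  // Render "Step n of total" labels; the total is the same in both cases.
  def labels(numbered: Seq[(String, Int)]): Seq[String] = {
    val total = numbered.size
    numbered.map { case (mscr, n) => s"$mscr: Step $n of $total" }
  }

  // The plan from the example: 3 layers, 4 Mscrs.
  val plan: Seq[Layer] = Seq(Seq("Mscr A"), Seq("Mscr B", "Mscr C"), Seq("Mscr D"))
}
```

With the example plan, `perLayerNumbering` assigns 1, 1, 2, 1 to A, B, C, D, matching the reported output, while `globalNumbering` assigns 1, 2, 3, 4.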