datasalt / pangool

Tuple MapReduce for Hadoop: Hadoop API made easy
http://datasalt.github.io/pangool/
Apache License 2.0

Enhance multipleInputs to allow more than one mapping for the same input path #41

Closed by pereferrera 9 years ago

pereferrera commented 9 years ago

An example use case is a Parquet file from which we select different sets of columns and assign each set to a different Mapper. Right now, when several inputs are registered for the same Path, Pangool overwrites the earlier assignments and silently keeps only the last one. A small change is needed so that Pangool can delegate the same Path to multiple mappers and input formats.
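
To make the problem concrete, here is a minimal sketch of the two bookkeeping strategies. It is not Pangool's actual code: the class `InputAssignmentSketch`, the methods `keepLast` and `keepAll`, and the path and mapper names are all hypothetical, and plain strings stand in for Hadoop's `Path` and for the mapper and input format pair. It only illustrates why a map with one value per path loses earlier assignments, and how a list per path would keep them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Pangool.
class InputAssignmentSketch {

    // Each input is {path, mapperName}. One value per path:
    // a later put() for the same path replaces the earlier mapper.
    static Map<String, String> keepLast(List<String[]> inputs) {
        Map<String, String> byPath = new LinkedHashMap<>();
        for (String[] in : inputs) {
            byPath.put(in[0], in[1]);
        }
        return byPath;
    }

    // A list of values per path: each registration is appended,
    // so the same path can feed several mappers.
    static Map<String, List<String>> keepAll(List<String[]> inputs) {
        Map<String, List<String>> byPath = new LinkedHashMap<>();
        for (String[] in : inputs) {
            byPath.computeIfAbsent(in[0], k -> new ArrayList<>()).add(in[1]);
        }
        return byPath;
    }

    public static void main(String[] args) {
        // Same (made-up) Parquet path registered twice with different mappers.
        List<String[]> inputs = Arrays.asList(
            new String[] {"/data/events.parquet", "ColumnsABMapper"},
            new String[] {"/data/events.parquet", "ColumnsCDMapper"});
        System.out.println("one per path:  " + keepLast(inputs));
        System.out.println("list per path: " + keepAll(inputs));
    }
}
```

With the first strategy only the second registration survives for the shared path; with the second, both mappers remain attached to it, which is the behavior this issue asks for.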