nathanmarz / dfs-datastores

Dead-simple vertical partitioning, compression, appends, and consolidation of data on a distributed filesystem.
BSD 3-Clause "New" or "Revised" License

A VersionedTap overrides mapreduce.input.fileinputformat.inputdir without taking into account the previous values #50

Open bqm opened 9 years ago

bqm commented 9 years ago

The VersionedTap's sourceConfInit method overrides mapreduce.input.fileinputformat.inputdir in:

https://github.com/nathanmarz/dfs-datastores/blob/master/dfs-datastores-cascading/src/main/java/com/backtype/cascading/tap/VersionedTap.java#L96

As a result, we can't define a cascading.tap.MultiTap out of the box as a list of VersionedTaps, because each VersionedTap overrides the input paths set by the previous ones.

I am not familiar with the Cascading code, so I am not sure what the best solution to this issue would be. Maybe a VersionedTap could add its own path to the existing list of paths instead of overriding the whole list? As a temporary workaround in my code, I am using a custom MultiTap that redefines the input paths, but that doesn't seem optimal.
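
To make the difference concrete, here is a minimal sketch of "override" versus "append" on the input-dir property. It is not the actual VersionedTap or Hadoop code: the configuration is modeled as a plain Map, the class and method names (InputDirSketch, overridePath, appendPath) are hypothetical, and the two paths are made up. It also assumes the property is a simple comma-separated list and ignores the escaping that Hadoop applies to commas inside paths.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a job configuration; not Hadoop's JobConf.
public class InputDirSketch {
    public static final String INPUT_DIR = "mapreduce.input.fileinputformat.inputdir";

    // Behavior described in the issue: the tap replaces the whole property,
    // so any path set by an earlier tap is lost.
    public static void overridePath(Map<String, String> conf, String path) {
        conf.put(INPUT_DIR, path);
    }

    // Suggested alternative: keep what is already there and append the new path.
    // Assumption: paths are joined with a comma and contain no commas themselves.
    public static void appendPath(Map<String, String> conf, String path) {
        String existing = conf.get(INPUT_DIR);
        if (existing == null || existing.isEmpty()) {
            conf.put(INPUT_DIR, path);
        } else {
            conf.put(INPUT_DIR, existing + "," + path);
        }
    }

    public static void main(String[] args) {
        // Two made-up versioned paths, as if coming from two taps in a MultiTap.
        Map<String, String> overridden = new LinkedHashMap<>();
        overridePath(overridden, "/data/a/1");
        overridePath(overridden, "/data/b/7");
        System.out.println(overridden.get(INPUT_DIR)); // prints "/data/b/7"

        Map<String, String> appended = new LinkedHashMap<>();
        appendPath(appended, "/data/a/1");
        appendPath(appended, "/data/b/7");
        System.out.println(appended.get(INPUT_DIR)); // prints "/data/a/1,/data/b/7"
    }
}
```

With the override variant only the last tap's path survives, which matches the behavior reported above. With the append variant every tap's path stays in the list.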