twitter / scalding

A Scala API for Cascading
http://twitter.com/scalding
Apache License 2.0

Fix 1837, work around sources with scheme fields not matching typedsource fields #1838

Closed johnynek closed 6 years ago

johnynek commented 6 years ago

closes #1837

The issue is that Cascading schemes have fields, but so do TypedSources. Nothing requires those to be in sync, but they should be. In the case of ParquetScheme, the fields are not set: they are Fields.UNKNOWN. Since scalding 0.18 has an optimization that removes unneeded maps, this exposes the bug when we merge an unmapped parquet source with a mapped one.
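To make the mismatch concrete, here is a minimal toy model in plain Scala. It is only a sketch: `FieldDecl`, `UnknownFields`, `NamedFields`, `ToySource`, and `FieldSync` are hypothetical stand-ins invented for illustration, not the Cascading or Scalding API, and the check shown is one conceivable guard rather than the code in this PR.

```scala
// Toy model only: hypothetical stand-ins, not the Cascading or Scalding API.
sealed trait FieldDecl
// Plays the role of Fields.UNKNOWN: the scheme declares no field names.
case object UnknownFields extends FieldDecl
// A concrete, ordered list of field names.
final case class NamedFields(names: List[String]) extends FieldDecl

// A source carries two independent declarations that nothing forces to agree:
// one from its scheme and one from the typed source itself.
final case class ToySource(
  name: String,
  schemeFields: FieldDecl,
  sourceFields: FieldDecl
)

object FieldSync {
  // True when the scheme and the typed source declare the same fields.
  def inSync(s: ToySource): Boolean =
    s.schemeFields == s.sourceFields

  // If the declarations disagree, an optimizer must not drop the map that
  // projects the scheme's output onto the typed source's fields.
  def needsExplicitMap(s: ToySource): Boolean =
    !inSync(s)
}
```

In this model, a parquet-like source with `UnknownFields` on the scheme side and `NamedFields(List("a"))` on the typed side reports `needsExplicitMap == true`, which corresponds to the case where removing a seemingly unneeded map would be unsafe.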

johnynek commented 6 years ago

@ianoc ptal

cc @fwbrasil

ianoc commented 6 years ago

lgtm

great work getting a test going

fwbrasil commented 6 years ago

LGTM. Thank you!