LinkedInAttic / white-elephant

Hadoop log aggregator and dashboard

Parse job configurations for easier usage analytics #14

Closed wagnermarkd closed 10 years ago

wagnermarkd commented 10 years ago

This pull request adds functionality to parse job configurations into an Avro map. The parsing runs as an additional job at the beginning of ProccessLogs. As part of this work, a "CombineDocumentFileFormat" is introduced, which returns key-value pairs of the form <filename, document bytes>.
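
To illustrate the parsing step, here is a minimal sketch of turning one <filename, document bytes> value into a map of configuration properties. It is not the code from this pull request: the class name `JobConfParser` and its `parse` method are hypothetical, it uses only the JDK's XML parser, and it produces a plain `java.util.Map` rather than an Avro map. It assumes the document bytes hold a standard Hadoop job configuration file, that is, `<property>` elements each containing a `<name>` and a `<value>`.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of white-elephant: parses the bytes of one
// job configuration XML document into an ordered name -> value map.
final class JobConfParser {

    static Map<String, String> parse(byte[] documentBytes) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(documentBytes));

        Map<String, String> conf = new LinkedHashMap<>();
        NodeList properties = doc.getElementsByTagName("property");
        for (int i = 0; i < properties.getLength(); i++) {
            Element property = (Element) properties.item(i);
            NodeList names = property.getElementsByTagName("name");
            NodeList values = property.getElementsByTagName("value");
            // Skip malformed entries that lack a name or a value.
            if (names.getLength() == 0 || values.getLength() == 0) {
                continue;
            }
            conf.put(names.item(0).getTextContent().trim(),
                     values.item(0).getTextContent());
        }
        return conf;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document; the property names are only examples.
        String xml =
            "<configuration>"
            + "<property><name>mapred.job.name</name><value>example</value></property>"
            + "<property><name>mapred.reduce.tasks</name><value>0</value></property>"
            + "</configuration>";
        Map<String, String> conf = parse(xml.getBytes(StandardCharsets.UTF_8));
        conf.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

In a real job, a map like this would presumably be copied into the Avro record next to the filename key; that wiring is omitted here because it depends on the schema used in the PR.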

matthayes commented 10 years ago

Looks pretty good overall, thanks! Just wondering about the number of reducers being set to 0. What does this do?

wagnermarkd commented 10 years ago

I've removed the unused field. Let me know if there's anything else.