kite-sdk / kite-examples

Kite SDK Examples
Apache License 2.0

CDK-534: Switch demo crunch jobs to use views. #7

Closed rdblue closed 9 years ago

rdblue commented 9 years ago

This supports the same command-line arguments as the previous version, but replaces the getLatestPartition method with viewFromUri. The new version relativizes the URI and then uses the key/value pairs in the resulting path to create a view.
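For readers following along, here is a rough sketch of the relativize-then-parse idea using only the JDK. It is not the code from this PR: the class name, the method name `partitionConstraints`, the example URIs, and the assumption that the URI is relativized against the dataset root are all hypothetical. In the real tool, each key/value pair would presumably be applied as a constraint on a Kite view rather than collected into a map.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not the viewFromUri implementation from the PR.
class PartitionUriSketch {

    // Relativize a partition URI against the dataset root and read the
    // key=value path segments in order. Both URIs are assumed to be absolute.
    static Map<String, String> partitionConstraints(URI datasetRoot, URI partition) {
        URI relative = datasetRoot.relativize(partition);
        // URI.relativize returns the original (absolute) URI when the
        // partition is not located under the root.
        if (relative.isAbsolute()) {
            throw new IllegalArgumentException(partition + " is not under " + datasetRoot);
        }
        Map<String, String> constraints = new LinkedHashMap<>();
        for (String segment : relative.getPath().split("/")) {
            if (segment.isEmpty()) {
                continue; // empty path or trailing slash
            }
            int eq = segment.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Not a key=value segment: " + segment);
            }
            constraints.put(segment.substring(0, eq), segment.substring(eq + 1));
        }
        return constraints;
    }

    public static void main(String[] args) {
        URI root = URI.create("hdfs://nn/data/events");
        URI partition = URI.create("hdfs://nn/data/events/year=2014/month=10/day=02");
        // Prints {year=2014, month=10, day=02}
        System.out.println(partitionConstraints(root, partition));
    }
}
```

Passing the dataset root itself yields an empty map, which would correspond to an unconstrained view over the whole dataset.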

The 'latest' partition is now time-based. Previously, the newest partition in the data was always used, which was incorrect and led to duplicate sessions when the Oozie job ran with no data. As a result, the demo-crunch tool differs slightly from the demo-oozie tool: demo-crunch processes everything before the current minute, with no lower bound. That ensures that running with no arguments always processes some data rather than aborting because there was no traffic in the last minute.
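To make the "everything before the current minute, no lower bound" rule concrete, here is a minimal sketch with plain java.time. The class and method names are hypothetical and the timestamps are assumed to be epoch milliseconds; the actual tool expresses this bound through a Kite view rather than a per-record check.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration of the time-based selection described above.
class LatestWindowSketch {

    // Exclusive upper bound: the start of the minute that contains 'now'.
    static long upperBoundMillis(Instant now) {
        return now.truncatedTo(ChronoUnit.MINUTES).toEpochMilli();
    }

    // A record is selected if it is older than the current minute.
    // There is deliberately no lower bound.
    static boolean inWindow(long timestampMillis, Instant now) {
        return timestampMillis < upperBoundMillis(now);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2014-10-02T12:34:56Z");
        long bound = upperBoundMillis(now); // 2014-10-02T12:34:00Z
        System.out.println(inWindow(bound - 1, now)); // true: just before the current minute
        System.out.println(inWindow(bound, now));     // false: inside the current minute
        System.out.println(inWindow(0L, now));        // true: arbitrarily old data still qualifies
    }
}
```

Because the bound depends only on the clock, a run with no recent traffic still selects older data instead of aborting.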

This depends on kite-data-core fixes for CDK-532 and CDK-536.

joey commented 9 years ago

Is this PR still needed?

rdblue commented 9 years ago

No, this was done in https://github.com/kite-sdk/kite-examples/commit/913feca20e42c6599f3a959cf82a73a253a004e8