kite-sdk / kite

Kite SDK
http://kitesdk.org/docs/current/
Apache License 2.0

KITE-1118: Optimize writing data to S3 #442

Status: Closed (noslowerdna closed this pull request 8 years ago)

noslowerdna commented 8 years ago

Pull request for https://issues.cloudera.org/browse/KITE-1118

For importing about 20 GB of data into S3, this provided approximately an 8x job speedup, from 40 min to 5 min.

mkwhitacre commented 8 years ago

+1

rbrush commented 8 years ago

Made one suggestion above, but otherwise this looks good.

noslowerdna commented 8 years ago

@rbrush I like that suggestion, changed: https://github.com/noslowerdna/kite/commit/8b28ae7f4187871a94c919d2f8b68514b4287a2b

Thanks for the review.

rbrush commented 8 years ago

+1, looks great!

mkwhitacre commented 8 years ago

Re-kicked off the "default" profile build.

mkwhitacre commented 8 years ago

Build passed. Merging.