tensorflow / ecosystem

Integration of TensorFlow with other open-source frameworks
Apache License 2.0

Support compression for output in MR v1. #147

Open shishaochen opened 4 years ago

shishaochen commented 4 years ago

To align with org.tensorflow.hadoop.io.TFRecordFileOutputFormat, this change enables output compression in org.tensorflow.hadoop.io.TFRecordFileOutputFormatV1 as well. To turn on compression with the old MapReduce (MR v1) API, pass the following options when submitting the job:

-Dmapred.output.compress=true
-Dmapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec
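
One way to sanity-check a part file written with the Gzip options above is to gunzip it and walk the TFRecord framing (8-byte little-endian length, masked CRC32C of the length, payload, masked CRC32C of the payload). The sketch below uses only the JDK (Java 9+ for `java.util.zip.CRC32C`). It assumes the codec wraps the whole file as a single gzip stream, which this description does not state explicitly; the class name `GzipTFRecordCheck` and its methods are hypothetical and not part of this pull request.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32C;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the PR: reads TFRecords from a gzip stream.
final class GzipTFRecordCheck {
    private static final ByteOrder LE = ByteOrder.LITTLE_ENDIAN;

    // TFRecord stores CRC32C "masked": rotate right by 15 bits, then add a constant.
    static int maskedCrc(byte[] bytes) {
        CRC32C crc = new CRC32C();
        crc.update(bytes, 0, bytes.length);
        int c = (int) crc.getValue();
        return ((c >>> 15) | (c << 17)) + 0xa282ead8;
    }

    static byte[] le32(int v) {
        return ByteBuffer.allocate(4).order(LE).putInt(v).array();
    }

    // Writes one framed record; used here only to build sample input.
    static void writeRecord(OutputStream out, byte[] data) throws IOException {
        byte[] len = ByteBuffer.allocate(8).order(LE).putLong(data.length).array();
        out.write(len);
        out.write(le32(maskedCrc(len)));
        out.write(data);
        out.write(le32(maskedCrc(data)));
    }

    static void check(byte[] bytes, byte[] storedLe) throws IOException {
        int stored = ByteBuffer.wrap(storedLe).order(LE).getInt();
        if (stored != maskedCrc(bytes)) {
            throw new IOException("TFRecord CRC mismatch");
        }
    }

    // Assumption: the whole file is one gzip stream containing framed records.
    static List<byte[]> readAll(InputStream gzipped) throws IOException {
        List<byte[]> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new GZIPInputStream(gzipped))) {
            byte[] len = new byte[8];
            byte[] crc = new byte[4];
            while (true) {
                int first = in.read();
                if (first < 0) {
                    break; // clean end of stream between records
                }
                len[0] = (byte) first;
                in.readFully(len, 1, 7);
                in.readFully(crc);
                check(len, crc);
                long n = ByteBuffer.wrap(len).order(LE).getLong();
                byte[] data = new byte[Math.toIntExact(n)];
                in.readFully(data);
                in.readFully(crc);
                check(data, crc);
                records.add(data);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Build a small in-memory sample instead of reading a real part file.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            writeRecord(gz, "first".getBytes(StandardCharsets.UTF_8));
            writeRecord(gz, "second".getBytes(StandardCharsets.UTF_8));
        }
        List<byte[]> records = readAll(new ByteArrayInputStream(buf.toByteArray()));
        System.out.println("records read: " + records.size());
    }
}
```

To check real job output, `readAll` could be handed a `FileInputStream` over a part file copied out of HDFS; a CRC mismatch or a gzip header error would suggest the file was not written the way this sketch assumes.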

This pull request supplements #61.