@johnynek and I paired to add configuration that limits the number of partitions and reducers used when running a job with scalding-spark. We hope this will reduce the total size of the results sent back to the driver, which is currently causing us some pain.