Closed by dieu 7 years ago
@piyushnarang I'm not sure we can run these tests, because Hadoop itself will raise an error.
Ok, can we test it manually then to ensure it works as expected?
@piyushnarang I added tests for FileNotFound
2017-03-09 12:39:41,848 WARN [pool-47-thread-1] reducer_estimation.InputSizeReducerEstimator$ (InputSizeReducerEstimator.scala:estimateReducersWithoutRounding(34)) - InputSizeReducerEstimator unable to estimate reducers; cannot compute size of one of (usually it's memory taps or files not found):
- Hfs["TextLine[['offset', 'line']->[ALL]]"]["file.txt"]
2017-03-09 12:39:41,860 INFO [pool-47-thread-1] flow.FlowStep (BaseFlowStep.java:logInfo(834)) - [com.twitter.scalding.r...] starting step: (1/1) counts.tsv
2017-03-09 12:39:42,402 INFO [flow com.twitter.scalding.reducer_estimation.SimpleFileNotFoundJob] flow.Flow (BaseFlow.java:logInfo(1378)) - [com.twitter.scalding.r...] stopping all jobs
2017-03-09 12:39:42,403 INFO [flow com.twitter.scalding.reducer_estimation.SimpleFileNotFoundJob] flow.FlowStep (BaseFlowStep.java:logInfo(834)) - [com.twitter.scalding.r...] stopping: (1/1) counts.tsv
2017-03-09 12:39:42,408 INFO [flow com.twitter.scalding.reducer_estimation.SimpleFileNotFoundJob] flow.Flow (BaseFlow.java:logInfo(1378)) - [com.twitter.scalding.r...] stopped all jobs
2017-03-09 12:39:42,956 ERROR [ResourceManager Event Processor] resourcemanager.ResourceManager (ResourceManager.java:run(594)) - Returning, interrupted : java.lang.InterruptedException
@isnotinvain I rewrote it in a safer way.
@dieu before I forget, you mentioned this would break all normal HFS instances, so we need to handle that too
@isnotinvain no, we handle existing Hfs instances. That's why I wrap the getSize call on the tap in Try: Cascading's Hfs doesn't handle glob patterns.
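A minimal sketch of the Try-based guard, not the actual code in this PR. It assumes the size lookup throws when a path is missing or cannot be resolved; `sizeOf` and `totalSize` are hypothetical names, and `java.nio.file.Files.size` stands in for the tap's size call so the example runs without Hadoop.

```scala
import java.nio.file.{Files, Paths}
import scala.util.Try

object SizeEstimateSketch {
  // Hypothetical stand-in for the tap's size lookup.
  // Files.size throws if the path does not exist; Try turns that into None.
  def sizeOf(path: String): Option[Long] =
    Try(Files.size(Paths.get(path))).toOption

  // Returns the total only when every input size is known.
  // A single unknown size yields None, so the caller can skip estimation
  // and log a warning instead of failing the job.
  def totalSize(paths: Seq[String]): Option[Long] =
    paths.foldLeft(Option(0L)) { (acc, p) =>
      for (sum <- acc; size <- sizeOf(p)) yield sum + size
    }
}
```

With this shape, a missing input such as `file.txt` produces `None` rather than an exception, which matches the "unable to estimate reducers" warning in the log above.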
@piyushnarang / @isnotinvain / @johnynek please review.
👍
@dieu should we add a test for the file not found scenario? Or are they already covered?