landsat-pds / landsat_ingestor

Scripts and other artifacts for Landsat data ingestion into Amazon public hosting.
Apache License 2.0

catch corrupt tarballs. move to corrupt queue #10

Closed: kapadia closed this 9 years ago

kapadia commented 9 years ago

@warmerdam These changes catch any error raised by splitter.split (e.g. a corrupt tarball). When that happens, the scene is moved to the tarq_corrupt directory so it can be retried later.
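
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `split`, `process_scene`, and `demo` are hypothetical names, `split` is only a stand-in for splitter.split, and a local directory stands in for the tarq_corrupt queue.

```python
import os
import shutil
import tarfile
import tempfile


def split(tar_path):
    """Hypothetical stand-in for splitter.split: open the tarball and list its members."""
    with tarfile.open(tar_path) as tar:
        return tar.getnames()


def process_scene(tar_path, corrupt_dir, split_fn=split):
    """Return True on success; on any error from split_fn, move the tarball to corrupt_dir."""
    try:
        split_fn(tar_path)
        return True
    except Exception as exc:
        # Catch arbitrary errors (not only tarfile errors) and park the file for a later retry.
        os.makedirs(corrupt_dir, exist_ok=True)
        shutil.move(tar_path, os.path.join(corrupt_dir, os.path.basename(tar_path)))
        print("moved %s to corrupt queue: %s" % (tar_path, exc))
        return False


def demo():
    """Write a file that is not a valid tarball and run it through process_scene."""
    work = tempfile.mkdtemp()
    bad = os.path.join(work, "scene.tar.gz")
    with open(bad, "wb") as f:
        f.write(b"this is not a tarball")
    corrupt_dir = os.path.join(work, "tarq_corrupt")
    ok = process_scene(bad, corrupt_dir)
    return ok, os.path.exists(bad), os.listdir(corrupt_dir)
```

Catching `Exception` broadly is deliberate in this sketch, since the description says arbitrary errors from the splitter should send the scene to the corrupt queue.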

warmerdam commented 9 years ago

Other than the one point, LGTM.

kapadia commented 9 years ago

@warmerdam https://github.com/landsat-pds/landsat_ingestor/commit/36b4a3a3e6d3af51bfc5aa7cc491cf95d5c3ac3e checks whether the source is s3queue. If so, it moves the corrupt file to the tarq_corrupt directory. I'm not sure whether this implementation correctly addresses the problem.

Are there two runs of l8_process_scene.process when downloading from USGS? I presume the first run downloads the file from USGS and places it in the S3 queue, and a second run then splits bands, tiles, etc.

warmerdam commented 9 years ago

@kapadia - l8_process_run.py is used in --queue mode to get the scenes from USGS and dump them in /tarq/. Then one l8_process_scene is run for each file in the queue. l8_process_run.py can also invoke l8_process_scene directly, but we don't currently use it that way.
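
The two-stage flow described above, combined with the s3queue check from the earlier comment, might look something like the sketch below. All function names (`queue_scenes`, `drain_queue`, `demo`) and the `fetch` and `process` callables are hypothetical, and local directories stand in for the /tarq/ and tarq_corrupt locations; the real scripts may be organised differently.

```python
import os
import shutil
import tempfile


def queue_scenes(scene_ids, tarq_dir, fetch):
    """Stage 1 (queue mode): fetch each scene and drop it into the queue directory."""
    os.makedirs(tarq_dir, exist_ok=True)
    for scene_id in scene_ids:
        path = os.path.join(tarq_dir, scene_id + ".tar.gz")
        with open(path, "wb") as f:
            f.write(fetch(scene_id))


def drain_queue(tarq_dir, corrupt_dir, process, source="s3queue"):
    """Stage 2: run one processing pass per queued file."""
    results = {}
    for name in sorted(os.listdir(tarq_dir)):
        path = os.path.join(tarq_dir, name)
        try:
            process(path)
            results[name] = "ok"
        except Exception:
            # Only files that came from the queue are moved aside for a later retry.
            if source == "s3queue":
                os.makedirs(corrupt_dir, exist_ok=True)
                shutil.move(path, os.path.join(corrupt_dir, name))
            results[name] = "corrupt"
    return results


def demo():
    """Queue one good and one bad scene, then process the queue."""
    work = tempfile.mkdtemp()
    tarq = os.path.join(work, "tarq")
    corrupt = os.path.join(work, "tarq_corrupt")

    def fetch(scene_id):
        return b"bad" if scene_id == "B" else b"good"

    def process(path):
        with open(path, "rb") as f:
            if f.read() == b"bad":
                raise ValueError("corrupt tarball")

    queue_scenes(["A", "B"], tarq, fetch)
    results = drain_queue(tarq, corrupt, process)
    return results, sorted(os.listdir(tarq)), sorted(os.listdir(corrupt))
```

Because downloading and processing are separate stages here, a failure in the second stage only needs to move the already-queued file, which is why the check on `source` is enough in this sketch.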

kapadia commented 9 years ago

@warmerdam Given that logic, this PR should be good to merge.