[x] @mliukis to get @markf6's geojson catalogue code running on the new directory structure and remove duplicate granules (same image pairs processed twice) from the geojson (due 4/30/2021):
[x] Allow for new directory structure
[x] Remove duplicate optical granules
[x] Post skipped and used granule lists to S3
[x] Read list of granules from S3
[x] Warn about Radar-format granules when removing duplicates
[x] Wildcard to iterate over all sub-directories for NetCDF (.nc) files to build the granule list (see the sketch after this list)
[ ] Optional: Dask parallelization to create catalogs
[x] Write access to the S3 bucket without AWS credentials is broken (contacted Evelin 4/20/2021, filed JPL Cloud blocker request on 4/21/2021)
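A minimal sketch of the granule listing and catalog step, assuming an s3fs recursive glob over the bucket; the prefix and the `catalog_entry` helper are placeholders, and the optional Dask step simply maps catalog building over the listed paths in parallel:

```python
import dask.bag as db
import s3fs

# Placeholder prefix for illustration; a real run would point at the
# production granule location.
BUCKET = "its-live-data"
PREFIX = "velocity_image_pair"

fs = s3fs.S3FileSystem(anon=True)

# Wildcard glob across all sub-directories for NetCDF granules.
granule_paths = fs.glob(f"{BUCKET}/{PREFIX}/**/*.nc")

def catalog_entry(path):
    # Placeholder: a real entry would open the granule and read its metadata.
    return {"type": "Feature", "properties": {"granule_url": f"s3://{path}"}}

# Optional Dask parallelization: build catalog entries across partitions.
features = db.from_sequence(granule_paths, npartitions=32).map(catalog_entry).compute()
```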
[x] @mliukis to get datacube code running in AWS Batch or Lambda for global production (due 5/21/2021)
[x] Create Docker image for the code (due 04/30/2021)
[x] Create AWS ECR repository and push Docker image (due 04/30/2021)
[x] Configure AWS Batch (due 05/21/2021):
[x] Asked JPL Cloud Help to create a role: filed ticket Cloud-302 on 04/28/2021; filed critical ticket Cloud-316 on 05/04/2021 to escalate Cloud-302
[x] The role "its-live-s3-access" for Batch processing has been created in kh9 account (05/04/2021)
[x] Evelin is setting up ECR permissions for the account hosting the datacube-dev.jpl.nasa.gov EC2 instance so we are able to push Docker images (05/05/2021). Abandon this account once the new EC2 instance and S3 buckets are available.
[x] Danny set up new EC2 instance and S3 bucket (kh9-1) in kh9 account, ran datacube test case OK
[x] Configure Batch example job (due 05/27/21)
[x] Configure Batch job for the datacube generation (due 06/04/21):
[x] Write driver script to spawn Batch jobs to create datacubes (due 06/08/2021) (see the Batch submission sketch below)
[x] Define datacube polygons (due 5/14/2021)
@alex-s-gardner: Take the regional shapefile... convert polygon boundaries into region-specific projections... take minimum and maximum extents... round minimums down and maximums up to the nearest 100 km, then find the grid points that fall within 100 x 100 km rectangles for each cube.
@markf6: short version is finding all the 100 x 100 km squares (on an even 100 km grid) that fit in a region, in that region's projection (see the grid sketch below)
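A minimal sketch of the grid definition described above, assuming geopandas/shapely; the function name and inputs are illustrative:

```python
import math

import geopandas as gpd
from shapely.geometry import box

GRID = 100_000  # 100 km in projected meters

def datacube_polygons(shapefile, epsg):
    """Enumerate 100 x 100 km squares on an even 100 km grid covering a region."""
    region = gpd.read_file(shapefile).to_crs(epsg=epsg)  # regional projection
    xmin, ymin, xmax, ymax = region.total_bounds
    # Round minimums down and maximums up to the nearest 100 km.
    xmin, ymin = (math.floor(v / GRID) * GRID for v in (xmin, ymin))
    xmax, ymax = (math.ceil(v / GRID) * GRID for v in (xmax, ymax))
    cubes = []
    for x in range(int(xmin), int(xmax), GRID):
        for y in range(int(ymin), int(ymax), GRID):
            cell = box(x, y, x + GRID, y + GRID)
            # Keep only grid squares that actually intersect the region.
            if region.geometry.intersects(cell).any():
                cubes.append(cell)
    return cubes
```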
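A companion sketch of the driver that spawns one Batch job per cube; the queue name, job definition, and container command are placeholders, not the actual configuration:

```python
import boto3

batch = boto3.client("batch")

# Placeholder queue and job definition names for illustration.
JOB_QUEUE = "datacube-queue"
JOB_DEFINITION = "datacube-generation"

def submit_cube_job(cube_id, cube_definition_url):
    """Spawn one AWS Batch job to generate a single datacube."""
    return batch.submit_job(
        jobName=f"datacube-{cube_id}",
        jobQueue=JOB_QUEUE,
        jobDefinition=JOB_DEFINITION,
        containerOverrides={
            # Hypothetical entry point and flag for the datacube generator.
            "command": ["python", "make_cube.py", "--cube", cube_definition_url],
        },
    )
```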
[x] @mliukis to work with ASF to write code to move production files from ASF accounts to the JPL S3 bucket (due 5/7/2021)
[x] datacube-dev role does not allow access to the ASF S3 bucket; sent request to Evelin (05/05/2021)
[x] Transferred test set of 1382 granules to the datacube-dev S3 bucket (05/11/2021)
[x] Transferred test set of 1382 granules to the its-live-data S3 bucket, created geojson catalog file (05/13/2021)
[x] Transferred 390339 granules to the its-live-data S3 bucket (06/30/2021)
[x] Fixed time stamp for all img_pair_info.acquisition_date_img[12]
[x] Compress all granules, as compression was not used when storing fixed files to the S3 bucket (see the compression sketch after this list)
[x] During data transfer from ASF, update the image acquisition times on the L8 data, as this can only be added by cross-referencing a database external to the image data (due 05/28/2021)
[x] Filed SA team ticket to get public access to s3://usgs-landsat (05/19/2021)
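A minimal sketch of the granule re-compression, assuming granules are staged locally and rewritten in place with xarray; `complevel=2` is an illustrative choice:

```python
import xarray as xr

def rewrite_compressed(path):
    """Re-save a granule with zlib compression applied to every data variable."""
    ds = xr.open_dataset(path).load()  # read fully into memory
    ds.close()  # release the file handle before overwriting in place
    encoding = {name: {"zlib": True, "complevel": 2} for name in ds.data_vars}
    ds.to_netcdf(path, encoding=encoding)
```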
[ ] @mliukis once production files start to arrive, build and update datacubes
[x] Implement cube updates (06/18/2021)
[x] Add missing v_error in Optical legacy format granules (06/04/2021) (not done: [vx|vy].stable_shift is missing for some granules, and datacube generation runs 5x slower)
[x] Set chip_size_height.values = chip_size_width.values for optical legacy granules when chip_size_height.values are not set (see the sketch after this list)
[x] Add new data variables per Yang's latest code changes (06/09/2021)
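A minimal sketch of the chip_size fix, assuming unset chip_size_height values decode to NaN (if they decode to 0 instead, the mask changes accordingly):

```python
import xarray as xr

def fix_chip_size(ds: xr.Dataset) -> xr.Dataset:
    """Copy chip_size_width into chip_size_height wherever height is unset."""
    height = ds["chip_size_height"]
    # Keep height where it is set; fall back to width elsewhere.
    ds["chip_size_height"] = height.where(height.notnull(), ds["chip_size_width"])
    return ds
```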
[x] @mliukis to get projection code running on AWS
[x] Create Docker image for the code
[ ] Optional (priority TBD; create a new issue if needed): write code to determine which EPSG code to project each granule into (see the EPSG selection sketch below)
[x] @mliukis complete re-projection code (due 4/16/2021)
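A minimal sketch of the EPSG selection, keyed off a granule's center point; the latitude cut-offs are assumptions for illustration, not project-confirmed thresholds:

```python
def epsg_for_granule(lat, lon):
    """Pick a target EPSG code from a granule's center point: polar
    stereographic near the poles, otherwise the local UTM zone."""
    if lat > 55.0:
        return 3413  # WGS 84 / NSIDC Sea Ice Polar Stereographic North
    if lat < -56.0:
        return 3031  # WGS 84 / Antarctic Polar Stereographic
    zone = int((lon + 180.0) // 6.0) + 1
    return (32600 if lat >= 0 else 32700) + zone  # WGS 84 / UTM zone
```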