smohiudd opened this issue 4 months ago
Putting the discovery-items config within s3://<EVENT_BUCKET>/collections/ in the following format (see https://github.com/US-GHG-Center/ghgc-data/blob/add/lpdaac-dataset-scheduled-config/ingestion-data/discovery-items/scheduled/emit-ch4plume-v1-items.json) will trigger the discovery and subsequent ingestion of the collection items based on the schedule attribute.
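As a rough illustration, here is a minimal sketch of such a scheduled config, assuming the field names commonly used in discovery-items configs (collection, bucket, prefix, filename_regex) and a cron-style value for the schedule attribute; the linked emit-ch4plume-v1 file is the authoritative format:

```json
{
  "collection": "emit-ch4plume-v1",
  "bucket": "<DATA_BUCKET>",
  "prefix": "emit-ch4plume-v1/",
  "filename_regex": ".*\\.tif$",
  "schedule": "0 0 * * 0"
}
```

The schedule value above is an illustrative weekly cron expression; whatever format the scheduled discovery DAG actually accepts is what should go there.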
mcp-prod will need a new release of airflow to include automated ingestion
aws s3 ls s3://covid-eo-data/OMNO2d_HRMDifference/
aws s3 ls s3://covid-eo-data/OMNO2d_HRM/
Update: We have decided to run these weekly instead of bi-weekly
I added the scheduled collection configs from veda-data #177 to mcp-test and mcp-production
Description
NO2 (#89) and Geoglam (#167, #173) datasets require monthly ingestion as new assets are created. This is currently a manual process but should be automated.

veda-data-airflow has a feature that allows scheduled ingestion by creating dataset-specific DAGs. The files must still be transferred to the collection s3 bucket, and a json config must be uploaded to the airflow event bucket. Here is an example json:
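A minimal sketch for the no2-monthly case, assuming the same field names as above and a monthly cron value for the schedule attribute; the names and values are illustrative rather than the exact payload the DAG expects:

```json
{
  "collection": "no2-monthly",
  "bucket": "veda-data-store-staging",
  "prefix": "no2-monthly/",
  "filename_regex": ".*\\.tif$",
  "schedule": "0 0 1 * *"
}
```

Equivalent configs would be needed for no2-monthly-diff and for geoglam.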
Acceptance Criteria

- Transfer of no2-monthly and no2-monthly-diff from the s3://covid-eo-data bucket to s3://veda-data-store-staging and s3://veda-data-store using the MWAA transfer dag (see the sketch after this list)
- Scheduled discovery config (no2-monthly, no2-monthly-diff) in the mwaa event bucket for staging (UAH) and production (MCP)
- Scheduled discovery config (geoglam) in the mwaa event bucket for staging (UAH) and production (MCP)
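For the transfer step, a hypothetical payload sketch for the MWAA transfer dag; the field names (origin_bucket, origin_prefix, target_bucket, dry_run) and the use of the OMNO2d_HRM prefix listed above are assumptions for illustration, and the actual schema is whatever the transfer DAG in veda-data-airflow defines:

```json
{
  "collection": "no2-monthly",
  "origin_bucket": "covid-eo-data",
  "origin_prefix": "OMNO2d_HRM/",
  "target_bucket": "veda-data-store-staging",
  "filename_regex": ".*\\.tif$",
  "dry_run": true
}
```

A second run with target_bucket set to veda-data-store would cover the production copy, and the same pattern would apply to no2-monthly-diff (the OMNO2d_HRMDifference prefix).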