NASA-IMPACT / veda-data-airflow

Airflow implementation of ingest pipeline for VEDA STAC data


This repo houses function code and deployment code for producing cloud-optimized data products and STAC metadata for interfaces such as

Project layout

Fetching Submodules

First time setting up the repo: git submodule update --init --recursive

Afterwards: git submodule update --recursive --remote



See get-docker


See terraform-getting-started


See getting-started-install


This project uses Terraform modules to deploy Apache Airflow and related AWS resources using Amazon's managed Airflow provider.

Make sure that environment variables are set

.env.example contains the environment variables that are necessary to deploy. Copy this file and update its contents with actual values. The deploy script sources this file during deployment when it is passed on the command line:

# Copy .env.example to a new file
$ cp .env.example .env
# Fill in values for the environment variables

# Init terraform modules
$ bash ./scripts/ .env <<< init

# Deploy
$ bash ./scripts/ .env <<< deploy

Note: Be careful not to check in .env (or whatever you called your env file) when committing work.
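Before deploying, it can help to confirm that every variable listed in .env.example has a value in your env file. The helper below is a minimal sketch and is not part of this repo: the function name missing_env_keys is hypothetical, and it assumes both files use plain KEY=VALUE lines (no "export" prefix, no quoting rules applied).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with this repo): print each variable that
# is listed in the example file but is missing or empty in the env file.
# Assumes plain KEY=VALUE lines; comments (#...) and blank lines are skipped.
missing_env_keys() {
  local example_file="$1" env_file="$2" key value
  # "|| [ -n "$key" ]" keeps a final line that has no trailing newline.
  while IFS='=' read -r key _ || [ -n "$key" ]; do
    case "$key" in
      ''|\#*) continue ;;
    esac
    # Use the last assignment of the key in the env file, if there is one.
    value="$(sed -n "s/^${key}=//p" "$env_file" | tail -n 1)"
    [ -n "$value" ] || printf '%s\n' "$key"
  done < "$example_file"
}
```

Running missing_env_keys .env.example .env prints one variable name per line for anything still unset, and prints nothing when the file is complete. To confirm that the env file will not be committed, git check-ignore -q .env exits with status 0 only if git is ignoring the file.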

Currently, the client ID and domain of an existing Cognito user pool programmatic client must be supplied in configuration as VEDA_CLIENT_ID and VEDA_COGNITO_DOMAIN (the veda-auth project can be used to deploy a Cognito user pool and client). To dispense auth tokens via the workflows API Swagger docs, an administrator must add the ingest API Lambda URL to the allowed callbacks of the Cognito client.

Gitflow Model

VEDA pipeline gitflow


This project is licensed under Apache 2. See the LICENSE file for more details.