The DCP Upload Service provides a file staging and validation facility for the DCP. Upload Areas are created/deleted using a REST API, which is secured so only the DCP Ingestion Service may use it. Files are staged into AWS S3 using the HCA CLI, where the Upload Service then computes checksums for them. The validation service runs Docker images against files.
Is a Lambda Chalice/Connexion/Flask app that presents the Upload Service REST API.
The API is defined using an OpenAPI 2.0 Specification (Swagger) in config/upload-api.yml
.
A SQS that receives messages on S3 ObjectCreated events and then triggers the upload checksum lambda function
Is a lambda function triggered by SQS events that computes checksums for uploaded files.
Is an AWS Batch installation
<img src="images/upload_service_architecture.png" alt="Upload Service System Architecture Diagram" title="Upload Service System Architecture Diagram" width="100%" height="100%" />
Image by Sam Pierson
Check out the upload service repo:
# IMPORTANT use --recursive
git clone --recursive git@github.com:HumanCellAtlas/upload-service.git
cd upload-service
Install packages. I use virtualenv
, but you don’t have to. This is what it looks for me:
mkdir venv # I have venv/ in my global .gitignore
virtualenv --python python3.6 venv/36
source venv/36/bin/activate
pip install -r requirements-dev.txt
Do this once:
cp config/environment.dev.example config/environment.dev
Then edit as necessary.
export AWS_PROFILE=hca
source config/environment
make test
Tests may also be run offline if you have a PostgreSQL server running locally.
You must setup an upload_local
postgres database first, e.g.:
brew install postgres
# follow instructions to start postgres server
createuser
createdb upload_local
DEPLOYMENT_STAGE=local make db/migrate
To run tests offline use the local
environment:
export DEPLOYMENT_STAGE=local
source config/environment
make test
source config/environment
scripts/upload-api
export DEPLOYMENT_STAGE=dev
source config/environment
make functional-tests
Deployment is typically performed by Gitlab. The full instructions on how to deploy to each environment (i.e. integration, prod, etc.) can be found here.
To manually deploy to e.g. the staging deployment:
export enc_password="<password-used-to-encrypt-deployment-secrets>"
scripts/deploy.sh staging
UNDER CONSTRUCTION - NOTHING TO SEE HERE
validation/ami/README.md
.default
and inbound-ssh-from-hca-teams
.upload-validator-<stage>
and role upload-validator-<stage>
.scripts/batchctl.py staging setup
cd validation/docker-images/base-alpine-python36
make release