Data explorer

Overview

Data Explorer lets you explore a dataset. The code, both in this repo and in the data-explorer-indexers repo, is dataset-agnostic; all dataset-specific configuration lives in config files.

Examples:

Quickstart

Run local Data Explorer with the 1000 Genomes dataset:

Run local Data Explorer with a custom dataset

Architecture overview

The basic flow:

  1. Index the dataset into Elasticsearch using an indexer from https://github.com/DataBiosphere/data-explorer-indexers
  2. Run the servers in this repo to serve the Data Explorer UI
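
To make step 1 more concrete, here is a minimal Python sketch of how dataset rows could be turned into a request body for Elasticsearch's _bulk endpoint. It is an illustration only and is not taken from the data-explorer-indexers repo; the index name my_dataset, the participant_id field, and the sample rows are all hypothetical.

```python
import json


def build_bulk_body(index_name, rows, id_field):
    """Build a newline-delimited JSON body for Elasticsearch's _bulk API.

    Each document becomes two lines: an action line naming the target index
    and document id, followed by the document source itself.
    """
    lines = []
    for row in rows:
        action = {"index": {"_index": index_name, "_id": str(row[id_field])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(row))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical rows; a real dataset would come from the configured source.
rows = [
    {"participant_id": "P1", "age": 34},
    {"participant_id": "P2", "age": 51},
]
body = build_bulk_body("my_dataset", rows, "participant_id")
```

A body like this could then be sent with a POST to localhost:9200/_bulk using the Content-Type application/x-ndjson. Depending on the Elasticsearch version, the action line may also need a _type field, so treat this as a starting point rather than a drop-in indexer.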

GCP deployment:

[Figure: GCP deployment architecture]

For local development, an nginx reverse proxy is used to get around CORS:

[Figure: Local deployment architecture]

Want to try out Data Explorer for your dataset?

Here's one possible flow.

Sample file support

If your dataset includes sample files (VCF, BAM, etc.), then Data Explorer will have:

Time series support

If your dataset has longitudinal data, then Data Explorer will show time-series visualizations:

Development

Updating the API using swagger-codegen

We use swagger-codegen to generate API code from the definition in api/api.yaml, for both the API server and the UI. Whenever the API is updated, follow these steps to update the server implementations:

One-time setup

One-time setup for Save in Terra feature

The Save in Terra feature temporarily stores data in a GCS bucket.

Testing

Every commit on a remote branch kicks off all tests on CircleCI.

API server unit tests use pytest and tox. To run locally:

virtualenv ~/virtualenv/tox
source ~/virtualenv/tox/bin/activate
pip install tox
cd api && tox -e py35

End-to-end tests use Puppeteer and jest-puppeteer. To run locally:

# Optional: ensure the elasticsearch index is clean
docker-compose up --build -d elasticsearch
curl -XDELETE localhost:9200/_all
# Start the rest of the services
docker-compose up --build
cd ui && npm test
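
Elasticsearch can take a few seconds to accept connections after its container starts, so the curl call above may fail if it runs too early. The following Python sketch polls until the server responds. It is an optional helper and not part of this repo; the function names, the default URL http://localhost:9200, and the retry settings are assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def elasticsearch_is_up(url="http://localhost:9200"):
    """Return True if an HTTP GET to the given URL succeeds with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: the server is not ready.
        return False


def wait_until_ready(probe, attempts=30, delay_seconds=1.0, sleep=time.sleep):
    """Call probe() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        # Skip the pause after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

Calling wait_until_ready(elasticsearch_is_up) after starting the elasticsearch service returns True once the server answers, or False if it never does within the configured attempts. The probe is passed in as a parameter so the retry logic can be exercised without a running server.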

Troubleshooting tips for end-to-end tests:

Formatting

Code in ui/ is formatted with Prettier, and husky automatically formats files on commit. To fix formatting manually, run npm run fix in ui/.

Python files are formatted with YAPF.