Cellxgene Gateway allows you to use the Cellxgene Server provided by the Chan Zuckerberg Initiative (https://github.com/chanzuckerberg/cellxgene) with multiple datasets. It displays an index of available h5ad (anndata) files. When a user clicks on a file name, it launches a Cellxgene Server instance that loads that particular data file and, once the instance is available, proxies requests to it.
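The launch-on-first-click behavior described above can be pictured as a small registry that starts one backend per dataset and reuses it afterwards. This is a minimal sketch, not the gateway's actual implementation: BackendRegistry, the launch callable, and the port numbering are all hypothetical, and the launcher is injected so the example runs without cellxgene installed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class BackendRegistry:
    """Hypothetical registry: one backend per dataset, started on first use."""

    launch: Callable[[str, int], None]  # assumed launcher: (dataset_path, port)
    first_port: int = 8000              # assumed starting port, not from the gateway
    ports: Dict[str, int] = field(default_factory=dict)

    def port_for(self, dataset: str) -> int:
        """Return the backend port for a dataset, launching it if needed."""
        if dataset not in self.ports:
            port = self.first_port + len(self.ports)
            self.launch(dataset, port)
            self.ports[dataset] = port
        return self.ports[dataset]


# A stand-in launcher that only records what would have been started.
launched = []
registry = BackendRegistry(launch=lambda path, port: launched.append((path, port)))

first = registry.port_for("pbmc3k.h5ad")  # first click: backend is launched
again = registry.port_for("pbmc3k.h5ad")  # later requests reuse the same backend
other = registry.port_for("other.h5ad")   # a second dataset gets its own backend
```

In a real deployment the launcher would start a cellxgene process (for example with subprocess.Popen), and the gateway would forward incoming requests to the port returned for that dataset.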
$ python --version
python -m venv .cellxgene-gateway
source .cellxgene-gateway/bin/activate # type `deactivate` to deactivate the venv
pip install git+https://github.com/Novartis/cellxgene-gateway
Note: you may need to downgrade h5py with pip install h5py==2.9.0 due to an issue in a dependency.
pip install cellxgene-gateway
mkdir ../cellxgene_data
wget https://raw.githubusercontent.com/chanzuckerberg/cellxgene/master/example-dataset/pbmc3k.h5ad -O ../cellxgene_data/pbmc3k.h5ad
export CELLXGENE_DATA=../cellxgene_data # change this directory if you put data in a different place.
export CELLXGENE_LOCATION=`which cellxgene`
cellxgene-gateway
Here's what the environment variables mean:
CELLXGENE_LOCATION
- the location of the cellxgene executable, e.g. ~/anaconda2/envs/cellxgene/bin/cellxgene
At least one of the following is required:
CELLXGENE_DATA
- a directory that can contain subdirectories with .h5ad data files, without trailing slash, e.g. /mnt/cellxgene_data
CELLXGENE_BUCKET
- an s3 bucket that can contain keys with .h5ad data files, e.g. my-cellxgene-data-bucket
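The "at least one is required" rule can be checked before starting the gateway. The following is a minimal sketch under stated assumptions: configured_sources is a hypothetical helper, not part of cellxgene-gateway, and stripping a trailing slash is a convenience added here rather than documented gateway behavior.

```python
import os
from typing import Mapping, Optional


def configured_sources(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the data sources named in the environment.

    Hypothetical helper: raises ValueError when neither variable is set.
    """
    env = os.environ if env is None else env
    sources = {}

    data_dir = env.get("CELLXGENE_DATA")
    if data_dir:
        # The directory is expected without a trailing slash, so this
        # sketch normalizes it (an assumption, not gateway behavior).
        sources["directory"] = data_dir.rstrip("/") or "/"

    bucket = env.get("CELLXGENE_BUCKET")
    if bucket:
        sources["bucket"] = bucket

    if not sources:
        raise ValueError("Set CELLXGENE_DATA and/or CELLXGENE_BUCKET")
    return sources
```

Calling configured_sources() with no argument reads the real environment; passing a plain dict makes the check easy to try out in isolation.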
Cellxgene Gateway is designed to make it easy to add additional data sources. Please see the source code for gateway.py and the ItemSource interface in items/item_source.py.
Optional environment variables:
CELLXGENE_ARGS
- catch-all variable that can be used to pass additional command line args to cellxgene server
EXTERNAL_HOST
- the hostname and port from the perspective of the web browser, typically localhost:5005 if running locally. Defaults to "localhost:{GATEWAY_PORT}"
EXTERNAL_PROTOCOL
- typically http when running locally, can be https when deployed if the gateway is behind a load balancer or reverse proxy that performs https termination. Default value "http"
GATEWAY_IP
- IP address of the instance the gateway is running on, mostly used to display SSH instructions. Defaults to socket.gethostbyname(socket.gethostname())
GATEWAY_PORT
- local port that the gateway should bind to, defaults to 5005
GATEWAY_EXPIRE_SECONDS
- time in seconds that a cellxgene process will remain idle before being terminated. Defaults to 3600 (one hour)
GATEWAY_EXTRA_SCRIPTS
- JSON array of script paths, will be embedded into each page and forwarded with --scripts to cellxgene server
GATEWAY_ENABLE_ANNOTATIONS
- Set to true or to 1 to enable cellxgene annotations and gene sets.
GATEWAY_ENABLE_BACKED_MODE
- Set to true or to 1 to load AnnData in file-backed mode. This saves memory and speeds up launch time but may reduce overall performance.
GATEWAY_LOG_LEVEL
- default is INFO. Set to DEBUG to increase logging and to WARNING to decrease logging.
S3_ENABLE_LISTINGS_CACHE
- Set to true or to 1 to cache listings of S3 folders for performance. If the cache becomes stale, add the refresh=true query parameter (filecrawl.html?refresh=true) to refresh the cache.
If any of the following optional variables are set, ProxyFix will be used.
PROXY_FIX_FOR
- Number of upstream proxies setting X-Forwarded-For
PROXY_FIX_PROTO
- Number of upstream proxies setting X-Forwarded-Proto
PROXY_FIX_HOST
- Number of upstream proxies setting X-Forwarded-Host
PROXY_FIX_PORT
- Number of upstream proxies setting X-Forwarded-Port
PROXY_FIX_PREFIX
- Number of upstream proxies setting X-Forwarded-Prefix
The defaults should be fine if you set up a venv and cellxgene_data folder as above.
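To make the defaults above concrete, here is a minimal sketch of how the optional variables could be resolved. It is not the gateway's own configuration code: flag, optional_settings, and proxy_fix_kwargs are hypothetical helpers, the case-insensitive handling of "true" is an assumption, and the keyword names x_for, x_proto, x_host, x_port, and x_prefix assume that ProxyFix refers to Werkzeug's ProxyFix middleware.

```python
import json
import os
from typing import Mapping, Optional


def flag(env: Mapping[str, str], name: str) -> bool:
    """True when the variable is 'true' or '1' (case handling is an assumption)."""
    return env.get(name, "").strip().lower() in ("true", "1")


def optional_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve the optional variables using the defaults listed above."""
    env = os.environ if env is None else env
    port = int(env.get("GATEWAY_PORT", "5005"))
    return {
        "port": port,
        # EXTERNAL_HOST falls back to localhost plus the gateway port.
        "external_host": env.get("EXTERNAL_HOST", f"localhost:{port}"),
        "external_protocol": env.get("EXTERNAL_PROTOCOL", "http"),
        "expire_seconds": int(env.get("GATEWAY_EXPIRE_SECONDS", "3600")),
        # GATEWAY_EXTRA_SCRIPTS is a JSON array of script paths.
        "extra_scripts": json.loads(env.get("GATEWAY_EXTRA_SCRIPTS", "[]")),
        "enable_annotations": flag(env, "GATEWAY_ENABLE_ANNOTATIONS"),
        "enable_backed_mode": flag(env, "GATEWAY_ENABLE_BACKED_MODE"),
        "log_level": env.get("GATEWAY_LOG_LEVEL", "INFO"),
    }


def proxy_fix_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Map any PROXY_FIX_* variables that are set onto ProxyFix-style keywords."""
    env = os.environ if env is None else env
    names = {
        "PROXY_FIX_FOR": "x_for",
        "PROXY_FIX_PROTO": "x_proto",
        "PROXY_FIX_HOST": "x_host",
        "PROXY_FIX_PORT": "x_port",
        "PROXY_FIX_PREFIX": "x_prefix",
    }
    return {kw: int(env[var]) for var, kw in names.items() if var in env}
```

An empty result from proxy_fix_kwargs corresponds to the case where ProxyFix is not used. Otherwise the dictionary could be passed along the lines of ProxyFix(app.wsgi_app, **kwargs), where app is a Flask application.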
First, build the Docker image:
docker build -t cellxgene-gateway .
Then, cellxgene-gateway can be launched as follows:
docker run -it --rm \
-v <local_data_dir>:/cellxgene-data \
-p 5005:5005 \
cellxgene-gateway
Additional environment variables can be provided with the -e parameter:
docker run -it --rm \
-v ../cellxgene_data:/cellxgene-data \
-e GATEWAY_PORT=8080 \
-p 8080:8080 \
cellxgene-gateway
The current paradigm for customization is to modify files during a build or deployment phase.
Currently we use a bash script that copies the gateway to a "build" directory before modifying templates with sed and the like. There is probably a better way.
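The copy-then-modify approach described above can also be expressed in Python. This is a hedged sketch of the same idea, not the project's build script (which is bash and sed): build_customized_copy is a hypothetical helper, and the template path and title text in the demonstration are made up.

```python
import re
import shutil
import tempfile
from pathlib import Path


def build_customized_copy(source: Path, build: Path, template: str,
                          pattern: str, replacement: str) -> Path:
    """Copy source to build, then rewrite one template file in the copy."""
    if build.exists():
        shutil.rmtree(build)  # start from a clean build directory
    shutil.copytree(source, build)
    target = build / template
    text = target.read_text(encoding="utf-8")
    # Plays the role of a sed substitution on the copied template.
    target.write_text(re.sub(pattern, replacement, text), encoding="utf-8")
    return target


# Demonstration on a throwaway directory; file name and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "gateway"
    (src / "templates").mkdir(parents=True)
    page = src / "templates" / "index.html"
    page.write_text("<title>Cellxgene Gateway</title>", encoding="utf-8")

    out = build_customized_copy(src, Path(tmp) / "build", "templates/index.html",
                                r"Cellxgene Gateway", "My Lab Gateway")
    customized = out.read_text(encoding="utf-8")
    original = page.read_text(encoding="utf-8")
```

Because only the copy is modified, the checked-out source stays untouched and the customization can be re-run on every build.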
We’re actively developing. Please see the "future work" section of the wiki. If you’re interested in being a contributor please reach out to @alokito.
If you want to develop the code, you will need to clone the repo. Make sure you have the prerequisites listed above, then:
git clone https://github.com/Novartis/cellxgene-gateway.git
cd cellxgene-gateway
pip install -r requirements.txt
python setup.py develop
For convenience, the code repo includes a run.sh.example shell script to run the gateway.
conda install -c conda-forge pre-commit
pre-commit install
python -m unittest discover tests
coverage run -m unittest discover tests
coverage html
pip install isort flake8 black
isort -rc . # rc means recursive, and was deprecated in dev version of isort
black .
If you need help for any reason, please open a GitHub ticket. One of the contributors should help you out.
Make sure your .pypirc is set up for testpypi and pypi index servers.
rm -rf dist
python setup.py sdist bdist_wheel
python -m twine upload --repository testpypi dist/*
python -m twine upload dist/*