oicr-gsi / cbioportal_tools

Tools for data import and administration of the GSI cBioPortal instance
GNU General Public License v3.0

Janus

Overview

Janus provides a gateway to a cBioPortal instance. It is named for the Roman god of doorways and transitions.

Janus takes as input the pipeline data and metadata generated at the Ontario Institute for Cancer Research (OICR). Its core (and, for now, only) function is to prepare a study directory for upload, in the format specified in the cBioPortal documentation.

This document provides a general introduction to Janus. Further documentation is in the doc subdirectory. Janus has a changelog in CHANGELOG.md.

Prerequisites

Janus is primarily implemented in Python, with a few ancillary scripts in R and some third-party tools. Requirements:

Installation and Testing

Janus has a setup.py script and can be installed using pip:

pip install $JANUS_SOURCE_DIR

Alternatively, to install to a specific directory:

pip install --prefix $INSTALL_ROOT $JANUS_SOURCE_DIR

To run tests from the source directory, assuming all prerequisites are installed:

export PYTHONPATH=${JANUS_SOURCE_DIR}/src/lib:$PYTHONPATH
${JANUS_SOURCE_DIR}/src/test/test.py

Example study input appears in the study_input subdirectory.

Usage

Script

The main script is janus.py, which is copied to the bin subdirectory of the installation. Run janus.py --help for usage information.

Config files and schemas

Study generation requires a master config file and a number of subsidiary config files. The config file format is CSVY, documented in config_format.md.
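As a rough illustration of the general CSVY idea (a YAML header between `---` lines, followed by a delimited table), the sketch below separates the two parts using only the Python standard library. It is not the Janus parser: the function name `split_csvy`, the default delimiter, and the keys in the example are assumptions made for illustration. The authoritative description of the format is in config_format.md.

```python
import csv
import io
from typing import Dict, List, Tuple


def split_csvy(text: str, delimiter: str = ",") -> Tuple[str, List[Dict[str, str]]]:
    """Split CSVY text into (raw YAML header, list of body rows).

    Hypothetical helper, not part of Janus. The header is returned as
    unparsed text; a YAML library would be needed to interpret it.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("CSVY text must start with a '---' line")
    # Find the line that closes the YAML header.
    end = None
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            end = index
            break
    if end is None:
        raise ValueError("YAML header is not terminated by a '---' line")
    header = "\n".join(lines[1:end])
    body = "\n".join(lines[end + 1:])
    # The first body line is treated as the column header row.
    rows = list(csv.DictReader(io.StringIO(body), delimiter=delimiter))
    return header, rows


# Example with made-up keys and columns (assumptions, not Janus config fields).
example = "---\nstudy_id: example\n---\nsample,patient\nS1,P1\nS2,P2\n"
header, rows = split_csvy(example)
```

Returning the header as raw text keeps the sketch free of third-party dependencies; a real reader would pass it to a YAML parser and validate the result against a schema.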

Config file structure may be specified using a schema, documented in schema.md.

Example config files for various pipelines are in study_input.

Data files

The config files may specify additional data and metadata files specific to a given analysis pipeline. Examples appear in the test data.

Environment variables

The human genome reference for the MUTATION_EXTENDED pipeline is taken from the first of the following sources that is set, in descending order of priority:

  1. ref_fasta parameter in the pipeline config file (if any)
  2. HG38_ROOT environment variable, as set by the hg38 module
  3. HG19_ROOT environment variable, as set by the hg19 module
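The priority order above can be sketched as a simple fallback chain. This is a minimal illustration, not the Janus implementation: the function name `resolve_reference` and the dictionary representation of the pipeline config are assumptions. The sketch returns the raw value of whichever setting wins and does not guess how a FASTA path is derived from `HG38_ROOT` or `HG19_ROOT`.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_reference(
    pipeline_config: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return (source name, value) for the highest-priority reference setting.

    Hypothetical helper, not part of Janus.
    """
    if environ is None:
        environ = os.environ
    # 1. An explicit ref_fasta parameter in the pipeline config wins.
    ref_fasta = pipeline_config.get("ref_fasta")
    if ref_fasta:
        return ("ref_fasta", ref_fasta)
    # 2. and 3. Fall back to the environment variables, hg38 before hg19.
    for name in ("HG38_ROOT", "HG19_ROOT"):
        value = environ.get(name)
        if value:
            return (name, value)
    raise RuntimeError(
        "No human genome reference found: set ref_fasta, HG38_ROOT or HG19_ROOT"
    )
```

For example, with no `ref_fasta` in the config and both environment variables set, the function returns the `HG38_ROOT` entry.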

Code Structure

The prototype version of Janus in release 0.0.1 required modification to be ready for production. Deprecated code retained from the prototype is referred to as "legacy".

Python modules in src/lib:

Release Procedure

Potential Extension

Janus may later be extended to:

Credits

Janus prototype developed by OICR co-op students Kunal Chandan and Allan Liang.

Subsequent development by Iain Bancarz.

Copyright and License

Copyright (C) 2019, 2020 by Genome Sequence Informatics, Ontario Institute for Cancer Research.

Licensed under the GNU General Public License v3.0.