mrsa-sequences

Arvados sequence uploader and analyzer for the MRSA project

Installation

To get started, install the uploader and then run the main.py script in the uploader directory.

Download. You can download the uploader by cloning the GitHub repository with the following command:

git clone https://github.com/bio-ontology-research-group/mrsa-sequences.git

Prepare your system. You need Python 3 and the ability to install modules such as pycurl and pyopenssl. On Ubuntu 18.04, you can run:

sudo apt update
sudo apt install -y virtualenv git libcurl4-openssl-dev build-essential python3-dev libssl-dev libxml2 libxslt1-dev

Create and enter your virtualenv. Go to the downloaded uploader directory, then create and enter a virtualenv:

virtualenv --python python3 venv
. venv/bin/activate

Note that you will need to repeat the . venv/bin/activate step from this directory to enter your virtualenv whenever you want to use the installed tool.

Install the dependencies. Once the virtualenv is set up, install the dependencies:

pip install -r requirements.txt
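
As an optional check after installation, the following minimal sketch reports whether the modules mentioned above can be imported from the active virtualenv. It assumes pycurl and pyopenssl are among the installed dependencies; `missing_modules` is a hypothetical helper and not part of the uploader.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # pycurl is imported as "pycurl"; pyopenssl is imported as "OpenSSL".
    missing = missing_modules(["pycurl", "OpenSSL"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("pycurl and pyopenssl are importable.")
```

If a module is reported missing, make sure the virtualenv is active and rerun the pip command above.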

Test the tool. Try running:

python uploader/main.py --help

Set the Arvados API token. Before uploading sequence files, set the environment variable ARVADOS_API_TOKEN to your Arvados API token. It will look something like the following:
```
export ARVADOS_API_TOKEN=2jv9346o396exampledonotuseexampledonotuseexes7j1ld
```

You can find the token under the "Current token" link in your user profile menu on the Arvados web portal.
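
To confirm that the variable is visible to Python before starting an upload, a small check along these lines may help. This is only a sketch: `get_arvados_token` is a hypothetical helper, and the uploader may read the token differently.

```python
import os


def get_arvados_token(env=None):
    """Return ARVADOS_API_TOKEN from `env` (defaults to os.environ), or raise."""
    env = os.environ if env is None else env
    token = env.get("ARVADOS_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "ARVADOS_API_TOKEN is not set; export it before uploading."
        )
    return token
```

Calling `get_arvados_token()` in the shell where you exported the variable should return the token; a `RuntimeError` means the export did not reach the current session.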

Usage

Run the uploader with gzipped FASTA or FASTQ read files and an accompanying metadata file in YAML:

python uploader/main.py reads1.fastq.gz reads2.fastq.gz metadata.yaml
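
If you want to start the upload from a script, the command above could be assembled roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and the suffix checks (`.gz` for reads, `.yaml` or `.yml` for metadata) are illustrative guesses, not rules enforced by the uploader.

```python
import subprocess
import sys


def build_upload_command(reads, metadata, python=sys.executable):
    """Build the argument list for uploader/main.py after basic sanity checks."""
    if not reads:
        raise ValueError("at least one reads file is required")
    for path in reads:
        # Assumption: read files are gzipped, as in the example above.
        if not path.endswith(".gz"):
            raise ValueError(f"reads file is not gzipped: {path}")
    # Assumption: the metadata file uses a .yaml or .yml extension.
    if not metadata.endswith((".yaml", ".yml")):
        raise ValueError(f"metadata file is not YAML: {metadata}")
    return [python, "uploader/main.py", *reads, metadata]


def run_upload(reads, metadata):
    """Run the uploader from the repository root; raises if it exits non-zero."""
    subprocess.run(build_upload_command(reads, metadata), check=True)
```

For example, `run_upload(["reads1.fastq.gz", "reads2.fastq.gz"], "metadata.yaml")` would run the same command as shown above, using the Python interpreter of the active virtualenv.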

You can find example files on the MRSA web uploader. Here are the links to the example files:

Once the sequence is uploaded, you can see the status of the job in the state.json file.
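
The file can be inspected with any JSON tool. As a minimal sketch, assuming state.json is written to the current working directory and contains valid JSON (its exact fields are not described here, so none are assumed), a hypothetical `load_state` helper could look like this:

```python
import json
from pathlib import Path


def load_state(path="state.json"):
    """Return the parsed content of the state file, or None if it does not exist yet."""
    state_file = Path(path)
    if not state_file.exists():
        return None
    with state_file.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Printing `json.dumps(load_state(), indent=2)` shows the stored state in a readable form.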
