KennthShang / CHERRY_crispr_DATABASE

Database version of CHERRY CRISPR

CHERRY

CHERRY-crispr DATABASE version

This program is an extension of CHERRY that uses the CRISPR information in CHERRY's database for host prediction.

The main program is available locally via PhaBOX and online via the web server.


🚀  Installation

If you have already installed PhaBOX, you can skip this part and use the PhaBOX environment directly.

We suggest installing all the packages with conda (either Miniconda or Anaconda works) using the commands below:

conda create --name cherry_crispr_db python=3.8
conda activate cherry_crispr_db
conda install pandas numpy biopython
conda install blast -c bioconda

🚀  Quick Start

Remember to activate your conda environment first (conda activate cherry_crispr_db).

git clone https://github.com/KennthShang/CHERRY_crispr_DATABASE.git

python CHERRY_crispr_DATABASE/Cherry_crispr_db.py --infile nucl.fna --outfolder test_out/ --datasetpth CHERRY_crispr_DATABASE/dataset --ident 90 --coverage 0.9

⌛️  Usage

  --infile 
                        input fasta file
  --outfolder 
                        path to the output folder
  --datasetpth 
                        path to the CHERRY_crispr_DATABASE/dataset/ folder
  --threads 
                        Number of threads to run the program (default 8)
  --ident
                        Identity threshold for the alignments (default 90)
  --coverage
                        Coverage threshold for the alignments (default 0.9)

The program returns only the alignments that meet both the --ident and --coverage thresholds.

📈  Output format

Input (provided by the user):
    1. Phage contigs from the user's samples (FASTA file)

Output:
    1. The host range of the given phages (CSV files)
    2. RAW BLASTN alignment results (NCBI blast+)
       [In the program, --ident refers to pident and --coverage refers to length/slen.]
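
The two thresholds can be illustrated with a minimal Python sketch that filters BLASTN tabular rows by pident and length/slen. This is not the code used by Cherry_crispr_db.py: the column layout (qseqid, sseqid, pident, length, slen), the function name filter_hits, and the use of >= at the boundary are all assumptions made for illustration.

```python
import csv
import io

# Assumed column order for this sketch only, e.g. from
# blastn -outfmt "6 qseqid sseqid pident length slen".
# The raw file written by the program may use a different layout.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "slen"]


def filter_hits(handle, ident=90.0, coverage=0.9):
    """Keep rows with pident >= ident and length/slen >= coverage."""
    kept = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        row = dict(zip(COLUMNS, fields))
        pident = float(row["pident"])
        cov = int(row["length"]) / int(row["slen"])
        # A hit must pass both thresholds to be kept.
        if pident >= ident and cov >= coverage:
            kept.append(row)
    return kept


# Three made-up alignments: only the first passes both thresholds.
example = io.StringIO(
    "contig_1\tspacer_A\t100.0\t30\t30\n"
    "contig_1\tspacer_B\t95.0\t20\t32\n"
    "contig_2\tspacer_C\t85.0\t30\t30\n"
)
hits = filter_hits(example)
print([h["sseqid"] for h in hits])
```

In this made-up example, spacer_B fails on coverage (20/32 is below 0.9) and spacer_C fails on identity (85 is below 90), so only spacer_A remains.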

You can find the full taxonomy entry of the bacteria accession using the file in dataset/prokaryote.csv
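
As a rough sketch of that lookup, the helper below scans a CSV file and returns every row in which some field equals the given accession. The column names of dataset/prokaryote.csv are not documented here, so the code deliberately avoids assuming any; the function name find_taxonomy is hypothetical.

```python
import csv


def find_taxonomy(csv_path, accession):
    """Return every row of the CSV in which some field equals `accession`."""
    matches = []
    with open(csv_path, newline="") as handle:
        # DictReader uses the header line as keys; no column names are assumed.
        for row in csv.DictReader(handle):
            if accession in row.values():
                matches.append(row)
    return matches
```

A call might look like find_taxonomy("CHERRY_crispr_DATABASE/dataset/prokaryote.csv", accession), where accession is a bacterial accession taken from the host prediction output.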

📫  Have a question?

We are happy to answer questions on the CHERRY issues page. If you have a private question or would like to collaborate with us, you can reach us directly by email: jiayushang@cuhk.edu.hk

✏️  Citation

If you use this program, please cite the following papers:

🤵  Team

Our groupmates also provide many useful tools for bioinformatics analysis. Please check Yanni's Group for further information. We hope you find them useful!