Circular visualizer for complete genomes
Input: GenBank files (.gb)
Usage:
python runAllProcess.py <output directory> <input directory (GenBank files)>
Required Python packages: pandas, numpy, biopython, tqdm
Add the blast+/bin and circos/bin directories to your PATH. You can check the setup with the commands below.
echo $PATH
blastn -h
circos -h
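The same check can be scripted from Python. The following is a minimal sketch, not part of this repository: the helper name `find_missing_tools` is hypothetical, and it only confirms that the executables are discoverable on PATH, not that they run correctly.

```python
import shutil


def find_missing_tools(tools=("blastn", "circos")):
    """Return the names in `tools` that cannot be found on PATH."""
    # shutil.which returns None when no matching executable is on PATH.
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("blastn and circos are both on PATH.")
```

If a tool is reported as missing, recheck the directories listed by `echo $PATH`.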
python runAllProcess.py <output directory> <input directory (GenBank files)>
The two required arguments are the output directory and the input directory containing the GenBank files.
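Before starting a run, it can help to confirm that the input directory really contains GenBank files. This is a rough sketch under the assumption that input files are recognized by the `.gb` extension, as stated at the top of this document. The helper `list_genbank_files` is hypothetical and is not provided by the repository.

```python
from pathlib import Path


def list_genbank_files(input_dir):
    """Return the sorted .gb files found directly inside input_dir."""
    # Only the top level of the directory is inspected (assumption:
    # GenBank files are not nested in subdirectories).
    return sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix == ".gb"
    )
```

For example, `len(list_genbank_files("test/gb"))` gives the number of genomes that would be passed to the pipeline.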
python runAfterBlastProcess.py <output directory> <input directory (GenBank files)>
The two required arguments are the output directory and the input directory containing the GenBank files.
module load singularity
python runOnlyBlast.py <output directory> <input directory (GenBank files)> <bin_singularity directory>
The three required arguments are the output directory, the input directory containing the GenBank files, and the bin_singularity directory.
After all qsub jobs have finished, run runAfterBlastProcess.py as explained above.
python runVisualize.py <output directory> <configuration file> <option; key word for output; default:"test"> <option; the minimum number of genes in each cluster; default: 1> <option; sorting column name; default: None>
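The five arguments are the output directory, the configuration file, and three optional values: a keyword for the output (default: "test"), the minimum number of genes in each cluster (default: 1), and a sorting column name (default: None).
The configuration file is written as "RingOrder_*_df.tsv" by runAllProcess.py and runAfterBlastProcess.py. Please see ./testResult/RingOrder_aligned_df.tsv and ./testResult/changed_setting.tsv for examples.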
AccNo | Genome_size | Strand | Angle | Deviation (Aligned) | Deviation (Original) | optional |
NC_000915.1 | 1667867 | 0 | 0 | 53.311 | 47.061 | ... |
NC_014256.1 | 1673997 | 1 | 342 | 53.07 | 177.67 | ... |
... | ... | ... | ... | ... | ... | ... |
You can edit the visualization result, such as the number of genomes and the ring order, by deleting / reordering rows in this file.
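Row edits like this can also be scripted. The sketch below assumes the configuration file is tab-separated with a header row containing an `AccNo` column, as the table above suggests. The actual file layout may differ (for example, it may carry an extra index column), so treat this as an illustration only. The function `drop_genomes` is a hypothetical helper, not part of the repository.

```python
import csv


def drop_genomes(src_path, dst_path, accessions_to_drop):
    """Copy a ring-order TSV, skipping rows whose AccNo is listed.

    Returns the number of rows kept. Row order is preserved.
    """
    drop = set(accessions_to_drop)
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        writer = csv.DictWriter(
            dst, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        kept = 0
        for row in reader:
            # Assumption: the accession number lives in a column named "AccNo".
            if row["AccNo"] not in drop:
                writer.writerow(row)
                kept += 1
    return kept
```

The resulting file can then be passed to runVisualize.py as the configuration file.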
python runCreateOrthologousTable.py <output directory> <input directory (GenBank files)>
The two required arguments are the output directory and the input directory containing the GenBank files.
It takes 10 minutes (BLASTP 8 min, other steps 2 min) on a standalone desktop server with 16 GB of memory.
cd
git clone git@github.com:tipputa/Circular-genome-visualizer.git
python ~/Circular-genome-visualizer/bin/runAllProcess.py ~/Circular-genome-visualizer/test/ ~/Circular-genome-visualizer/test/gb/
In this example, "changed_setting.tsv" is a modified configuration file in which the first row was deleted from /test/data/RingOrder_aligned_df.tsv.
python ~/Circular-genome-visualizer/bin/runVisualize.py ~/Circular-genome-visualizer/test/ ~/Circular-genome-visualizer/test/changed_setting.tsv "rm1genome" 4
"rm1genome" is the suffix for the output file (e.g. circos_rm1genome.png), and 4 is the minimum number of genes in each cluster, so only genes conserved in >= 4 genomes are visualized.