EBI-Metagenomics / emg-viral-pipeline

VIRify: detection of phages and eukaryotic viruses from metagenomic and metatranscriptomic assemblies
Apache License 2.0

VirSorter predicts prophage larger than contig size #6

Closed hoelzer closed 3 years ago

hoelzer commented 4 years ago

Test assembly:

kleiner_2015.fasta.gz

Observation

It seems that VirSorter predicts a prophage whose coordinates extend beyond the end of the contig. Example:

>NODE_51_length_63443_cov_50.479870

So contig NODE_51 (seq51 after renaming) has a length of 63443 nt.

Now VirSorter predicts a prophage for this contig from position 19922-63493:

(base) [mhoelzer@hh-yoda-11-01 ~]$ grep seq51 /hps/nobackup2/metagenomics/mhoelzer/nextflow-results/virify/v0.1/kleiner_2015/kleiner_2015/01-viruses/virsorter/Predicted_viral_sequences/VIRSorter_prophages_cat-4.fasta 
>VIRSorter_seq51_gene_20_gene_72-19922-63493-cat_4

So the predicted prophage's stop position (63493) is larger than the actual contig length (63443). I have the feeling this is a VirSorter problem, or maybe an intended feature. Maybe we should restrict the predicted prophage stop to the length of the contig.

@mberacochea will add some checks and print a message to let us know.

hoelzer commented 4 years ago

See VirSorter answer here: https://github.com/simroux/VirSorter/issues/68

So we can wait for a fix, or simply set the stop position to the actual contig length whenever the reported stop exceeds it.
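The workaround could be sketched roughly as follows. This is only an illustration, not the check that ended up in the pipeline: the function name `clamp_prophage`, the `contig_lengths` dictionary, and the header pattern are assumptions, and the pattern is inferred from the single header shown in this issue.

```python
import re
import warnings

# Assumed header layout, inferred only from the one example in this issue:
#   VIRSorter_<contig>_gene_<n>_gene_<m>-<start>-<stop>-cat_<k>
PROPHAGE_HEADER = re.compile(
    r"^>?VIRSorter_(?P<contig>.+?)_gene_\d+_gene_\d+"
    r"-(?P<start>\d+)-(?P<stop>\d+)-cat_\d+$"
)


def clamp_prophage(header, contig_lengths):
    """Return (contig, start, stop), with stop capped at the contig length.

    `contig_lengths` is a hypothetical dict mapping contig name -> length in nt.
    """
    match = PROPHAGE_HEADER.match(header.strip())
    if match is None:
        raise ValueError(f"unrecognised prophage header: {header!r}")

    contig = match["contig"]
    start = int(match["start"])
    stop = int(match["stop"])
    length = contig_lengths[contig]

    if stop > length:
        # Report the mismatch so it does not go unnoticed, then cap the stop.
        warnings.warn(
            f"{contig}: prophage stop {stop} exceeds contig length {length}; "
            f"using {length} instead"
        )
        stop = length

    return contig, start, stop


# The case from this issue: seq51 is 63443 nt long, reported stop is 63493.
contig_lengths = {"seq51": 63443}
header = ">VIRSorter_seq51_gene_20_gene_72-19922-63493-cat_4"
print(clamp_prophage(header, contig_lengths))
```

Coordinates that already fall inside the contig pass through unchanged, so the check is safe to apply to every predicted prophage.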

hoelzer commented 4 years ago

The problem is already fixed in the latest GitHub version of VirSorter: https://github.com/simroux/VirSorter/issues/68

mberacochea commented 3 years ago

Pending update of container with latest release of virsorter (https://github.com/simroux/VirSorter/releases/tag/v1.0.6)

hoelzer commented 3 years ago

It may help to build a Docker image from the 1.0.6 VirSorter branch: https://github.com/replikation/What_the_Phage/blob/master/phage-tool-Dockerfiles/virsorter/Dockerfile

Or this should also work:

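FROM continuumio/miniconda3
ENV VERSION=1.0.6
ENV TOOL=virsorter

# Install system utilities, then clean the apt caches to keep the image small
RUN apt-get update && apt-get install -y procps wget gzip pigz bc && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Register the conda channels; the built-in channel is named "defaults"
RUN conda config --add channels conda-forge && \
        conda config --add channels bioconda && \
        conda config --add channels defaults

# -y skips the confirmation prompt, which cannot be answered during a build
RUN conda install -y $TOOL=$VERSION && conda clean -a -y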

based on the bioconda recipe.

Alternatively, this container also works for me on the dev branch :)

docker pull mhoelzer/virsorter:1.0.6

mberacochea commented 3 years ago

Thanks! I got it to work now using 1.0.6.

Closing this ticket :+1: