EBI-Metagenomics / emg-viral-pipeline

VIRify: detection of phages and eukaryotic viruses from metagenomic and metatranscriptomic assemblies
Apache License 2.0

Different results between older and current version #60

Closed hoelzer closed 3 years ago

hoelzer commented 3 years ago

I just ran the Nextflow pipeline again using v0.3.0 on the ~/.nextflow/assets/EBI-Metagenomics/emg-viral-pipeline/nextflow/test/kleiner_virome_2015.fasta example data set.

Now the pipeline only predicts a few low-confidence contigs:

contig_ID       genus   subfamily       family  order
NODE_20_length_41715_cov_14831.579165           Sepvirinae      Podoviridae     Caudovirales
NODE_22_length_38841_cov_7038.120250    Teseptimavirus  Studiervirinae  Autographiviridae       Caudovirales
NODE_23_length_37379_cov_9295.445022    Teseptimavirus  Studiervirinae  Autographiviridae       Caudovirales
NODE_193_length_1739_cov_4.978622
NODE_66_length_5441_cov_4793.546417

whereas before, with v0.2.0, we detected them as high-confidence contigs and, in addition, some putative prophages. Because the prediction of high-confidence contigs and prophages depends heavily on VirSorter, I guess that something changed there between the versions.

hoelzer commented 3 years ago

Ah yeah, the folder

results/kleiner_virome_2015/01-viruses/virsorter/Predicted_viral_sequences/

is empty. Checking the Nextflow working dir, I see issues in the VirSorter process like

...
### Revision 0
Started at Mon Oct 18 16:11:01 2021
Out :
cp: cannot stat 'virsorter-data/Phage_gene_catalog_plus_viromes/*': No such file or directory
There are no clusters in the database, so skip the hmmsearch
...

Step 5 : /opt/conda/bin/Scripts/Step_5_get_phage_fasta-gb.pl VIRSorter virsorter >> virsorter/logs/out 2>> virsorter/logs/err

## Verify if this should have been a virome decontamination mode based on 10kb+ contigs
Cleaning the output directory
rm -r virsorter/r_0/db :
cat: virsorter-data/VirSorter_Readme.txt: No such file or directory

hoelzer commented 3 years ago

Aha, my VirSorter input database is empty. Sorry, this might just be my fault. I will check.
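
A quick way to spot this kind of problem is to scan the VirSorter log for missing-file messages. The following is a minimal sketch, assuming the log contains coreutils-style "No such file or directory" lines like the two quoted above; the function name `find_missing_paths` and the regular expression are illustrative and not part of VIRify or VirSorter.

```python
import re

# Matches messages such as
#   cp: cannot stat 'some/path/*': No such file or directory
#   cat: some/file.txt: No such file or directory
# (assumption: other tools may word this differently and would not match).
MISSING = re.compile(
    r"^(?P<tool>\w+): (?:cannot stat )?'?(?P<path>[^':]+)'?: "
    r"No such file or directory$"
)


def find_missing_paths(log_text):
    """Return (tool, path) pairs for every missing-file message in the log."""
    hits = []
    for line in log_text.splitlines():
        match = MISSING.match(line.strip())
        if match:
            hits.append((match.group("tool"), match.group("path")))
    return hits
```

Run on the log excerpt above, this would report the `Phage_gene_catalog_plus_viromes/*` glob and `VirSorter_Readme.txt`, both under `virsorter-data`, which points at the database directory rather than at the pipeline version.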

hoelzer commented 3 years ago

My fault!

For some reason, I did not download the complete VirSorter database. This did not break VIRify, but the VirSorter annotations were then simply missing from the set of contigs used for downstream analyses. So I'll call this a feature, not a bug :)
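
Since an incomplete database does not stop the run, a small pre-flight check can catch it before the pipeline starts. The sketch below is an assumption-laden example, not something VIRify ships: it only tests for the two paths the log above complained about (`Phage_gene_catalog_plus_viromes/` and `VirSorter_Readme.txt`), and a complete VirSorter database contains more than that. The function name `check_virsorter_db` is hypothetical.

```python
from pathlib import Path


def check_virsorter_db(db_dir):
    """Return a list of problems found in a VirSorter database directory.

    An empty list means the two paths named in the log above are present.
    This is NOT a full integrity check of the database (assumption: the
    complete file list is not known here).
    """
    db = Path(db_dir)
    if not db.is_dir():
        return [f"{db} is not a directory"]

    problems = []

    # The 'cp: cannot stat .../Phage_gene_catalog_plus_viromes/*' message
    # appears when this folder is missing or has no entries.
    catalog = db / "Phage_gene_catalog_plus_viromes"
    if not catalog.is_dir() or not any(catalog.iterdir()):
        problems.append(f"{catalog} is missing or empty")

    # The 'cat: .../VirSorter_Readme.txt' message appears when this file is missing.
    readme = db / "VirSorter_Readme.txt"
    if not readme.is_file():
        problems.append(f"{readme} is missing")

    return problems
```

Calling `check_virsorter_db("virsorter-data")` before launching the run and aborting on a non-empty result would turn the silent loss of VirSorter predictions into an explicit error.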