EBI-Metagenomics / emg-viral-pipeline

VIRify: detection of phages and eukaryotic viruses from metagenomic and metatranscriptomic assemblies
Apache License 2.0

VirFinder w/ a more specific model for eukaryotic viruses #4

Closed hoelzer closed 4 years ago

hoelzer commented 4 years ago

Currently, we are using the default model for VirFinder predictions.

However, we are particularly interested in also predicting eukaryotic viruses (not only phages) with VirFinder. I tested the prediction with a more specific model and implemented this in the Nextflow version of the pipeline: https://github.com/hoelzer/virify/issues/21

Basically, the model needs to be downloaded (or deposited somewhere):

wget https://github.com/jessieren/VirFinder/raw/master/EPV/VF.modEPV_k8.rda

and then I am using a simplified version of a script from Guillermo:

run_virfinder_modEPV.Rscript VF.modEPV_k8.rda ${fasta} .
awk '{print $1"\t"$2"\t"$3"\t"$4}' ${name}*.txt > ${name}.txt

The script can be found here: https://github.com/hoelzer/virify/tree/master/bin

I only added the awk filter because the resulting txt file has more columns than the next parse step of the pipeline currently expects, and trimming them avoids any problems there.
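To illustrate what the awk filter does, here is a minimal sketch that runs the same command on a mock table. The sample name, the file name `sample_modEPV.txt`, the helper `trim_to_four_columns`, and the two extra columns are all hypothetical and only stand in for whatever the modEPV script actually writes.

```shell
#!/bin/sh
set -eu

# Hypothetical helper wrapping the awk filter from the issue:
# keep the first four whitespace-separated fields, joined by tabs.
trim_to_four_columns() {
    awk '{print $1"\t"$2"\t"$3"\t"$4}' "$@"
}

# Hypothetical sample name and scratch directory, for illustration only.
name="sample"
workdir="$(mktemp -d)"
cd "$workdir"

# Mock prediction table. "extra1" and "extra2" are invented placeholders
# for the additional columns mentioned above, not real column names.
printf 'name\tlength\tscore\tpvalue\textra1\textra2\n' > "${name}_modEPV.txt"
printf 'contig_1\t5000\t0.97\t0.001\tfoo\tbar\n' >> "${name}_modEPV.txt"
printf 'contig_2\t1200\t0.12\t0.640\tbaz\tqux\n' >> "${name}_modEPV.txt"

# Same pattern as in the issue: glob the script output, write the trimmed table.
trim_to_four_columns "${name}"*.txt > "${name}.txt"

cat "${name}.txt"
```

Note that awk splits on any whitespace by default, so a contig name containing spaces would shift the columns. If the table is tab-delimited (an assumption, not checked here), adding `-F'\t'` to the awk call would avoid that. Also, if `${name}.txt` is left over from an earlier run in the same directory, the `${name}*.txt` glob would pick it up as input as well.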

I think what needs to be done is:

mberacochea commented 4 years ago

Solved in https://github.com/EBI-Metagenomics/emg-viral-pipeline/commit/c425112d3c99b2c1c80b3b49db08c32aa1f0b43f