simroux / VirSorter

Source code of the VirSorter tool, also available as an App on CyVerse/iVirus (https://de.iplantcollaborative.org/de/)
GNU General Public License v2.0

Running wrapper script with a custom database #11

Open aelbehery opened 7 years ago

aelbehery commented 7 years ago

Hello Simon,

I can run the VirSorter wrapper script normally with the default databases, but when I try to use a custom database I run into an error; please see the attached log files.

err.txt out.txt

The error seems to be in the translation step of the custom phage database, specifically in the translate_6frames utility of BioPerl, but I am not sure how to fix it. Do I need a different version of BioPerl? The version currently installed is 1.007001.

Another related question: how can I run VirSorter using a custom phage protein database instead of phage contigs?

Your help is really appreciated!

Best,

Ali

simroux commented 7 years ago

Hi Ali, it seems like there is an error with the input fasta file (VirSorter complains of an empty sequence). Did you check the input file for any special characters, empty sequences, etc.?
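One way to run this kind of check is sketched below. It is a minimal example, not something VirSorter ships: the function names `read_fasta` and `check_fasta`, the file name in the usage comment, and the choice to accept only IUPAC nucleotide codes are all assumptions made for illustration.

```python
# Hypothetical helper, not part of VirSorter: flag FASTA records that are
# empty, have duplicate identifiers, or contain non-nucleotide characters.

# IUPAC nucleotide codes (assumption: the custom database holds nucleotide contigs).
ALLOWED = set("ACGTUNRYKMSWBDHVacgtunrykmswbdhv")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    Lines that appear before the first '>' header are ignored.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def check_fasta(lines):
    """Return a list of (identifier, problem) tuples for suspicious records."""
    problems = []
    seen = set()
    for header, seq in read_fasta(lines):
        fields = header.split()
        name = fields[0] if fields else ""
        if not name:
            problems.append((name, "empty header"))
        elif name in seen:
            problems.append((name, "duplicate identifier"))
        seen.add(name)
        if not seq:
            problems.append((name, "empty sequence"))
        bad = sorted(set(seq) - ALLOWED)
        if bad:
            problems.append((name, "unexpected characters: " + "".join(bad)))
    return problems


# Usage (the file name is a placeholder):
#     with open("custom_phage.fasta") as fh:
#         for name, problem in check_fasta(fh):
#             print(name, problem, sep="\t")
```

An empty result means none of these particular problems were found; it does not rule out other causes such as unusual line endings or encoding issues.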

For the custom phage database, the option in the script only adds additional phages to the existing database. If you want to replace the whole database, you will have to generate the different files needed by VirSorter (you can see them here: http://datacommons.cyverse.org/browse/iplant/home/shared/imicrobe/VirSorter/Data_package/Phage_gene_catalog), which are the hmm profiles, the unclustered proteins, and a tsv file including the annotation for all profiles and unclustered proteins.

aelbehery commented 7 years ago

Hi Simon, thanks for your reply. I checked the input fasta file and it looks fine. If I understand your reply correctly, the script only supports adding a single phage sequence, i.e. it doesn't support a multi-fasta file, right?

simroux commented 7 years ago

VirSorter should be able to take a multi-fasta file as input for the custom phage database. What I wanted to stress is that this custom phage database will be added to the "regular" one, not replace it. If your fasta file looks ok, then it might be worth trying to first add only the first sequence from this fasta file as a custom phage database, to see if the error is with the file itself, or if VirSorter "breaks" on a specific sequence in the file.
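Pulling out the first sequence can be done by hand or with a short script. The sketch below is one possible way to do it; the function name `first_record` and both file names are placeholders, not anything defined by VirSorter.

```python
# Hypothetical helper, not part of VirSorter: keep only the first record
# of a multi-fasta file so it can be tested on its own.

def first_record(lines):
    """Return the lines of the first FASTA record (header plus sequence lines)."""
    record = []
    for line in lines:
        if line.startswith(">"):
            if record:
                break  # second header reached, the first record is complete
            record.append(line)
        elif record:
            record.append(line)
    return record


# Usage (both file names are placeholders):
#     with open("custom_phage.fasta") as src, open("first_seq.fasta", "w") as dst:
#         dst.writelines(first_record(src))
```

If the single-sequence file triggers the same error, the problem is more likely in the file format or the environment than in one particular sequence.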

aelbehery commented 7 years ago

Thanks! Actually, this is exactly what I need to do: I need to expand the default database, not replace it. I tried the first sequence alone as a custom phage database, as suggested, but the script threw the same error. Any idea what else could be wrong?