simroux / VirSorter

Source code of the VirSorter tool, also available as an App on CyVerse/iVirus (https://de.iplantcollaborative.org/de/)
GNU General Public License v2.0

Custom Phage Sequence #49

Open ZarulHanifah opened 4 years ago

ZarulHanifah commented 4 years ago

Hello VirSorter developers,

I ran VirSorter on the sample data successfully. However, the run fails when I add custom phage sequences with --cp. The command was:

wrapper_phage_contigs_sorter_iPlant.pl -f top100.fasta  \
--cp FINAL_Gut_Viral_Database_GVD_1.7.2018.fna \
--wdir top100.2-out \
--ncpu 6 \
--data-dir virsorter-data

Here is a part of the stdout which I think might be relevant to this issue:

Started at Mon Sep  9 17:01:55 2019
Step 0.9 : /home/zarul/Zarul/Software/anaconda3/envs/vcontact2/bin/hmmsearch --tblout top100.2-out/Contigs_prots_vs_PFAMb.tab --cpu 6 -o top100.2-out/Contigs_prots_vs_PFAMb.out --noali virsorter-data/PFAM_27/Pfam-B.hmm top100.2-out/fasta/VIRSorter_prots.fasta >> top100.2-out/logs/out 2>> top100.2-out/logs/err

### Revision 0
Started at Mon Sep  9 17:07:52 2019
Out : 
Adding custom phage to the database : 
/home/zarul/Zarul/Software/anaconda3/envs/vcontact2/bin/Scripts/Step_first_add_custom_phage_sequence.pl FINAL_Gut_Viral_Database_GVD_1.7.2018.fna virsorter-data/Phage_gene_catalog/ top100.2-out/r_0/db 6 >> top100.2-out/logs/out 2>> top100.2-out/logs/err

There are no clusters in the database, so skip the hmmsearch

Started at Mon Sep  9 17:09:47 2019

Step 1.3 : /home/zarul/Zarul/Software/anaconda3/envs/vcontact2/bin/blastp -query top100.2-out/fasta/VIRSorter_prots.fasta -db top100.2-out/r_0/db/Pool_new_unclustered -out top100.2-out/r_0/Contigs_prots_vs_New_unclustered.tab -num_threads 6 -outfmt 6 -evalue 0.001 >> top100.2-out/logs/out 2>> top100.2-out/logs/err

Here is the err log file:


------------- EXCEPTION -------------
MSG: object of class  does not implement Bio::AnnotationCollectionI. Too bad.
STACK Bio::Seq::annotation /home/zarul/Zarul/Software/anaconda3/envs/vcontact2/lib/site_perl/5.26.2/x86_64-linux-thread-multi/Bio/Seq.pm:945
STACK Bio::Seq::new /home/zarul/Zarul/Software/anaconda3/envs/vcontact2/lib/site_perl/5.26.2/x86_64-linux-thread-multi/Bio/Seq.pm:516
STACK toplevel /home/zarul/Zarul/Software/anaconda3/envs/vcontact2/bin/Scripts/Step_first_add_custom_phage_sequence.pl:163
-------------------------------------

BLAST Database error: No alias or index file found for protein database [top100.2-out/r_0/db/Pool_new_unclustered] in search path [/home/zarul/Zarul/Matthew::]
Can't open 'top100.2-out/Contigs_prots_vs_Phage_Gene_Catalog.tab' for reading: 'No such file or directory' at /home/zarul/Zarul/Software/anaconda3/envs/vcontact2/bin/Scripts/Step_2_merge_contigs_annotation.pl line 103
Can't open 'top100.2-out/VIRSorter_affi-contigs.csv' for reading: 'No such file or directory' at /home/zarul/Zarul/Software/anaconda3/envs/vcontact2/bin/Scripts/Step_3_highlight_phage_signal.pl line 59
Can't open 'top100.2-out/VIRSorter_phage-signal.csv' for reading: 'No such file or directory' at /home/zarul/Zarul/Software/anaconda3/envs/vcontact2/bin/Scripts/Step_4_summarize_phage_signal.pl line 83

I followed the --cp usage described in an earlier issue (https://github.com/simroux/VirSorter/issues/43), so I believe the command is correct, but it still fails. Could you please help me with this?

Thank you.

simroux commented 4 years ago

Hi,

This is weird. It looks like a BioPerl error, but I have never seen it before. Could you try creating a new BioPerl sequence object in a separate script? It would look something like this:

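#!/usr/bin/env perl

use strict;
use Bio::Seq;
use Bio::SeqUtils;

# BioPerl named arguments need a leading dash (-id, not id).
my $seq_bio = Bio::Seq->new(-id => "dummyid", -seq => "ATCGATCGATCGATCGATCGATCGATCGATCGATCGATCG", -alphabet => 'dna');
# Translate the sequence in all six reading frames.
my @seqs = Bio::SeqUtils->translate_6frames($seq_bio);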
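
A complementary check, offered as an assumption rather than a diagnosis: if the standalone script runs cleanly, it may be worth summarizing the custom FASTA file itself before passing it to --cp. The sketch below uses a hypothetical helper, `fasta_summary`, which is not part of VirSorter. It counts records, repeated sequence IDs, and sequence lines containing characters other than A, C, G, T, or N. Nothing in the error log confirms that any of these are the cause.

```shell
# Hypothetical helper (not part of VirSorter): summarize a FASTA file.
# Usage: fasta_summary FINAL_Gut_Viral_Database_GVD_1.7.2018.fna
fasta_summary() {
    awk '
        # Header line: count the record and track its ID (text up to the first space).
        /^>/ { n++; id = substr($1, 2); if (seen[id]++) dup++; next }
        # Sequence line containing anything other than A, C, G, T, N.
        /[^ACGTNacgtn]/ { bad++ }
        END { printf "records=%d duplicate_ids=%d non_ACGTN_lines=%d\n", n, dup, bad }
    ' "$1"
}
```

Note that valid IUPAC ambiguity codes and Windows line endings are also counted as non-ACGTN lines, so a non-zero count is only a prompt to look more closely at the file.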