simroux / VirSorter

Source code of the VirSorter tool, also available as an App on CyVerse/iVirus (https://de.iplantcollaborative.org/de/)
GNU General Public License v2.0

Step 1 failed: VIRSorter_prots.fasta and VIRSorter_nett_filtered.fasta were not found #57

Open asttra opened 4 years ago

asttra commented 4 years ago

I was wondering if I could get some assistance with this. What does it mean? Am I writing the command incorrectly? Is there something additional that I need to download? I installed VirSorter using Anaconda.

Command input:

(virsorter) bash-4.2$ wrapper_phage_contigs_sorter_iPlant.pl -f /projects/nanopore-working/niki/archaea/data/Sulfolobus/ERR1351179.fasta --db 1 --wdir /projects/nanopore-working/niki/archaea/results/VirSorter/Sulfolobus/ --ncpu 4 --data-dir /projects/nanopore-working/niki/bin/virsorter-data

Output:

Bin: /projects/nanopore-working/niki/bin/anaconda3/envs/virsorter/bin
Dataset: VIRSorter
Input file: /projects/nanopore-working/niki/archaea/data/Sulfolobus/ERR1351179.fasta
Db: 1
Working dir: /projects/nanopore-working/niki/archaea/results/VirSorter/Sulfolobus/
Custom phages :
Data dir: /projects/nanopore-working/niki/bin/virsorter-data
Num CPUs: 4
blastp: blastp

Working directory already present: "/projects/nanopore-working/niki/archaea/results/VirSorter/Sulfolobus/".

If this contains an aborted run, the script will terminate!

Step 1 failed, we stop there: either /projects/nanopore-working/niki/archaea/results/VirSorter/Sulfolobus/fasta/VIRSorter_prots.fasta or /projects/nanopore-working/niki/archaea/results/VirSorter/Sulfolobus/fasta/VIRSorter_nett_filtered.fasta were not found

asttra commented 4 years ago

Never mind. I realized that this error was thrown because of an aborted run. FYI for anyone out there: the input files must be FASTA files (not FASTQ).
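A quick way to tell the two formats apart before starting a run is to look at the first non-empty line: FASTA records begin with `>`, FASTQ records with `@`. The sketch below is only a heuristic under that assumption, and the function name `guess_seq_format` is made up for illustration; it is not part of VirSorter.

```shell
#!/usr/bin/env bash
# Guess the sequence format from the first non-empty line of a file.
# Prints "fasta", "fastq", or "unknown". This is a heuristic, not a validator.
guess_seq_format() {
    local first
    # First non-empty line; "|| true" keeps an empty file from being a hard error.
    first=$(grep -m 1 -v '^[[:space:]]*$' "$1" || true)
    case "$first" in
        '>'*) echo "fasta" ;;
        '@'*) echo "fastq" ;;
        *)    echo "unknown" ;;
    esac
}

# Example (the path is a placeholder):
# guess_seq_format /path/to/contigs.fasta
```

If this prints `fastq`, the reads would need to be assembled or converted to FASTA before being passed to `-f`.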

JiqiuWu commented 4 years ago

I had this error too, and I am using a FASTA file. Does anyone know how to fix it?

simroux commented 4 years ago

The first step here would be to delete the output folder (the path provided to the "wdir" argument) and try again. This error is usually thrown because the output folder already exists and contains a partial VirSorter run.
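For repeated attempts, the clean-up can be wrapped in a small helper so that a leftover directory never blocks the next run. This is a minimal sketch: `remove_old_wdir` is a hypothetical name, the paths in the example are placeholders, and the wrapper flags are the ones already shown in this thread.

```shell
#!/usr/bin/env bash
# Remove a previous (possibly partial) working directory so the next run starts clean.
# Refuses to act on an empty path or "/" as a basic safety check.
remove_old_wdir() {
    local wdir="$1"
    if [ -z "$wdir" ] || [ "$wdir" = "/" ]; then
        echo "refusing to remove '$wdir'" >&2
        return 1
    fi
    if [ -d "$wdir" ]; then
        rm -r -- "$wdir"
    fi
}

# Example (all paths are placeholders):
# remove_old_wdir /path/to/virsorter_out && \
#   wrapper_phage_contigs_sorter_iPlant.pl -f contigs.fasta --db 1 \
#     --wdir /path/to/virsorter_out --ncpu 4 --data-dir /path/to/virsorter-data
```

Note that this deletes everything under the given directory, so it should only point at a folder dedicated to VirSorter output.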

JiqiuWu commented 4 years ago

I deleted the output folder with the command rm -r test_virsorter/

but it didn't work, and I got this output:

wrapper_phage_contigs_sorter_iPlant.pl -f ~/jiqiuwu/data/test_vir.fasta --db 1 --wdir ~/jiqiuwu/data/test_virsorter/ --ncpu 4 --data-dir ~/jiqiuwu/softwares/virsorter-data
Bin : /rdsgpfs/general/user/hz4918/home/anaconda3/envs/virsorter/bin
Dataset : VIRSorter
Input file : /rds/general/user/hz4918/home/jiqiuwu/data/test_vir.fasta
Db : 1
Working dir : /rds/general/user/hz4918/home/jiqiuwu/data/test_virsorter/
Custom phages :
Data dir : /rds/general/user/hz4918/home/jiqiuwu/softwares/virsorter-data
Num CPUs : 4
blastp : blastp

Started at Wed Apr 22 10:33:55 2020
Step 0.5 : /rdsgpfs/general/user/hz4918/home/anaconda3/envs/virsorter/bin/Scripts/Step_1_contigs_cleaning_and_gene_prediction.pl VIRSorter /rds/general/user/hz4918/home/jiqiuwu/data/test_virsorter/fasta /rds/general/user/hz4918/home/jiqiuwu/data/test_virsorter/fasta/input_sequences.fna 2 >> /rds/general/user/hz4918/home/jiqiuwu/data/test_virsorter/logs/out 2>> /rds/general/user/hz4918/home/jiqiuwu/data/test_virsorter/logs/err

Step 1 failed, we stop there: either /rds/general/user/hz4918/home/jiqiuwu/data/test_virsorter/fasta/VIRSorter_prots.fasta or /rds/general/user/hz4918/home/jiqiuwu/data/test_virsorter/fasta/VIRSorter_nett_filtered.fasta were not found

simroux commented 4 years ago

OK, so Step 1 failed even with a fresh directory. Can you list the contents of "/rds/general/user/hz4918/home/jiqiuwu/data/test_virsorter/fasta/"?

mingy64 commented 4 years ago

Hello, simroux

I have exactly the same error. The 'input_sequences.fna' and 'input_sequences_id_translation.tsv' files were created in the output folder (in JiqiuWu's case, the /rds/general/user/hz4918/home/jiqiuwu/data/test_virsorter/fasta/ folder). I am using a FASTA file (assembled contigs) as well. I even tried reformatting the FASTA file using anvi'o, but I got the same error.

Thank you for your help. Sincerely

simroux commented 4 years ago

Could you list the contents of your /rds/general/user/hz4918/home/jiqiuwu/data/test_virsorter/fasta/ folder? Specifically, is there a "_prots.fasta" file? If not, then it is most likely an issue with the installation of metageneannotator.
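The check can also be scripted. The sketch below looks for the two files named in the "Step 1 failed" message inside `<wdir>/fasta` and reports which ones are missing or empty. The helper name `check_step1_outputs` is hypothetical, and the default prefix `VIRSorter` is assumed from the "Dataset" line in the run header.

```shell
#!/usr/bin/env bash
# Report which of the two Step 1 output files are missing or empty in <wdir>/fasta.
# File names follow the error message in this thread; the prefix defaults to
# "VIRSorter" (assumed from the "Dataset" line printed at startup).
check_step1_outputs() {
    local wdir="${1%/}" prefix="${2:-VIRSorter}" missing=0 f
    for f in "${prefix}_prots.fasta" "${prefix}_nett_filtered.fasta"; do
        if [ -s "$wdir/fasta/$f" ]; then
            echo "ok       $f"
        else
            echo "missing  $f"
            missing=1
        fi
    done
    return "$missing"
}

# Example (the path is a placeholder):
# check_step1_outputs /path/to/test_virsorter
```

A non-zero exit status means at least one file is absent, which matches the situation where only input_sequences.fna and input_sequences_id_translation.tsv are present.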

JiqiuWu commented 4 years ago

Hi Simon,

I am so sorry for the delay.

input_sequences.fna and input_sequences_id_translation.tsv are in my /rds/general/user/hz4918/home/jiqiuwu/data/test_virsorter/fasta/ folder; there is nothing like _prots.fasta.

I just used these commands to install metageneannotator:

cd ~/miniconda/envs/virsorter/bin
wget http://metagene.nig.ac.jp/metagene/mga_x86_64.tar.gz
tar -xvzf mga_x86_64.tar.gz

Is there anything wrong?

Many thanks, Jiqiu

JiqiuWu commented 4 years ago

Hi Simon,

I also installed metageneannotator with

conda install --name virsorter -c bioconda metagene_annotator

Unfortunately, I got the same error and the same output.

Many thanks, Jiqiu

simroux commented 4 years ago

Hi Jiqiu,

Can you check the contents of the error log? (There should be an "err" file in the "logs" directory.)
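Based on the redirections visible in the Step 0.5 command earlier in this thread, the log sits at `<wdir>/logs/err`. A small sketch for printing its last lines follows; `show_err_log` is a hypothetical helper name and the default of 20 lines is arbitrary.

```shell
#!/usr/bin/env bash
# Print the last N lines (default 20) of <wdir>/logs/err.
# The logs/err location is taken from the Step 0.5 command shown in this thread.
show_err_log() {
    local wdir="${1%/}" n="${2:-20}"
    local log="$wdir/logs/err"
    if [ ! -f "$log" ]; then
        echo "no error log at $log" >&2
        return 1
    fi
    tail -n "$n" "$log"
}

# Example (the path is a placeholder):
# show_err_log /path/to/test_virsorter 50
```

The first failing step usually leaves its message near the end of this file, so the tail is a reasonable place to start.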

Best, Simon

JiqiuWu commented 4 years ago

Hi Simon,

Yes, I got this in the error file

Can't locate Bio/Seq.pm in @INC (you may need to install the Bio::Seq module) (@INC contains: /public/home/jiqiu/.conda/envs/virsorter/lib/site_perl/5.26.2/x86_64-linux-thread-multi /public/home/jiqiu/.conda/envs/virsorter/lib/site_perl/5.26.2 /public/home/jiqiu/.conda/envs/virsorter/lib/5.26.2/x86_64-linux-thread-multi /public/home/jiqiu/.conda/envs/virsorter/lib/5.26.2 .) at /public/home/jiqiu/.conda/envs/virsorter/bin/Scripts/Step_1_contigs_cleaning_and_gene_prediction.pl line 5.
BEGIN failed--compilation aborted at /public/home/jiqiu/.conda/envs/virsorter/bin/Scripts/Step_1_contigs_cleaning_and_gene_prediction.pl line 5.

The exact path has changed, because I had a new job, but it doesn't matter.

Many thanks, Jiqiu

simroux commented 4 years ago

Oh! So that's the same issue as https://github.com/simroux/VirSorter/issues/74, which can apparently be solved. Bottom line: something went wrong with the conda installation, and the script does not find the Bio::Seq module. You can also take a look at https://github.com/simroux/VirSorter/issues/71
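Whether the Perl inside the active conda environment can load the module can be tested directly, independently of VirSorter. The sketch below relies only on standard `perl` switches (`-M` to load a module, `-e` to run a one-liner); the function name `perl_module_ok` is made up for illustration.

```shell
#!/usr/bin/env bash
# Return 0 if the perl found on PATH can load the given module, non-zero otherwise.
perl_module_ok() {
    perl -M"$1" -e 1 2>/dev/null
}

# Examples (run with the virsorter environment activated):
# perl_module_ok Bio::Seq || echo "Bio::Seq cannot be loaded by $(command -v perl)"
#
# List the directories this perl searches for modules (@INC), one per line:
# perl -e 'print join("\n", @INC), "\n"'
```

If the module loads from the shell but Step 1 still fails, it is worth confirming that `command -v perl` points into the same environment as the VirSorter scripts.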

JiqiuWu commented 4 years ago

Thank you so much! I copied the Bio folder to the following path with:

cp -r ~/.conda/envs/virsorter/lib/perl5/site_perl/5.22.0/Bio/ ~/.conda/envs/virsorter/lib/site_perl/5.26.2/x86_64-linux-thread-multi/

And then VirSorter worked.

But I got output like this:

## Verify if this should have been a virome decontamination mode based on 10kb+ contigs
## -> No, this looks fine
Cleaning the output directory
rm -r /public/home/jiqiu/data/test/virsorter_604/r_0/db :

and something in the err file

[mclIO] writing </public/home/jiqiu/data/test/virsorter_604/r_1/new_clusters.mci>
.......................................
[mclIO] wrote native interchange 11822x11822 matrix with 12604 entries to stream </public/home/jiqiu/data/test/virsorter_604/r_1/new_clusters.mci>
[mclIO] wrote 11822 tab entries to stream </public/home/jiqiu/data/test/virsorter_604/r_1/new_clusters.tab>
[mcxload] tab has 11822 entries
[mclIO] reading </public/home/jiqiu/data/test/virsorter_604/r_1/new_clusters.mci>
.......................................
[mclIO] read native interchange 11822x11822 matrix with 12604 entries
[mcl] pid 19013
 ite   chaos  time hom(avg,lo,hi) m-ie m-ex i-ex fmv
  1     0.72  0.14 1.00/0.74/1.25 1.01 1.01 1.01   0
  2     0.63  0.11 1.00/0.75/1.27 1.00 1.00 1.02   0
  3     0.59  0.09 1.00/0.63/1.00 1.00 0.99 1.01   0
  4     0.70  0.08 1.00/0.72/1.00 1.00 0.98 0.99   0
  5     0.67  0.10 1.00/0.73/1.00 1.00 0.99 0.98   0
  6     0.69  0.08 1.00/0.73/1.00 1.00 0.99 0.97   0
  7     0.72  0.07 1.00/0.72/1.00 1.00 1.00 0.97   0
  8     0.72  0.10 1.00/0.70/1.00 1.00 1.00 0.97   0
  9     0.69  0.07 1.00/0.73/1.00 1.00 1.00 0.97   0
 10     0.37  0.07 1.00/0.76/1.00 1.00 1.00 0.97   0
 11     0.25  0.10 1.00/0.76/1.00 1.00 1.00 0.96   0
 12     0.25  0.06 1.00/0.76/1.00 1.00 1.00 0.96   0
 13     0.25  0.08 1.00/0.76/1.00 1.00 1.00 0.96   0
 14     0.25  0.10 1.00/0.76/1.00 1.00 1.00 0.96   0
 15     0.25  0.09 1.00/0.76/1.00 1.00 1.00 0.95   0
 16     0.25  0.07 1.00/0.76/1.00 1.00 1.00 0.95   0
 17     0.25  0.09 1.00/0.76/1.00 1.00 1.00 0.95   0
 18     0.23  0.09 1.00/0.77/1.00 1.00 1.00 0.95   0
 19     0.09  0.11 1.00/0.91/1.00 1.00 1.00 0.95   0
 20     0.01  0.10 1.00/0.99/1.00 1.00 1.00 0.95   0
 21     0.00  0.09 1.00/1.00/1.00 1.00 1.00 0.95   0
[mcl] jury pruning marks: <100,99,99>, out of 100
[mcl] jury pruning synopsis: <99.6 or perfect> (cf -scheme, -do log)
[mcl] output is in /public/home/jiqiu/data/test/virsorter_604/r_1/new_clusters.csv
[mcl] 5908 clusters found
[mcl] output is in /public/home/jiqiu/data/test/virsorter_604/r_1/new_clusters.csv

Please cite:
    Stijn van Dongen, Graph Clustering by Flow Simulation.  PhD thesis,
    University of Utrecht, May 2000.
       (  http://www.library.uu.nl/digiarchief/dip/diss/1895620/full.pdf
       or  http://micans.org/mcl/lit/svdthesis.pdf.gz)
OR
    Stijn van Dongen, A cluster algorithm for graphs. Technical
    Report INS-R0010, National Research Institute for Mathematics
    and Computer Science in the Netherlands, Amsterdam, May 2000.
       (  http://www.cwi.nl/ftp/CWIreports/INS/INS-R0010.ps.Z
       or  http://micans.org/mcl/lit/INS-R0010.ps.Z)

It's fine, right? Can I go ahead and run more data?

Many thanks, Jiqiu

simroux commented 4 years ago

Yes, these are not real errors; it's all good!

JiqiuWu commented 4 years ago

Thank you so much, Simon! Not only for VirSorter, but also other helpful tools on virome! Many many thanks!

ereyred commented 3 years ago

Hello,

I have the same error (step 1, line 5). I copied the ~/lib/perl5/site_perl/5.22.0/Bio/ folder into ~/lib/site_perl/5.26.2/x86_64-linux-thread-multi/, and Bio::Seq seems to be fine: there is no error message when I enter perl -e "use Bio::Seq;". I'm also using a FASTA input file.

Any idea what the problem could be? Thank you Simon!

simroux commented 3 years ago

Hi,

Which error are you referring to ? (there are a few different ones in this thread). Note that the original error was due to an aborted run, which led to the output folder already existing. The easiest way to solve this is to remove the output directory, then rerun VirSorter.

Best, Simon