simroux / VirSorter

Source code of the VirSorter tool, also available as an App on CyVerse/iVirus (https://de.iplantcollaborative.org/de/)
GNU General Public License v2.0

"No such file or directory" error for Contigs_prots_vs_Phage_Gene_unclustered.tab, VIRSorter_affi-contigs.csv, VIRSorter_phage-signal.csv #75

Closed mshamash closed 4 years ago

mshamash commented 4 years ago

Hi,

I'm trying to install VirSorter on our HPC cluster and have run into an issue when testing on a bacterial genome that I know contains several prophages (VS detected them on my local machine). VS seems to run fine and finishes with exit code 0; however, the VIRSorter_global-phage-signal.csv file is completely blank (0 KB) and there are no files in the Predicted_viral_sequences folder.

When I look closer at the 'err' file in the logs, I see this:

Can't open 'virsorter-out/Contigs_prots_vs_Phage_Gene_unclustered.tab' for reading: 'No such file or directory' at virsorter/Scripts/Step_2_merge_contigs_annotation.pl line 79
Can't open 'virsorter-out/VIRSorter_affi-contigs.csv' for reading: 'No such file or directory' at virsorter/Scripts/Step_3_highlight_phage_signal.pl line 64
Can't open 'virsorter-out/VIRSorter_phage-signal.csv' for reading: 'No such file or directory' at virsorter/Scripts/Step_4_summarize_phage_signal.pl line 84

stdout from the run has no errors. I should mention we're using the latest version of VS, cloned from GitHub, and latest version of the databases.

Any advice on what we can do to try and fix this?

Thank you!

Michael

EDIT: I took a closer look at stdout, and I see this message after step 1.3 and before step 2.

No file virsorter-out/r_0/Contigs_prots_vs_New_unclustered.tab, nothing new to add to virsorter-out/Contigs_prots_vs_Phage_Gene_unclustered.tab

That message doesn't appear when I run VS locally.

EDIT2: I think I narrowed it down to a problematic diamond install, testing with a new version.
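When a run behaves differently on the cluster and on a local machine, a quick first step is to compare which diamond binary each environment picks up. The following is only a sketch: the helper name `diamond_version` is hypothetical, and it assumes the installed diamond accepts the `version` subcommand.

```python
import shutil
import subprocess


def diamond_version(executable="diamond"):
    """Return the version string printed by diamond, or None if unavailable.

    Assumption: the binary supports `diamond version`. Nothing here is
    part of VirSorter itself.
    """
    path = shutil.which(executable)
    if path is None:
        # No such executable on PATH in this environment.
        return None
    result = subprocess.run([path, "version"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    print(diamond_version() or "diamond not found on PATH")
```

Running this in both environments and comparing the output shows whether the two installs differ.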

simroux commented 4 years ago

Hi Michael,

"virsorter-out/Contigs_prots_vs_Phage_Gene_unclustered.tab" is the result file from the blast or diamond comparison to the phage protein sequences for which no HMM profile could be created, so you seem to be on the right path when looking at the diamond install. If this file is not created at all (which seems to be the case), then the VirSorter Step 2 script fails when trying to open it, and all following steps fail as well. When testing with the new diamond install, I'd advise running on a clean (empty) output folder just in case.
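Because the run still exits with code 0 when these intermediate files are missing, it can help to check the output folder explicitly after each run. Here is a minimal sketch; the function name `check_outputs` is hypothetical, and the file list is taken from the error log in this thread rather than from any official VirSorter specification.

```python
from pathlib import Path

# File names copied from the error messages quoted in this thread.
# Which files are worth checking is an assumption, not an official list.
EXPECTED_FILES = [
    "Contigs_prots_vs_Phage_Gene_unclustered.tab",
    "VIRSorter_affi-contigs.csv",
    "VIRSorter_phage-signal.csv",
    "VIRSorter_global-phage-signal.csv",
]


def check_outputs(out_dir):
    """Map each expected file name to 'ok', 'empty' or 'missing'."""
    status = {}
    for name in EXPECTED_FILES:
        path = Path(out_dir) / name
        if not path.is_file():
            status[name] = "missing"
        elif path.stat().st_size == 0:
            status[name] = "empty"
        else:
            status[name] = "ok"
    return status
```

For the run described in this issue, `check_outputs("virsorter-out")` would be expected to flag the first three files as missing and the global signal file as empty. An empty file is only a hint to look closer, since it does not show on its own why nothing was written.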

Best, Simon

mshamash commented 4 years ago

Hi Simon,

It did seem to be diamond related. Now it gives a new error saying that the database was created with an old version of diamond and needs to be updated.

I’ll see if we can install the version that’s suggested in the VS README, but if not, would there be any disincentive to updating the databases? I think I saw the command for that somewhere, I’ll have to search again.

Best,

Michael

simroux commented 4 years ago

No disincentive to regenerating the database: once you extract the files from virsorter-data-v2.tar.gz, you should have both Pool_unclustered.faa and Pool_new_unclustered.dmnd in the database folder (Phage_gene_catalog_plus_viromes and/or Phage_gene_catalog). You should be able to regenerate this diamond db with the new version of diamond (if I remember correctly, this "Pool_new_unclustered.dmnd" file is the only file required).
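A rough sketch of the regeneration step, under stated assumptions: the helper names `makedb_command` and `rebuild` are hypothetical, and the code assumes each `.dmnd` file is rebuilt from a `.faa` file with the same stem in the same folder, which this thread does not confirm for every database file. It relies only on `diamond makedb` with its `--in` and `--db` options.

```python
import shutil
import subprocess
from pathlib import Path


def makedb_command(faa_path, diamond="diamond"):
    """Build a `diamond makedb` command for one protein FASTA file.

    The database prefix is the FASTA path without its extension;
    diamond adds the .dmnd suffix itself.
    """
    faa = Path(faa_path)
    db_prefix = faa.with_suffix("")
    return [diamond, "makedb", "--in", str(faa), "--db", str(db_prefix)]


def rebuild(faa_path, diamond="diamond"):
    """Run the command, failing loudly if diamond is absent or errors out."""
    if shutil.which(diamond) is None:
        raise FileNotFoundError(f"{diamond} not found on PATH")
    subprocess.run(makedb_command(faa_path, diamond), check=True)
```

For example, `rebuild("Phage_gene_catalog/Pool_unclustered.faa")` would write `Pool_unclustered.dmnd` next to the FASTA file, using whichever diamond version is first on PATH. Keeping a copy of the original `.dmnd` files before overwriting them is a reasonable precaution.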

mshamash commented 4 years ago

I regenerated both Pool_new_unclustered.dmnd and Pool_unclustered.dmnd files for the Refseq and Viromes databases and all works properly now! Thanks for your help Simon!

simroux commented 4 years ago

Great, thanks for the update !