soedinglab / spacepharer

SpacePHARER CRISPR Spacer Phage-Host pAiRs findER
https://spacepharer.soedinglab.org
GNU General Public License v3.0

Example with CRISPR output files (piler-cr and CRISPRDetect) produces empty file #12

Open shaman-narayanasamy opened 7 months ago

shaman-narayanasamy commented 7 months ago

Dear authors/developers,

Please find the relevant information for my issue below. Please do not hesitate to ask for more information.

Looking forward to hearing from you.

Expected Behavior

A non-empty output file should be produced, similar to that of a regular run.

Current Behavior

Empty output file produced.

Steps to Reproduce (for bugs)

$ rm -rf tmpFolder
$ mkdir tmpFolder
$ spacepharer easy-predict examples/crisprdetect_test examples/pilercr_test output/targetSetDB predictions.tsv tmpFolder

Spacepharer Output (for bugs)

The output file is empty. SpacePHARER worked when applied to the FASTA-format spacers. Here is the stdout of the run:

predictions.tsv exists and will be overwritten
easy-predict examples/crisprdetect_test examples/pilercr_test output/targetSetDB predictions.tsv tmpFolder 

MMseqs Version:                         5.c2e680a
Taxonomy mapping file                   
NCBI tax dump directory                 
Substitution matrix                     nucl:nucleotide.out,aa:VTML40.out
<< Skipped for brevity >>
[=================================================================] 100.00% 2 0s 1ms
Time for merging to predictions.tsv: 0h 0m 1s 512ms
Time for processing: 0h 0m 2s 633ms
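Because the run finishes without an error, it can help to check the result file explicitly. The following is a minimal shell sketch, not part of SpacePHARER; the helper name `check_output` is hypothetical, and it only distinguishes a missing file, an empty file, and a file with content.

```shell
# Hypothetical helper: report whether a results file is missing,
# empty, or contains lines. It does not interpret the file format.
check_output() {
    file="$1"
    if [ ! -e "$file" ]; then
        echo "missing"
    elif [ ! -s "$file" ]; then
        echo "empty"
    else
        # wc -l may pad its output with spaces on some systems.
        echo "$(wc -l < "$file" | tr -d ' ') lines"
    fi
}
```

For the run above, `check_output predictions.tsv` would be expected to print `empty`.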

Context

I also tested using only the piler-cr results, to make sure that only one set of CRISPR results was being evaluated. The output was the same.


RuoshiZhang commented 7 months ago

Hi! The test files in the example folder come from different bacterial genomes; only the one from fasta_test is expected to get a hit against one of the example phage genomes. You could try searching against a larger target database (for instance, spacepharer downloaddb GenBank_phage_2018_09 targetSetDB tmpFolder). Hope this answers your question.

shaman-narayanasamy commented 7 months ago

Hi!

Thanks for the response! Here is what I did:

$ mkdir -p database # Start from a fresh output directory
$ mkdir -p tmpFolder # Create a fresh tmp folder
$ spacepharer downloaddb GenBank_phage_2018_09 database/targetSetDB tmpFolder/
downloaddb GenBank_phage_2018_09 database/targetSetDB tmpFolder/ 

MMseqs Version:         5.c2e680a
Create reversed setdb   1
Threads                 40
Verbosity               3

2024-02-24 14:16:09 URL:https://wwwuser.gwdg.de/~compbiol/spacepharer/2018_09/genbank_phages_2018_09.tar [144250880/144250880] -> "genbank_phages_2018_09.tar" [1]
2024-02-24 14:16:10 URL:https://wwwuser.gwdg.de/~compbiol/spacepharer/2018_09/genbank_phages_2018_09.tsv [405478/405478] -> "genbank_phages_2018_09.tsv" [1]
tar2db genbank_phages_2018_09.tar /ibex/user/naras0c/spacepharer_test/tmpFolder/9610124632266045672/tardb --threads 40 -v 3 

Time for merging to tardb: 0h 0m 0s 81ms
Time for merging to tardb.lookup: 0h 0m 0s 409ms
Time for processing: 0h 0m 6s 409ms
createdb /ibex/user/naras0c/spacepharer_test/tmpFolder/9610124632266045672/tardb /ibex/user/naras0c/spacepharer_test/tmpFolder/9610124632266045672/seqdb -v 3 

Converting sequences
[8283] 1s 195ms
Time for merging to seqdb_h: 0h 0m 0s 383ms
Time for merging to seqdb: 0h 0m 4s 252ms
Database type: Nucleotide
Time for processing: 0h 0m 6s 161ms
createsetdb /ibex/user/naras0c/spacepharer_test/tmpFolder/9610124632266045672/seqdb  /ibex/user/naras0c/spacepharer_test/tmpFolder/9610124632266045672 --reverse-fragments 0 --tax-mapping-file genbank_phages_2018_09.tsv --extractorf-spacer 0 --translation-table 1 --add-orf-stop 0 --compressed 0 --threads 40 -v 3 

cp: '/ibex/user/naras0c/spacepharer_test/tmpFolder/9610124632266045672/seqdb' and '/ibex/user/naras0c/spacepharer_test/tmpFolder/9610124632266045672/seqdb' are the same file
Error: createsetdb failed

Perhaps I am doing something wrong here..?
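The cp message in the log means that the source and destination paths resolve to the same file. Why downloaddb ends up in that state is not established in this thread. As a generic illustration only, the shell sketch below shows how such a collision can be detected before copying; the helpers `same_file` and `safe_copy` are hypothetical and are not part of SpacePHARER or MMseqs2.

```shell
# Hypothetical helper: succeed if both paths exist and refer to the
# same file (same device and inode), e.g. via a symlink or hard link.
same_file() {
    [ -e "$1" ] && [ -e "$2" ] && [ "$1" -ef "$2" ]
}

# Hypothetical helper: copy only when source and destination differ,
# instead of letting cp fail with "are the same file".
safe_copy() {
    if same_file "$1" "$2"; then
        echo "skip: '$1' and '$2' are the same file" >&2
        return 0
    fi
    cp -- "$1" "$2"
}
```

Running `same_file` on the two paths from the error message would confirm whether they collide in a given setup.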

EDIT/UPDATE

I tried running createsetdb manually, as follows:

$ tar xvf tmpFolder/9610124632266045672/genbank_phages_2018_09.tar
< stdout of unpacking the fna.gz files >

$ mkdir phages # Create and move them to a different folder
$ mv *.fna.gz phages/
$ rm -rf databases/* # Clean up output directory
$ spacepharer createsetdb phages/*.fna.gz databases/targetSetDb tmpFolder/
< bunch of stdout >
$ spacepharer createsetdb phages/*.fna.gz databases/targetSetDb_rev tmpFolder/ --reverse-fragments 1
< bunch of stdout >

This seems to have worked with no issue. So, I proceeded to run the command that I wanted to run with the example CRISPR data:

$ spacepharer easy-predict examples/crisprdetect_test examples/pilercr_test databases/targetSetDb predictions.tsv tmpFolder
< bunch of stdout >

The predictions file is not empty this time around.
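The manual steps above can be collected into one script. This is a minimal sketch that reuses only the spacepharer commands shown in this thread; the function names `run` and `build_and_predict`, the argument order, and the `DRY_RUN` switch are assumptions made for illustration. It extracts the archive directly into `phages/` with `tar -C` instead of moving the files afterwards.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage (hypothetical): build_and_predict TAR DB_DIR TMP_DIR OUT QUERY...
build_and_predict() {
    tar_file="$1"; db_dir="$2"; tmp_dir="$3"; out_file="$4"
    shift 4   # remaining arguments are the CRISPR query files

    run mkdir -p phages "$db_dir" "$tmp_dir"
    run tar xf "$tar_file" -C phages

    # Target set DB and its reversed counterpart, as in the manual steps.
    run spacepharer createsetdb phages/*.fna.gz "$db_dir/targetSetDb" "$tmp_dir"
    run spacepharer createsetdb phages/*.fna.gz "$db_dir/targetSetDb_rev" "$tmp_dir" --reverse-fragments 1

    run spacepharer easy-predict "$@" "$db_dir/targetSetDb" "$out_file" "$tmp_dir"
}
```

Setting `DRY_RUN=1` before calling `build_and_predict` prints the commands without running them, which makes it easy to compare them with the ones listed above.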

Let me know if you need more information :)