filip-husnik / pseudofinder

Detection of pseudogene candidates in bacterial and archaeal genomes.
GNU General Public License v3.0

nr database deletion #41

Closed GraessleT closed 2 years ago

GraessleT commented 2 years ago

Hey,

when trying to run pseudofinder.py in the test module I get this error message:

Non-zero return code 2 from 'blastp -out /scratch/user/pseudofinder/test/20220817_results_2/test_proteome.faa.blastP_output.tsv -outfmt 6 qseqid sseqid pident slen mismatch gapopen qstart qend sstart send evalue bitscore stitle -query /scratch/user/pseudofinder/test/20220817_results_2/test_proteome.faa -db /home/user/scratch/databases/BLAST_env_nr_DATABASE/env_nr -evalue 1e-4 -max_target_seqs 15 -max_hsps 1 -num_threads 4', message 'BLAST Database error: No alias or index file found for protein database [/home/user/scratch/databases/BLAST_env_nr_DATABASE/env_nr] in search path [/scratch/user/databases/BLAST_env_nr_DATABASE::]'

Also, all files (with the extensions .phd, .phi, .phr, .pin ...) in the database folder (here BLAST_env_nr_DATABASE) are "deleted" except for taxdb.btd and taxdb.bti. Using the same database outside of pseudofinder with blastp works fine.

Thankful for any advice on what might cause this issue.

Regards

mitchso commented 2 years ago

Hello! To clarify, if you take the exact command given in the error above and execute it outside of pseudofinder, do you still encounter an error?

blastp -out /scratch/user/pseudofinder/test/20220817_results_2/test_proteome.faa.blastP_output.tsv -outfmt 6 qseqid sseqid pident slen mismatch gapopen qstart qend sstart send evalue bitscore stitle -query /scratch/user/pseudofinder/test/20220817_results_2/test_proteome.faa -db /home/user/scratch/databases/BLAST_env_nr_DATABASE/env_nr -evalue 1e-4 -max_target_seqs 15 -max_hsps 1 -num_threads 4

If this runs without error, then let me know and we can investigate what might be happening with pseudofinder. If this does cause an error then it can hopefully guide you in fixing some issue with your input files.
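Since the error says that no alias or index file was found, a quick pre-check of the database prefix can narrow things down before rerunning. The sketch below is not part of pseudofinder; the function name `has_protein_db_files` is hypothetical, and it assumes that a protein BLAST database exposes either an index file (`.pin`) or an alias file (`.pal`) next to the prefix passed to `-db`.

```python
from pathlib import Path


def has_protein_db_files(db_prefix):
    """Return True if an index (.pin) or alias (.pal) file exists for db_prefix.

    db_prefix is the value given to `-db`, e.g. ".../BLAST_env_nr_DATABASE/env_nr".
    This is an illustrative helper, not a pseudofinder or BLAST+ API.
    """
    prefix = Path(db_prefix)
    # Append the extension to the file name instead of using with_suffix(),
    # which would replace part of a prefix that already contains a dot.
    candidates = [prefix.parent / (prefix.name + ext) for ext in (".pin", ".pal")]
    return any(path.is_file() for path in candidates)
```

Calling `has_protein_db_files("/home/user/scratch/databases/BLAST_env_nr_DATABASE/env_nr")` right before and right after a pseudofinder run would show whether the files disappear during the run or were already missing.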

mitchso commented 2 years ago

Also, pseudofinder should not be writing or deleting anything in your database folder. Is it possible those files were deleted by a different issue? I would be interested to know if you are able to reproduce the deletion of files using our software, because that certainly shouldn't be happening.
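One way to confirm whether a run really removes files is to record the contents of the database folder before and after it. This is only a minimal sketch under that assumption; `snapshot` and `missing_files` are hypothetical helper names, not pseudofinder functions.

```python
from pathlib import Path


def snapshot(directory):
    """Map each regular file name in `directory` to its size in bytes."""
    return {
        path.name: path.stat().st_size
        for path in Path(directory).iterdir()
        if path.is_file()
    }


def missing_files(before, after):
    """Return the sorted file names present in `before` but absent from `after`."""
    return sorted(set(before) - set(after))
```

Take `before = snapshot(db_dir)`, run `pseudofinder.py test` with the same `-db` argument, take `after = snapshot(db_dir)`, and then `missing_files(before, after)` lists exactly which files vanished during the run.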

GraessleT commented 2 years ago

Thanks! Yes, when running the exact command outside pseudofinder on the same database (rebuilt because it was "deleted"), blastp runs successfully. Here are the first 4 entries of test_proteome.faa.blastP_output.tsv:

`KKHFFIBD_00001 gb|ECV10727.1| 50.964 471 219 4 2 461 3 466 2.63e-153 449 hypothetical protein GOS_2958623, partial [marine metagenome]

KKHFFIBD_00001 gb|MNM26284.1| 50.557 467 214 4 4 447 3 448 2.63e-145 429 3-isopropylmalate dehydratase large subunit [compost metagenome]

KKHFFIBD_00001 gb|OIQ77378.1| 49.580 487 223 4 3 461 7 482 8.80e-144 426 3-isopropylmalate dehydratase large subunit [mine drainage metagenome]

KKHFFIBD_00001 gb|KKN07870.1| 48.608 468 231 5 3 463 4 467 8.49e-143 422 hypothetical protein LCGC14_1062610 [marine sediment metagenome]`
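For anyone who wants to inspect such output programmatically, the rows can be read with the column names from the `-outfmt 6` string in the command above. The sketch below assumes the file on disk is tab-delimited, as BLAST tabular output is, even though the rows pasted here appear space-separated; `parse_blast_tsv` is a hypothetical helper name.

```python
import csv

# Column order taken from the -outfmt string in the blastp command above.
COLUMNS = [
    "qseqid", "sseqid", "pident", "slen", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore", "stitle",
]


def parse_blast_tsv(handle):
    """Yield one dict per hit from a tab-delimited BLAST outfmt 6 stream."""
    reader = csv.DictReader(
        handle, fieldnames=COLUMNS, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    for row in reader:
        # Convert the score-like fields to floats; other fields stay as strings.
        for key in ("pident", "evalue", "bitscore"):
            row[key] = float(row[key])
        yield row
```

Opening the file with `open(path, newline="")` and passing the handle to `parse_blast_tsv` yields the hits one at a time, so large result files do not need to be loaded into memory at once.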

I pulled the newest pseudofinder version again this morning and reinstalled everything but unfortunately this didn't resolve my problem.

After activating the pseudofinder conda env, I first run `export BLASTDB=PATH/to/DB`.

Then I run the command `pseudofinder.py test -db /PATHto/DB/env_nr`. I get the same error as above, and the database is gone again.

mitchso commented 2 years ago

Hi, I've made a small change which at this moment is my best guess at what has happened for you, since I cannot reproduce this on my end. Could you please pull the latest changes and try once more?

Thanks, Mitch

GraessleT commented 2 years ago

Thanks Mitch! Now it works 👍