filip-husnik / pseudofinder

Detection of pseudogene candidates in bacterial and archaeal genomes.
GNU General Public License v3.0

database #39

Closed by vappiah 2 years ago

vappiah commented 2 years ago

Hi All,

I am using pseudofinder to find pseudogenes in Vibrio cholerae isolates. I am using the BLAST nr database, but the BLAST stage is taking a long time: it has been running for over 24 hours on a single sample and has not finished.

I am using 36 threads.

Please advise.

Arkadiy-Garber commented 2 years ago

Hi Vincent,

Thanks for your interest in pseudofinder. What is the size of the blast database that you are using?

It is possible for the BLAST stage to take this long. We recommend trying a run with the --diamond flag, which uses DIAMOND instead of BLAST and runs significantly faster.

Thanks again, Arkadiy

vappiah commented 2 years ago

Hi @Arkadiy-Garber,

I tried using the --diamond flag and got this error: Error when running DIAMOND: Error: This executable was not compiled with support for BLAST databases.

This is the command I used: pseudofinder.py annotate --genome $dir/$bname.gbk --database $blastdb/nr --diamond --threads $threads --reference $refchrome2 --outprefix $pseudodir/$bname/$bname

mitchso commented 2 years ago

Hi Vincent,

This error is raised by DIAMOND itself, so I would recommend troubleshooting by running diamond on its own, on any sample, with the database you provided. I believe this is just a database formatting issue: diamond needs its own database format (the file should end with .dmnd), which can be generated with the command diamond makedb. See here: https://github.com/bbuchfink/diamond/wiki/3.-Command-line-options
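As a rough sketch of that step, something like the following should work. It assumes you have the nr protein sequences as a FASTA file. The file name nr.faa and the helper functions dmnd_path and build_dmnd are made up for illustration and are not part of pseudofinder or DIAMOND. Only diamond makedb with --in and --db is an actual DIAMOND command here.

```shell
#!/bin/sh
# Sketch: build a DIAMOND-format database from a protein FASTA file.
# All paths are placeholders; the helper names are hypothetical.

# Return the .dmnd file name expected for a given database prefix.
dmnd_path() {
    case "$1" in
        *.dmnd) printf '%s\n' "$1" ;;
        *)      printf '%s\n' "$1.dmnd" ;;
    esac
}

# Build the database only if the .dmnd file does not exist yet.
build_dmnd() {
    fasta=$1   # protein FASTA input (assumed to exist, e.g. nr.faa)
    prefix=$2  # output prefix; DIAMOND writes <prefix>.dmnd
    out=$(dmnd_path "$prefix")
    if [ -f "$out" ]; then
        echo "found existing $out"
    elif command -v diamond >/dev/null 2>&1; then
        diamond makedb --in "$fasta" --db "$prefix"
    else
        echo "diamond not found on PATH" >&2
        return 1
    fi
}

# Example call, commented out because the paths are assumptions:
# build_dmnd "$blastdb/nr.faa" "$blastdb/nr"
```

After that, the same pseudofinder command can be retried with --database pointing at the new database instead of the BLAST one. Whether pseudofinder wants the path with or without the .dmnd extension is something I have not checked, so try "$(dmnd_path "$blastdb/nr")" first and drop the extension if that fails.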

If you can get your diamond run to work without errors, then you can return to pseudofinder and it should run just fine.

Best, Mitch