mattb112885 / clusterDbAnalysis

ITEP - Integrated Toolkit for Exploration of microbial Pan-genomes

Building RPSBLAST databases no longer works with current CDD data #45

Closed mattb112885 closed 11 years ago

mattb112885 commented 11 years ago

Apparently NCBI changed the format of the CDD data so that only the newer version of RPSBLAST works with it, which means I should be using makeprofiledb to build the database instead of formatrpsdb. There's a bit of a naming nightmare that I hope doesn't cause us too many issues: they changed the command-line syntax of the rpsblast executable but kept its name. We might just have to switch to the new syntax and let it crash if people are using the old executable...
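For illustration, here is a minimal Python sketch of the two syntaxes side by side. It is not the code that went into ITEP. The function names, the file names, and the "+" version-string heuristic are all assumptions made for this example; only the tool names and the basic input/output flags shown are the ones I'm sure of. Output-format and threshold flags are left out on purpose.

```python
import re


def build_db_command(pn_file, db_name, legacy=False):
    """Return the argument list that builds an RPS-BLAST database.

    pn_file is the list of profile (.smp) files shipped with CDD,
    db_name is the base name of the database to create.
    """
    if legacy:
        # Old NCBI C toolkit syntax: single-letter flags.
        return ["formatrpsdb", "-i", pn_file, "-n", db_name]
    # BLAST+ syntax: long flag names, database type stated explicitly.
    return ["makeprofiledb", "-in", pn_file, "-out", db_name, "-dbtype", "rps"]


def build_search_command(query, db_name, out_file, evalue="1e-5", legacy=False):
    """Return the rpsblast argument list; same executable name, different flags."""
    if legacy:
        return ["rpsblast", "-i", query, "-d", db_name, "-e", evalue, "-o", out_file]
    return ["rpsblast", "-query", query, "-db", db_name,
            "-evalue", evalue, "-out", out_file]


def looks_like_blast_plus(version_output):
    """Heuristic (an assumption): BLAST+ version strings look like '2.2.28+'."""
    return re.search(r"\d+\.\d+\.\d+\+", version_output) is not None


if __name__ == "__main__":
    # Hypothetical file names, printed only; nothing is executed here.
    print(" ".join(build_db_command("Cdd.pn", "Cdd")))
    print(" ".join(build_search_command("proteins.faa", "Cdd", "hits.txt")))
```

Feeding the output of `rpsblast -version` to `looks_like_blast_plus` would be one way to fail early with a readable message instead of crashing on unknown flags, assuming the legacy executable does not print a "+" version string.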

mattb112885 commented 11 years ago

Also need to fix the RPSBLAST outputs, which no longer match up with the names of the external domains... sigh.

mattb112885 commented 11 years ago

I pushed a bunch of changes that I hope will fix this problem. If nothing else, it will be a lot closer, and I'll just need to make minor changes for things I overlooked. Expect this to be closed soon.

mattb112885 commented 11 years ago

Well... RPSBLAST now runs just fine and reports results, but it only reports hits against some of the databases and not all of them (e.g. COG and PFAM are definitely in the source data, but for some reason the hits we expect to them don't show up in the results). I think that might be the BLAST compiler's fault rather than my own. I'll try to figure out what's going on with this, but there might not be much I can do...
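A quick way to check which source databases actually show up is to tally hits by accession prefix. The sketch below is only a rough diagnostic under stated assumptions: it assumes tab-delimited output in which the third column holds the domain title (e.g. from BLAST+ `-outfmt "6 qseqid sseqid stitle"`), and that the title starts with an accession such as `COG...` or `pfam...`. The prefix list and the function name are hypothetical.

```python
import re
from collections import Counter

# Assumed accession prefixes for the source databases bundled in CDD.
SOURCE_PATTERN = re.compile(r"^(COG|pfam|cd|smart|TIGR|PRK)", re.IGNORECASE)


def count_hits_by_source(lines, title_column=2):
    """Count hits per source database from tab-delimited result lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) <= title_column:
            counts["unparsed"] += 1  # row has no title column
            continue
        match = SOURCE_PATTERN.match(fields[title_column].strip())
        counts[match.group(1).lower() if match else "other"] += 1
    return counts


if __name__ == "__main__":
    # Made-up rows for illustration only, not real results.
    example = [
        "gene1\tgnl|CDD|1\tCOG0001, example title",
        "gene1\tgnl|CDD|2\tpfam00069, example title",
    ]
    print(count_hits_by_source(example))
```

If `cog` or `pfam` come back with a count of zero on a genome where hits are expected, that points at the database build or the search settings rather than at the downstream parsing.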

mattb112885 commented 11 years ago

Never mind, it seems to work OK (though the settings could probably use some tweaking). Closing this...