mattb112885 / clusterDbAnalysis

ITEP - Integrated Toolkit for Exploration of microbial Pan-genomes

BLASTN all vs all #3

Closed mattb112885 closed 12 years ago

mattb112885 commented 12 years ago

We should modify the code to run a BLASTN all vs all in the same way as the existing BLASTP one, which would be useful for closely related organisms. Switches could be added to much of the code so that it queries a different table but otherwise runs exactly the same analysis. The table structure should be exactly the same, with nucleotide data rather than amino acid data as input.

mattb112885 commented 12 years ago

blast_all_vs_all.py has been modified to optionally run BLASTN rather than BLASTP. The BLASTN results have not yet been added to the database, and none of the extraction routines have been modified to work with the data.
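For anyone reading along, here is a minimal sketch of what an optional BLASTN switch could look like. It is not the actual code in blast_all_vs_all.py: the function names, the `--blastn` flag, and the file names are hypothetical, and it assumes the BLAST+ command-line programs (`blastp` / `blastn`) with tabular output rather than whatever the script really calls.

```python
import argparse


def build_blast_command(query_fasta, db_name, out_file, use_blastn=False, evalue=1e-5):
    """Return the argument list for one BLAST run (hypothetical helper).

    Only the program name changes between the protein and nucleotide
    cases, so the downstream output format stays identical.
    """
    program = "blastn" if use_blastn else "blastp"  # BLASTP stays the default
    return [
        program,
        "-query", query_fasta,
        "-db", db_name,
        "-evalue", str(evalue),
        "-outfmt", "6",  # tabular output, same columns for both programs
        "-out", out_file,
    ]


def parse_args(argv=None):
    """Parse the command line; the -n/--blastn flag name is an assumption."""
    parser = argparse.ArgumentParser(description="Sketch of a BLASTP/BLASTN switch")
    parser.add_argument("query_fasta")
    parser.add_argument("db_name")
    parser.add_argument("out_file")
    parser.add_argument("-n", "--blastn", action="store_true",
                        help="run BLASTN on nucleotide sequences instead of BLASTP")
    return parser.parse_args(argv)


# Example with an explicit argument list (hypothetical file names).
args = parse_args(["genes.fna", "genes_db", "hits.tsv", "--blastn"])
cmd = build_blast_command(args.query_fasta, args.db_name, args.out_file,
                          use_blastn=args.blastn)
print(" ".join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` to launch the search. Keeping the switch in one place means everything that parses the output does not need to know which program produced it.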

mattb112885 commented 12 years ago

The SQL code has been updated to add the BLASTN results to the database. This increases the time it takes to run main.sh, but that step is one to start and walk away from anyway, since running BLAST itself takes forever. In addition, I have modified the db_getBlastResults* functions to optionally return BLASTN results instead of BLASTP results (BLASTP is still the default). The structure of the table is exactly the same (including the self-bit scores added as the last columns), so it should still work with other functions (such as makeBlastScoreTable.py) that use the BLAST results. I'm considering this bug closed.
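As a rough illustration of the "same structure, different table" idea, the sketch below switches the table being queried while leaving the rest of the query untouched. It is not the db_getBlastResults* code: the table names, column names, and sample rows are made up for the example, and it assumes an SQLite database accessed through Python's `sqlite3` module.

```python
import sqlite3

# Hypothetical table names; the two tables share one schema.
BLAST_TABLES = {False: "blastp_results", True: "blastn_results"}

# Hypothetical columns, with the self-bit scores as the last two.
SCHEMA = """
CREATE TABLE {table} (
    query_gene      TEXT,
    target_gene     TEXT,
    pct_identity    REAL,
    evalue          REAL,
    bit_score       REAL,
    query_self_bit  REAL,
    target_self_bit REAL
)
"""


def get_blast_results(conn, gene_id, use_blastn=False):
    """Return all hits involving gene_id from the selected table.

    The table name comes from a fixed dictionary, never from user input,
    so formatting it into the SQL string is safe; the gene ID itself is
    passed as a bound parameter.
    """
    table = BLAST_TABLES[bool(use_blastn)]  # BLASTP remains the default
    sql = "SELECT * FROM {0} WHERE query_gene = ? OR target_gene = ?".format(table)
    return conn.execute(sql, (gene_id, gene_id)).fetchall()


# Small in-memory demonstration with made-up rows.
conn = sqlite3.connect(":memory:")
for table in BLAST_TABLES.values():
    conn.execute(SCHEMA.format(table=table))
conn.execute("INSERT INTO blastp_results VALUES (?, ?, ?, ?, ?, ?, ?)",
             ("geneA", "geneB", 85.0, 1e-50, 200.0, 250.0, 240.0))
conn.execute("INSERT INTO blastn_results VALUES (?, ?, ?, ?, ?, ?, ?)",
             ("geneA", "geneB", 97.5, 1e-80, 310.0, 330.0, 325.0))

protein_hits = get_blast_results(conn, "geneA")
nucleotide_hits = get_blast_results(conn, "geneA", use_blastn=True)
```

Because both calls return rows with the same columns in the same order, anything that consumes the rows downstream can stay unchanged regardless of which table was queried.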