Closed mw55309 closed 7 years ago
Hi Mick, the reason for the N/A's is that either it can't find the TAXIDs of the sequences (because $BLASTDB is not pointing to NCBI's TaxDB) or there are no TAXIDs assigned to the sequences in the DB.
Is this NCBI's nr or a custom subset?
wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/taxdb.tar.gz
export BLASTDB=$BLASTDB:~/FOLDER/WHERE/TAXDB/LIVES
makeblastdb -in blastdb.fasta -dbtype nucl -taxid_map blastdb.id_to_taxid.txt -parse_seqids
I just had to do something similar where I wanted to construct a new BlastDB with ONLY subjects hit by a previous search against NCBI nt:
# extract subject_ids from blast result (column 5 is sseqid in my case)
cut -f5 megablast.out | sort | uniq > subject_ids.txt
# Get tax_id_map
blastdbcmd -db nt -entry_batch subject_ids.txt -outfmt '%i %T' -out subject_sequences.tax_id_map
# Get sequences
blastdbcmd -db nt -entry_batch subject_ids.txt -out subject_sequences.fna
# make DB
makeblastdb -in subject_sequences.fna -dbtype nucl -taxid_map subject_sequences.tax_id_map
-parse_seqids
Hope that helps.
cheers,
dom
Hi Dom
Sorry, I know this isn't a problem with blobtools per se....
My blast+ output looks like this:
tig00000001 N/A 54.3 tig00000001 N/A 53.9 tig00000001 N/A 53.1 tig00000001 N/A 53.5 tig00000001 N/A 53.9 tig00000001 N/A 47.8 tig00000001 N/A 27.7 tig00000001 N/A 53.1 tig00000001 N/A 52.4 tig00000001 N/A 53.1 tig00000001 N/A 52.4 tig00000001 N/A 53.1 tig00000001 N/A 51.6 tig00000001 N/A 53.1 tig00000001 N/A 52.8 tig00000001 N/A 52.8 tig00000001 N/A 53.5 tig00000001 N/A 53.1 tig00000001 N/A 52.8 tig00000001 N/A 52.8 tig00000001 N/A 52.4 tig00000001 N/A 52.4
Obviously N/A us not very useful.
Command ran was:
blastx -db nr_blastplus -query test.fa -outfmt '6 qseqid staxids bitscore' -num_threads 8
Building the database was:
zcat nr.gz | makeblastdb -in - -parse_seqids -dbtype prot -title nr_blastplus -out nr_blastplus
Have you seen this type of behaviour before?
Cheers Mick