ncbi / elastic-blast

ElasticBLAST is a cloud-based tool to perform your BLAST searches faster and make you more effective
https://blast.ncbi.nlm.nih.gov/doc/elastic-blast

Missing json data #11

Closed nikkiing closed 1 year ago

nikkiing commented 1 year ago

Why does the JSON file not contain the overview data, and how should that data be calculated? Also, the JSON output cannot be used with -sorthits 4.

nikkiing commented 1 year ago

cmd: blastn -task blastn -db ~/temp/mix/m -query ~/temp/datatest/gene.fna -evalue 10 -outfmt 0 -sorthits 4 -out ~/temp/go/test1 -word_size 11

json: blastn -task blastn -db ~/temp/mix/m -query ~/temp/datatest/gene.fna -evalue 10 -outfmt 13 -out ~/temp/go/test1 -word_size 11

tom6931 commented 1 year ago

I'm sorry, but the JSON report does not support that information. I think the best way to get it (or come close) is to use a tabular format (-outfmt 6, 7, or 10). Below I have an example that includes the "standard" fields (hence "std") plus one for query coverage (qcovs).

You could get the maximum identity by taking the maximum of the percent identity over the HSPs for a given subject. For example, there are three matches to NC_000020.11, so you'd take the max ident there (95.413%). You could also get the total score by adding the bit scores for the three HSPs (3118 + 3096 + 3070 = 9284).

I'm sorry it took so long to get back to you. Let me know if you have any more questions.

Tom

[madden@blastdev11 ~/temp]$ blastn -db GPIPE/9606/110/GCF_000001405.40_top_level -query u00001.fsa -outfmt "7 std qcovs"

# BLASTN 2.13.0+
# Query: U00001.1 Human homologue of S. pombe nuc2+ and A. nidulans bimA
# Database: GPIPE/9606/110/GCF_000001405.40_top_level
# Fields: query acc.ver, subject acc.ver, % identity, alignment length, mismatches, gap opens, q. start, q. end, s. start, s. end, evalue, bit score, % query coverage per subject
# 35 hits found
U00001.1 NC_000002.12 95.563 1961 76 7 15 1971 132261143 132263096 0.0 3129 76
U00001.1 NC_000020.11 95.413 1962 82 4 15 1971 30492928 30494886 0.0 3118 76
U00001.1 NC_000020.11 95.207 1961 88 5 15 1971 29746135 29748093 0.0 3096 76
U00001.1 NC_000020.11 94.954 1962 94 2 15 1971 30919628 30917667 0.0 3070 76
U00001.1 NC_000014.9 94.959 1964 91 6 15 1971 16049750 16047788 0.0 3072 76
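
The per-subject aggregation described in the reply can be sketched in Python. This is only an illustration, not part of BLAST or ElasticBLAST: the function name summarize_hits and the dictionary keys are made up for this example, and it assumes the input is tabular output produced with -outfmt "7 std qcovs" (13 whitespace- or tab-separated columns, with comment lines starting with "#"). It also assumes that summing bit scores over a subject's HSPs is an acceptable stand-in for the "total score" in the overview.

```python
from collections import defaultdict

# Column names for -outfmt "7 std qcovs": the 12 standard fields plus qcovs.
FIELDS = ["qaccver", "saccver", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore", "qcovs"]


def summarize_hits(lines):
    """Group HSP rows by (query, subject) and compute summary values.

    Hypothetical helper: not a BLAST+ API, just a sketch of the approach.
    """
    summary = defaultdict(lambda: {"hsps": 0, "max_ident": 0.0,
                                   "max_score": 0.0, "total_score": 0.0,
                                   "query_cover": 0.0})
    for line in lines:
        line = line.strip()
        # Skip blank lines and the "#" comment lines that outfmt 7 emits.
        if not line or line.startswith("#"):
            continue
        # split() handles both tabs (real output) and single spaces (pasted text).
        row = dict(zip(FIELDS, line.split()))
        entry = summary[(row["qaccver"], row["saccver"])]
        bitscore = float(row["bitscore"])
        entry["hsps"] += 1
        entry["max_ident"] = max(entry["max_ident"], float(row["pident"]))
        entry["max_score"] = max(entry["max_score"], bitscore)
        entry["total_score"] += bitscore
        # qcovs is reported per subject, so it is the same on every HSP row.
        entry["query_cover"] = float(row["qcovs"])
    return dict(summary)


# The five rows shown in the reply above (only 5 of the 35 hits were posted).
SAMPLE = """\
# Fields: query acc.ver, subject acc.ver, % identity, ...
U00001.1 NC_000002.12 95.563 1961 76 7 15 1971 132261143 132263096 0.0 3129 76
U00001.1 NC_000020.11 95.413 1962 82 4 15 1971 30492928 30494886 0.0 3118 76
U00001.1 NC_000020.11 95.207 1961 88 5 15 1971 29746135 29748093 0.0 3096 76
U00001.1 NC_000020.11 94.954 1962 94 2 15 1971 30919628 30917667 0.0 3070 76
U00001.1 NC_000014.9 94.959 1964 91 6 15 1971 16049750 16047788 0.0 3072 76
""".splitlines()

if __name__ == "__main__":
    for (query, subject), values in summarize_hits(SAMPLE).items():
        print(query, subject, values)
```

For the posted rows, NC_000020.11 ends up with three HSPs, a maximum identity of 95.413, and a total score of 9284, which matches the manual calculation in the reply. To process a real report, the same function could be given an open file handle instead of SAMPLE.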