ohnosequences / mg7

Configurable and scalable 16S metagenomics data analysis
https://goo.gl/y3rZFD
GNU Affero General Public License v3.0

Review default Blast output format #36

laughedelic closed 8 years ago

laughedelic commented 8 years ago

Here is the current default:


Changes:

eparejatobes commented 8 years ago

Yes, sgi should be removed

eparejatobes commented 8 years ago

needs ohnosequences/blast-api#27

eparejatobes commented 8 years ago

need to update to ohnosequences/blast-api 0.5.0

laughedelic commented 8 years ago

What else has to be changed regarding the default Blast output format? Add an evalue column?

rtobes commented 8 years ago

This is probably related to your question: https://github.com/ohnosequences/pacbio16s/issues/3#issuecomment-198767138

eparejatobes commented 8 years ago

case object defaultBlastOutputRecord extends BlastOutputRecord(
  // query
  qseqid      :×:
  qstart      :×:
  qend        :×:
  qlen        :×:
  // reference
  sseqid      :×:
  sstart      :×:
  send        :×:
  slen        :×:
  // alignment
  evalue      :×:
  score       :×:
  bitscore    :×:
  length      :×:
  pident      :×:
  mismatch    :×:
  gaps        :×:
  gapopen     :×:
  qcovs       :×:
  |[AnyBlastOutputField]
)

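For readers who want to see what one row of output with these columns looks like downstream, here is a minimal sketch of a parser in plain Scala. It does not use blast-api: `BlastHit`, `columns` and `parse` are hypothetical names, the numeric types are a guess at what is convenient, and the tab delimiter is an assumption (pass `','` if the output is CSV).

```scala
import java.util.regex.Pattern
import scala.util.Try

// Hypothetical plain-Scala mirror of the record above; not part of blast-api.
final case class BlastHit(
  // query
  qseqid: String, qstart: Int, qend: Int, qlen: Int,
  // reference
  sseqid: String, sstart: Int, send: Int, slen: Int,
  // alignment
  evalue: Double, score: Double, bitscore: Double,
  length: Int, pident: Double, mismatch: Int, gaps: Int, gapopen: Int,
  qcovs: Double
)

object BlastHit {
  // Column names in the same order as the record in this comment.
  val columns: Seq[String] = Seq(
    "qseqid", "qstart", "qend", "qlen",
    "sseqid", "sstart", "send", "slen",
    "evalue", "score", "bitscore",
    "length", "pident", "mismatch", "gaps", "gapopen", "qcovs"
  )

  // Parses one delimited line; returns None on a wrong column count
  // or on any field that fails numeric conversion.
  def parse(line: String, sep: Char = '\t'): Option[BlastHit] = {
    val f = line.split(Pattern.quote(sep.toString), -1)
    if (f.length != columns.length) None
    else Try(
      BlastHit(
        f(0), f(1).toInt, f(2).toInt, f(3).toInt,
        f(4), f(5).toInt, f(6).toInt, f(7).toInt,
        f(8).toDouble, f(9).toDouble, f(10).toDouble,
        f(11).toInt, f(12).toDouble, f(13).toInt, f(14).toInt, f(15).toInt,
        f(16).toDouble
      )
    ).toOption
  }
}
```

Returning `Option` keeps malformed lines from throwing, so a caller can count or log them instead of aborting a whole run.
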
rtobes commented 8 years ago

What is qcovs, coverage of the alignment over the length of the query?

eparejatobes commented 8 years ago

@rtobes it is hard to find a precise definition. This is the best I got:

rtobes commented 8 years ago

Thanks!

It is the percent of no. of bases in the query sequence aligned with the subject sequence (match or mismatch). The bases can be in one HSP or several HSPs (overlap) but they are counted only once. Gaps (in the subject sequence) are treated as mismatches.

If this final sentence is true, qcovs is a really good parameter.
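
To make the quoted definition concrete, here is a rough sketch of the calculation it describes: take the query ranges of all HSPs for one query/subject pair, merge overlaps so each base is counted once, and divide by the query length. This follows the wording of the quote only and is not taken from BLAST's source. `QueryCoverage`, `QueryRange`, `coveredBases` and `qcovs` are hypothetical names, and coordinates are assumed to be 1-based and inclusive like `qstart`/`qend`.

```scala
object QueryCoverage {
  // Footprint of one HSP on the query (assumed 1-based, inclusive).
  final case class QueryRange(qstart: Int, qend: Int)

  // Number of query positions covered by at least one range.
  def coveredBases(ranges: Seq[QueryRange]): Int = {
    // Normalise orientation, then sweep left to right.
    val sorted = ranges
      .map(r => (math.min(r.qstart, r.qend), math.max(r.qstart, r.qend)))
      .sortBy(_._1)
    // State: (bases counted so far, rightmost position already counted).
    val (total, _) = sorted.foldLeft((0, 0)) { case ((acc, lastEnd), (s, e)) =>
      if (e <= lastEnd) (acc, lastEnd) // fully inside what was already counted
      else (acc + e - math.max(s, lastEnd + 1) + 1, e) // count only the new part
    }
    total
  }

  // Percentage of the query covered by the union of the ranges.
  def qcovs(ranges: Seq[QueryRange], qlen: Int): Double =
    if (qlen <= 0) 0.0 else 100.0 * coveredBases(ranges) / qlen
}
```

For example, two HSPs covering query positions 1-100 and 51-150 on a 200-base query cover 150 distinct bases, so this sketch gives 75.0. Query bases aligned against a gap in the subject still fall inside `qstart..qend`, so they are counted, which matches the last sentence of the quote.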

eparejatobes commented 8 years ago

yup, that's why it's there in defaults :+1:

laughedelic commented 8 years ago

Done.