barricklab / breseq

breseq is a computational pipeline for finding mutations relative to a reference sequence in short-read DNA resequencing data. It is intended for haploid microbial genomes (<20 Mb). breseq is a command line tool implemented in C++ and R.
http://barricklab.org/breseq
GNU General Public License v2.0

Mutation not matching annotation #366

erikwolfsohn closed 5 months ago

erikwolfsohn commented 5 months ago

I'm not sure if this is an error or if I'm just not interpreting my output correctly. I have a C -> T mutation that is supported by the read alignment evidence, but the annotation is G257D (GGT -> GAT). I see a couple of isolated reads that match the annotation, but the majority do not:

image

jeffreybarrick commented 5 months ago

Here's how to interpret things:

The mutation and read alignment are with respect to the top genome strand, where there is a C->T change. Because the ompA gene is on the reverse strand, the codon change in the gene is GGT->GAT as shown in the annotation column (the C->T change is G->A on the reverse strand). The arrow by the ompA gene gives the strand of the genome it is on (reverse).
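
The strand conversion described above can be checked with a minimal Python sketch. The top-strand triplet `ACC` is an assumption, inferred by reverse-complementing the annotated codon GGT rather than read from the actual genome. The helper name and the two-entry codon table are illustrative only and are not part of breseq.

```python
# Complement each base, then reverse, to read the opposite strand 5'->3'.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# Only the two codons from this issue (standard genetic code).
CODON_TO_AA = {"GGT": "G", "GAT": "D"}

# Assumed top-strand bases covering the codon, before and after the C->T change.
top_ref = "ACC"
top_mut = "ATC"

# The gene is on the reverse strand, so its codons are read from the reverse complement.
gene_ref = reverse_complement(top_ref)
gene_mut = reverse_complement(top_mut)

print(f"top strand:  {top_ref} -> {top_mut}")
print(f"gene strand: {gene_ref} -> {gene_mut}")
print(f"amino acid:  {CODON_TO_AA[gene_ref]} -> {CODON_TO_AA[gene_mut]}")
```

The same single-base change therefore reads as C->T in the alignment (top strand) and as GGT->GAT, glycine to aspartate, in the gene's own reading direction.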

erikwolfsohn commented 5 months ago

Got it, thank you! When the predicted mutation is in an intergenic region, does the arrow indicate what strand and what genes the mutation falls between? The way I'm interpreting this is a junction indicating an insertion on the forward strand between SD200_17140 and iroB. Is that correct?

image

jeffreybarrick commented 5 months ago

The mutation is always described on the top strand.

For an intergenic mutation, the gene column lists the two adjacent genes, one on each side of the slash, and the arrow next to each gene gives the strand that gene is on.

SD200_17140 is in the forward (top strand) direction. Same for iroB.