cfe-lab / MiCall

Pipeline for processing FASTQ data from an Illumina MiSeq to genotype human RNA viruses like HIV and hepatitis C
https://cfe-lab.github.io/MiCall
GNU Affero General Public License v3.0

HLA data does not cover whole region #97

Closed by donkirkby 10 years ago

donkirkby commented 10 years ago

So far, I've only looked at the HLA coverage for the 22 May 2014 run, but that run consistently has good coverage in two sections of the HLA-B region and no coverage in the rest of the region. This means that all of the samples scored badly, because we are currently expecting coverage of the whole region. The first two samples have the following coverage in HLA-B (nucleotide coordinates):

The HLA coverage map and scoring don't currently support key positions, so they will need to be adapted to use something similar to the other regions.

donkirkby commented 10 years ago

Our primers cover the two exons plus a little extra. Chanson is going to tell me the coordinates of the two exons within the whole HLA-B sequence, and we will use those as our key positions.
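To illustrate why key positions change the outcome, here is a minimal sketch of scoring coverage only at key positions. It is not MiCall's actual scoring code: the function names, the depth thresholds, and the toy coordinates are all assumptions made up for the example.

```python
def min_key_coverage(coverage, key_positions):
    """Lowest read depth over the key positions.

    coverage maps 1-based position -> read depth; positions missing
    from the map are treated as depth 0.
    """
    return min(coverage.get(pos, 0) for pos in key_positions)


def score_coverage(coverage, key_positions, thresholds=(10, 100, 1000)):
    """Turn the minimum depth into a score from 1 (worst) to 4 (best).

    The thresholds are hypothetical, chosen only for illustration.
    """
    depth = min_key_coverage(coverage, key_positions)
    return 1 + sum(depth >= t for t in thresholds)


# Toy data: reads cover positions 100-200 at depth 500, nothing elsewhere.
toy_coverage = {pos: 500 for pos in range(100, 201)}

# Expecting coverage across a whole 300-base region gives the worst score.
whole_region_score = score_coverage(toy_coverage, range(1, 301))

# Scoring only key positions inside the covered section gives a good score.
key_position_score = score_coverage(toy_coverage, range(120, 181))
```

With the toy data, the whole-region score is dragged down by the uncovered positions, while the key-position score reflects only the section the primers actually target.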

donkirkby commented 10 years ago

Here's the request from Chanson:

Exon 2 should be positions 486-755 (inclusive) of your HLA-B reference. Exon 3 is 1002-1277. That's 1-based numbering :)

Those numbers are inside the coverage that I see, so we should get nice coverage scores.
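Because those coordinates are 1-based and inclusive, they need care when used with Python's 0-based, half-open slices. The following sketch shows one way to convert them and to expand them into a list of key positions. The helper names are hypothetical and not taken from MiCall.

```python
# Exon coordinates as quoted above: 1-based, inclusive, on the HLA-B reference.
EXONS_1_BASED = {"exon2": (486, 755), "exon3": (1002, 1277)}


def to_slice(start, end):
    """Convert a 1-based inclusive range to a 0-based half-open slice."""
    return slice(start - 1, end)


def exon_key_positions(exons):
    """Return every 1-based position that falls inside any exon, sorted."""
    positions = []
    for start, end in exons.values():
        positions.extend(range(start, end + 1))
    return sorted(positions)


exon_lengths = {
    name: end - start + 1 for name, (start, end) in EXONS_1_BASED.items()
}
all_key_positions = exon_key_positions(EXONS_1_BASED)

print(exon_lengths)            # {'exon2': 270, 'exon3': 276}
print(len(all_key_positions))  # 546
```

For example, `reference[to_slice(486, 755)]` would pull the 270 bases of exon 2 out of a reference string, assuming `reference` holds the same HLA-B sequence the coordinates refer to.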

New Question

Do we still want to map the entire HLA-B region? We're mapping more than double the width of the exons. If we restricted the region we map, the coverage plots would be easier to see.