FRED-2 / OptiType

Precision HLA typing from next-generation sequencing data
BSD 3-Clause "New" or "Revised" License

Optitype on different sequencing #78

Closed ktroule closed 6 years ago

ktroule commented 6 years ago

Hi

I have downloaded a few TCGA chr6 BAM files, converted them to FASTQ, and run them through OptiType (Docker version). I used WXS and RNA data for the available sample types (normal tissue, normal blood, primary tumor, etc.).

Once this was done, I checked the consistency of the HLA types for each patient across all of their sample types (a pairwise comparison), not allowing mismatches in any of the allele calls.

That means that if, for a given patient, I have these samples: A: WXS tumor, B: WXS blood, C: RNA tumor, D: RNA blood, I compared A vs B, A vs C, A vs D, B vs C, B vs D and C vs D.
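For reference, the comparison scheme described above can be sketched roughly as follows. This is only an illustration: the sample labels, the allele values and the helper names (`normalise`, `pairwise_mismatches`) are made up for the example and are not part of OptiType or of my actual pipeline.

```python
from itertools import combinations

# Hypothetical OptiType calls for one patient: sample label -> six alleles
# (two each for HLA-A, -B and -C). The values are invented for illustration.
calls = {
    "A_WXS_tumor": ["A*01:01", "A*02:01", "B*07:02", "B*08:01", "C*07:01", "C*07:02"],
    "B_WXS_blood": ["A*02:01", "A*01:01", "B*07:02", "B*08:01", "C*07:01", "C*07:02"],
    "C_RNA_tumor": ["A*01:01", "A*02:01", "B*07:02", "B*08:01", "C*07:01", "C*07:04"],
    "D_RNA_blood": ["A*01:01", "A*02:01", "B*07:02", "B*08:01", "C*07:01", "C*07:02"],
}


def normalise(alleles):
    """Sort the alleles so the reporting order of the two alleles per locus is ignored."""
    return tuple(sorted(alleles))


def pairwise_mismatches(calls):
    """Map each sample pair to True if at least one allele call differs."""
    return {
        (s1, s2): normalise(calls[s1]) != normalise(calls[s2])
        for s1, s2 in combinations(sorted(calls), 2)
    }


result = pairwise_mismatches(calls)
mismatch_rate = sum(result.values()) / len(result)
print(f"{sum(result.values())} of {len(result)} comparisons mismatch ({mismatch_rate:.0%})")
```

Sorting the alleles before comparing means that a sample reporting A*02:01 before A*01:01 is not counted as a mismatch.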

After this comparison, about 30% of the comparisons showed at least one mismatch in the calls.

When I compare only WXS data, about 12% of the comparisons have at least one mismatched call; when I do it for just RNA, about 10% of the comparisons show at least one mismatched HLA call.

So, from your experience, is this normal? Would you expect such differences between WXS calls and RNA calls? If so, which one would you recommend trusting the most? I would expect the most trustworthy data source to be WXS Normal samples.

Once more, thanks for your time.

andras86 commented 6 years ago

Hi,

Sounds a bit high, but not overly so. With six alleles per sample and four samples per individual, if one out of 24 goes wrong you'll get a mismatch. Given OT's 98-odd percent allele-wise accuracy, your 30% figure sounds quite realistic: major simplification, but 0.985^24 ≈ 0.7, i.e. roughly a 30% chance of at least one wrong call per individual.
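To make the back-of-the-envelope estimate explicit, here is a minimal sketch. It assumes every allele call is correct independently with the same probability (0.985 is just the round figure used above, not a measured value for this dataset), and the function name is made up for the example.

```python
def p_at_least_one_error(n_alleles, accuracy=0.985):
    """Probability that at least one of n independent allele calls is wrong."""
    return 1.0 - accuracy ** n_alleles


# Four samples x six alleles = 24 calls per individual.
p_individual = p_at_least_one_error(24)

# A single pairwise comparison involves two samples, i.e. 12 calls.
p_pair = p_at_least_one_error(12)

print(f"At least one wrong call among 24: {p_individual:.3f}")
print(f"At least one wrong call among 12: {p_pair:.3f}")
```

Under the same simplification, a single pairwise comparison (12 calls) would come out at roughly 17%, which is in the same ballpark as the within-WXS and within-RNA figures reported above. Real errors are unlikely to be independent, so treat both numbers as rough guides only.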

As for which one you should trust most, I'd go with WXS Normal as well. You might be able to confirm that on your data by checking whether those samples show the best concordance with the "consensus". But feel free to send me a table, I can have a look.
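One rough way to run that check is sketched below. It assumes the "consensus" is simply the genotype called by the strict majority of a patient's samples, which is only one possible definition; the sample labels, allele values and function names are hypothetical.

```python
from collections import Counter

# Hypothetical calls for one patient (invented values, six alleles per sample).
calls = {
    "WXS_normal": ["A*01:01", "A*02:01", "B*07:02", "B*08:01", "C*07:01", "C*07:02"],
    "WXS_tumor": ["A*01:01", "A*02:01", "B*07:02", "B*08:01", "C*07:01", "C*07:02"],
    "RNA_normal": ["A*01:01", "A*02:01", "B*07:02", "B*08:01", "C*07:01", "C*07:02"],
    "RNA_tumor": ["A*01:01", "A*02:01", "B*07:02", "B*08:01", "C*07:01", "C*07:04"],
}


def consensus_genotype(calls):
    """Return the genotype called by most samples, or None if there is no clear winner."""
    counts = Counter(tuple(sorted(alleles)) for alleles in calls.values())
    ranked = counts.most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the top genotypes: no consensus
    return ranked[0][0]


def agreement_with_consensus(calls):
    """Map each sample to True if its genotype equals the consensus genotype."""
    consensus = consensus_genotype(calls)
    if consensus is None:
        return {}
    return {s: tuple(sorted(a)) == consensus for s, a in calls.items()}


agreement = agreement_with_consensus(calls)
for sample, ok in agreement.items():
    print(sample, "matches consensus" if ok else "differs from consensus")
```

Tallying these per-sample results by sample type across all patients would show which data source agrees with the consensus most often.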