Closed mbhall88 closed 2 years ago
Alright, using the classifications mentioned above (i.e. minor resistance, null, and filtered calls are all ignored), at the mutation level, there are 0 FNs and 4 FPs across all 151 samples.
Awesome! In answer to your question, I was going to suggest treating minor calls from Mykrobe as S. That's what Mykrobe does when you give it the appropriate cmd line arg to ignore minors. But anyway, this is great news.
When I treat minor resistance as susceptible, we get 14 FPs and 1 FN.
The FN is described in https://github.com/Mykrobe-tools/mykrobe/issues/139 and I think we can effectively ignore it.
3 of the FPs are a homopolymer deletion, which affects three consecutive positions (so it's kind of 1 FP I guess).
1 FP is a katG deletion (CC->C), but there are other mutations in katG so the resistance call is not impacted.
10 FPs are Illumina minor resistance calls that we are now treating as S and Nanopore is saying R. I've looked through all of these and realised that treating minor as S is probably not the right/fair thing to be doing here. For instance, "minor" resistance does not actually mean that the minor allele is the resistant one. We could have the major allele being resistant and the minor being S. In these cases (which is all 10 FPs) it actually makes more sense from a genotyping perspective to say they are resistant. In all 10 FPs, the ALT allele has much higher coverage than the REF.
So I think the fair thing to do is switch to using a haploid model for Illumina data - which is what we're doing for Nanopore.
When using a haploid model for both Illumina and Nanopore, we get 4 FPs and 1 FN. These are the same 4 FPs relating to indels mentioned above. The 1 FN is also the same weird one mentioned above.
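For anyone following the counts, here is a rough sketch of how the FP/FN tally could be done. It assumes Illumina is the reference and Nanopore is the call under test (which is how the FPs above read), and that each call has already been reduced to "R" or "S". The function name and the example pairs are made up for illustration and are not taken from the pipeline.

```python
from collections import Counter


def classify(illumina: str, nanopore: str) -> str:
    """Classify one Illumina/Nanopore pair, treating Illumina as truth (assumption)."""
    truth_resistant = illumina == "R"
    test_resistant = nanopore == "R"
    if truth_resistant and test_resistant:
        return "TP"
    if test_resistant:
        return "FP"  # Nanopore says R, Illumina says S
    if truth_resistant:
        return "FN"  # Nanopore says S, Illumina says R
    return "TN"


# Hypothetical (Illumina, Nanopore) pairs, not real data from the 151 samples.
pairs = [("R", "R"), ("S", "R"), ("R", "S"), ("S", "S"), ("S", "S")]
tally = Counter(classify(ill, nano) for ill, nano in pairs)
print(tally["FP"], tally["FN"])  # 1 1
```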
Confused about the difference between "minor is S" and a haploid model.
Haploid model means just tell Mykrobe via cmd line to ignore minors. Minor as S means run default Mykrobe and flip r to S. Right? Oh no, maybe the emphasis is really about doing the same for Illumina and Nanopore.
> Confused about the difference between "minor is S" and a haploid model.
Let's say mykrobe has called 'r' and the REF/ALT median depth is 8/54 (a real example). I would argue this is "major" resistance and "minor susceptibility". Switching r->S does not seem like the smart thing to do in this example. However, if we use a haploid model, Cortex makes the decision about whether the REF or the ALT is the most likely call.
In addition, as you mentioned, both Illumina and Nanopore are then also using the same model - which seems fair?
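To make the difference concrete, here is a toy sketch using the 8/54 example. `toy_haploid_call` is only a stand-in that picks the allele with the higher median depth. It is not Mykrobe's actual genotyping model, and it assumes the ALT allele is the resistant one. Both function names are hypothetical.

```python
def minor_as_s(call: str) -> str:
    """Run the default model, then flip a minor resistance call 'r' to 'S'."""
    return "S" if call == "r" else call


def toy_haploid_call(ref_depth: float, alt_depth: float) -> str:
    """Toy stand-in for a haploid model: choose the better-supported allele.

    Assumes ALT is the resistant allele. This is NOT Mykrobe's real model,
    just a depth comparison to illustrate the idea.
    """
    return "R" if alt_depth > ref_depth else "S"


# The example from this thread: 'r' call with REF/ALT median depth 8/54.
print(minor_as_s("r"))          # S
print(toy_haploid_call(8, 54))  # R
```

With "minor is S" the sample ends up susceptible even though most reads support the ALT, whereas a model forced to pick a single allele goes with the ALT.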
Ah yes, I 100% agree
Our current approach to the genotype DST concordance is to check whether the susceptibility calls for each drug are the same between Illumina and Nanopore.
There is an additional, fine-grained analysis, which is to see whether the genotype calls for each mutation in the catalogue being used are the same.
The way this is going to be analysed is, for each sample, take the Illumina and Nanopore JSON output with all mutations and go through each mutation and check whether the genotype call is the same. If the variant is filtered, we ignore it.
@iqbal-lab One question is what to do with minor resistance (i.e., het genotype) calls (https://github.com/mbhall88/head_to_head_pipeline/issues/75#issuecomment-962293131) and null calls? I'm going to assume we want to treat these as "filtered" and ignore them. What I mean by this is if either the Illumina or Nanopore call for a mutation is null, het, or filtered, we treat as filtered and don't count the comparison in the concordance analysis.
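A minimal sketch of the per-sample comparison described above, assuming the JSON output has already been parsed into a plain dict mapping each mutation to a genotype label. The labels ("ref", "alt", "het", "null", "filtered"), the function name, and the mutation names are placeholders for illustration, not the actual Mykrobe JSON layout.

```python
from collections import Counter

# Calls that exclude a mutation from the comparison (assumed labels).
IGNORED = {"null", "het", "filtered"}


def compare_calls(illumina: dict, nanopore: dict) -> Counter:
    """Count concordant, discordant, and ignored mutation calls for one sample."""
    counts = Counter()
    # Only mutations present in both outputs are compared.
    for mutation in illumina.keys() & nanopore.keys():
        ill, nano = illumina[mutation], nanopore[mutation]
        if ill in IGNORED or nano in IGNORED:
            counts["ignored"] += 1
        elif ill == nano:
            counts["concordant"] += 1
        else:
            counts["discordant"] += 1
    return counts


# Hypothetical calls for one sample.
illumina = {"mutA": "alt", "mutB": "het", "mutC": "ref", "mutD": "ref"}
nanopore = {"mutA": "alt", "mutB": "alt", "mutC": "alt", "mutD": "null"}
result = compare_calls(illumina, nanopore)
print(result["concordant"], result["discordant"], result["ignored"])  # 1 1 2
```

Summing these counters over all samples would give the mutation-level totals.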