Closed mbhall88 closed 1 year ago
I took a look at the INH FNs today. For those where mykrobe made a TP call (so I can see which mutation we missed), 13/16 were due to us missing the fabG1 C-15T promoter mutation. drprg called this variant for all 13, but the calls were filtered by the fraction of read support (FRS) filter, which is set to 0.70. Nearly all of those mutations had an FRS of 0.58-0.64.
The reason for this is the alleles are quite similar, and I suspect maybe some shared minimizers are wreaking havoc here.
An example VCF record showing the alleles and coverage
fabG1 81 555cbd3d CGAGACGATAGGT CGAGACGATAGGC,CGAGATGATAGGT,TGAGACGATAGGT . frs VC=PH_SNPs;GRAPHTYPE=SIMPLE;VARID=fabG1_G-17T,fabG1_A-16X,fabG1_C-15X,fabG1_T-8X;PREDICT=S,S,R,S GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF 2:13,16,22,3:11,13,27,2:7,14,19,0:4,8,30,1:82,82,134,14:71,68,164,10:0.5,0.4,0,0.75:-441.21,-406.243,-273.794,-577.451:132.448
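To make the FRS filter concrete, here is a minimal sketch that recomputes it for the record above. It assumes FRS is the depth on the called allele divided by the depth summed over all alleles, with forward and reverse strands added together. Using the median coverage fields is also an assumption (the thread does not say which field drprg uses); with them the value comes out at about 0.59, which sits inside the 0.58-0.64 range mentioned.

```python
def fraction_of_read_support(fwd_covg, rev_covg, called_allele):
    """Depth on the called allele divided by depth summed over all alleles."""
    per_allele = [f + r for f, r in zip(fwd_covg, rev_covg)]
    total = sum(per_allele)
    if total == 0:
        return 0.0
    return per_allele[called_allele] / total


# MED_FWD_COVG and MED_REV_COVG copied from the fabG1 record above; GT is 2.
# Choosing the median fields is an assumption made for this sketch.
med_fwd = [7, 14, 19, 0]
med_rev = [4, 8, 30, 1]

frs = fraction_of_read_support(med_fwd, med_rev, called_allele=2)
print(f"FRS = {frs:.2f}, passes 0.70 filter: {frs >= 0.70}")
```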
These don't get collapsed by make_prg because the minimum match lengths between the three variants described by this allele are 4 and 6 (we use a min match len of 7).
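The 4 and 6 can be read straight off the alleles in the record. A small sketch, assuming a SNP-only site where all alleles have equal length; the rule "a conserved run at least as long as the min match length separates the variants" is a simplification of what make_prg does, used here only for illustration.

```python
def conserved_runs_between_variants(alleles):
    """Lengths of the invariant stretches separating variable columns."""
    length = len(alleles[0])
    if any(len(a) != length for a in alleles):
        raise ValueError("sketch only handles equal-length (SNP-only) alleles")
    variable = [i for i in range(length) if len({a[i] for a in alleles}) > 1]
    return [right - left - 1 for left, right in zip(variable, variable[1:])]


# REF and ALT alleles from the fabG1 record
alleles = ["CGAGACGATAGGT", "CGAGACGATAGGC", "CGAGATGATAGGT", "TGAGACGATAGGT"]
runs = conserved_runs_between_variants(alleles)
print(runs)

# Simplified rule (assumption): a run separates two variants only if it is
# at least as long as the min match length.
for min_match_length in (7, 5):
    separated = [run >= min_match_length for run in runs]
    print(min_match_length, separated)
```

With a min match length of 7 neither run is long enough, so all three variants stay in one site; at 5 only the run of 6 qualifies.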
The other interesting thing is that each time, the allele with the next best coverage is allele 1 which differs in two positions from allele 2 (middle and end), so I reckon there's a minimizer that covers the start of this allele before the two alleles differ.
Not sure whether the "hacky" way of decreasing the FRS threshold is the best way to go? Or changing some parameters in make prg or pandora...
Could drop min match length also? In our covid work using pileups, we find frs of 0.7 is too high just because of noise in the reads
Interesting. What FRS have you been using in your covid work?
Well, we've just shifted to 0.6 for nanopore, but now we're distracted fixing bugs before going back to carefully choose FRS thresholds
Okay, so after changing the minimum match length to 5 and the minimum FRS to 0.60, there are only two FNs that mykrobe calls that we don't. One of those is an indel which fails the FRS filter at 0.59 and the other is a dodgy-looking indel call from mykrobe that isn't called by tbprofiler or drprg, so I'm not fazed about that. Interestingly, that sample has a synonymous SNP in the first codon.
The allele at that fabG1 variant now looks slightly better and has been split in two
fabG1 81 8ca378d9 CGAGAC CGAGAT,TGAGAC . PASS VC=PH_SNPs;GRAPHTYPE=SIMPLE;VARID=fabG1_G-17T,fabG1_A-16X,fabG1_C-15X;PREDICT=S,S,R GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF 1:11,16,6:12,25,2:9,15,0:5,28,1:71,101,25:75,155,11:0.5,0,0.75:-281.052,-151.402,-388.948:129.65
fabG1 93 fa7956b3 T C . ld;sb VC=SNP;GRAPHTYPE=SIMPLE;VARID=fabG1_T-8X;PREDICT=S GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF 0:0,0:1,0:0,0:1,0:0,0:2,0:1,1:-129.795,-138.605:8.80986
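For anyone wanting to poke at these records programmatically, here is a minimal sketch of splitting the FORMAT and sample columns into named per-allele values. The function name is made up for this example, and it only handles the fields that appear in the records in this thread.

```python
def parse_sample(format_field, sample_field):
    """Pair FORMAT keys with sample values; list-valued fields are per allele."""
    parsed = {}
    for key, value in zip(format_field.split(":"), sample_field.split(":")):
        if key == "GT":
            # "." is a null genotype
            parsed[key] = None if value == "." else int(value)
        elif key == "GT_CONF":
            parsed[key] = float(value)
        else:
            parsed[key] = [float(x) for x in value.split(",")]
    return parsed


fmt = ("GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:"
       "SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF")
# Sample column of the fabG1 position 93 record (filtered as ld;sb)
sample = "0:0,0:1,0:0,0:1,0:0,0:2,0:1,1:-129.795,-138.605:8.80986"

record = parse_sample(fmt, sample)
print(record["GT"], record["SUM_REV_COVG"], record["GT_CONF"])
```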
Next task is to dig into the poor ofloxacin sensitivity.
The OFX FNs are a fairly straightforward fix. It turns out that all of the FNs are gyrA D94X. What is happening here is that there is a silent mutation (does not confer resistance) at codon 95 (S95T) that occurs in the same allele as that variant in the VCF/PRG. Long story short, I end up combining these two variants and calling an unknown prediction for OFX with novel variant gyrA_DS94GT. So I just need to break these up and check whether any of them are in the panel and associated with resistance. Should hopefully finish the implementation tomorrow.
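A rough sketch of the "break these up and check the panel" step, for illustration only. The function names, the panel dictionary and the treatment of `X` as a wildcard for any amino acid change are all assumptions of this example, not drprg's actual implementation.

```python
import re


def split_combined_variant(variant):
    """Split e.g. gyrA_DS94GT into per-codon variants gyrA_D94G and gyrA_S95T."""
    match = re.fullmatch(r"(.+)_([A-Z*]+)(\d+)([A-Z*]+)", variant)
    if match is None or len(match.group(2)) != len(match.group(4)):
        raise ValueError(f"not an equal-length amino acid variant: {variant}")
    gene, ref, start, alt = match.groups()
    return [
        f"{gene}_{r}{int(start) + i}{a}"
        for i, (r, a) in enumerate(zip(ref, alt))
        if r != a  # skip positions that did not change
    ]


def predict(variant, panel):
    """Return 'R' if any component is a resistance variant in the panel, else 'U'."""
    calls = []
    for part in split_combined_variant(variant):
        wildcard = part[:-1] + "X"  # assumption: X matches any amino acid change
        calls.append(panel.get(part) or panel.get(wildcard))
    return "R" if "R" in calls else "U"


hypothetical_panel = {"gyrA_D94X": "R"}  # illustrative entry only
parts = split_combined_variant("gyrA_DS94GT")
print(parts, predict("gyrA_DS94GT", hypothetical_panel))
```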
Great!
Here's the updated plots after the INH and OFX fixes listed above
Some good improvements for nanopore. I'm going to have a look at the drprg STM, ETO, PZA and RIF sensitivity as mykrobe seems to be better than drprg. But pretty happy with the specificity of drprg at the moment.
Looks good. Not sure what's going on with TB profiler with RIF
It's most likely an issue with the custom panel I built. I'll fix that up once I've finished debugging drprg. I'm assuming tb profiler is on par with mykrobe for RIF though
Looks great. I am actually surprised how good the specificity is for PZA nanopore, given it is dominated by indels. I had thought we had issues with indels
I've now gone through the remainder of the FNs and nearly all of the FPs for nanopore.
2 FNs where drprg didn't discover the variant
4 FPs where all three tools call rpoB L430X
1 FP where mykrobe and drprg call rpoB L452X
1 FP that just scraped through the low depth cutoff of 3 in drprg - it had a depth of 3.
1 FN where drprg calls 2 non-synonymous mutations (not in the panel) - these are also called by tb-profiler
1 FN where drprg calls the correct variant rpsL K43R but fails FRS (0.52)
14 FPs are called by all three callers. 9 were mutations in gid, 3 were in rrs, and 2 in rpsL. I'm not sure what we want to do here, because it is fairly reasonable this is a phenotyping problem. After a quick search, I found two references to support this [1, 2]. From [1]:
> low-level streptomycin resistance mediated by gidB were frequently misclassified with respect to streptomycin resistance when using the WHO-recommended critical concentration of 2 μg/ml.
2 FPs were rpsL K88R which is very strongly associated with STM resistance. These were also called by mykrobe but not tb-profiler.
2 FPs were confident deletions in gid only called by drprg. mykrobe called one of them, but it was filtered due to low expected proportion of expected depth.
In total, there were 9 FNs which were all called by mykrobe, but not tb-profiler or drprg. They're all indels in ethA and de novo variant discovery was not triggered in drprg for any of them. In one of those FNs, there was a promoter mutation as well, which drprg did call, but it was filtered out for low FRS (0.55).
The other thing to note here is that while mykrobe's sensitivity is much better than drprg and tbprofiler, its specificity is terrible.
1 FN is pncA R154G which is called by mykrobe and tb-profiler. drprg calls the correct allele, but it is filtered out for low FRS (0.54)
2 FNs are indels called by mykrobe only. Both indels were null genotype (drprg calls this F for failed) as the depth was split evenly across two alleles.
4 FPs were called confidently by all three tools (fabG1 C-15T)
3 FPs are deletions called by drprg but not by either of the other tools. They had below 10x depth in drprg so are probably not super confident
I don't think there is much more I can do to improve drprg's nanopore performance here. FRS could possibly be lowered? but it would only save a small number of FN/FPs
I'll fix up the tb-profiler RIF sensitivity and then get stuck into the Illumina results
(This text file is my notepad while I was investigating these FNs and FPs)
As our specificity is better or the same compared to mykrobe and tbprofiler for all drugs on Illumina, I only investigated the FNs.
tl;dr there are three things we may want to try to improve sensitivity as I suspect once we scale this analysis up to thousands of samples some of these problems will get bigger
master
in the next few weeks? Or will I have to try and do it myself?
Aside from those overarching points, there were also some other FNs which were due to two variants right next to each other. For example, ERR2510154 has rpoB_S450F, which is actually caused by a 2bp MNP. One of these positions exists in the reference PRG drprg uses. The other variant gets discovered, but gets added in as a separate allele (bubble) in the PRG - I guess this is how make_prg update works though? Here it is
rpoB 1449 fa44b92a C T . ld;lgc VC=SNP;GRAPHTYPE=SIMPLE;VARID=rpoB_ACTGTCGGCG1344A,rpoB_GTC1347G,rpoB_TC1348T,rpoB_S450X,rpoB_TCG1348T,rpoB_S450*,rpoB_C1349CA,rpoB_C1349CAA,rpoB_C1349CAC,rpoB_C1349CAG,rpoB_C1349CAT,rpoB_C1349CC,rpoB_C1349CCA,rpoB_C1349CCC,rpoB_C1349CCG,rpoB_C1349CCT,rpoB_C1349CG,rpoB_C1349CGA,rpoB_C1349CGC,rpoB_C1349CGG,rpoB_C1349CGT,rpoB_C1349CT,rpoB_C1349CTA,rpoB_C1349CTC,rpoB_C1349CTG,rpoB_C1349CTT,rpoB_CG1349C,rpoB_CGG1349C;PREDICT=F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF .:0,0:0,0:0,0:0,0:0,0:0,1:1,1:-278,-278:0
rpoB 1450 9fbca785 G T . ld;lgc VC=SNP;GRAPHTYPE=SIMPLE;VARID=rpoB_ACTGTCGGCG1344A,rpoB_S450X,rpoB_TCG1348T,rpoB_S450*,rpoB_CG1349C,rpoB_CGG1349C,rpoB_G1350GA,rpoB_G1350GAA,rpoB_G1350GAC,rpoB_G1350GAG,rpoB_G1350GAT,rpoB_G1350GC,rpoB_G1350GCA,rpoB_G1350GCC,rpoB_G1350GCG,rpoB_G1350GCT,rpoB_G1350GG,rpoB_G1350GGA,rpoB_G1350GGC,rpoB_G1350GGG,rpoB_G1350GGT,rpoB_G1350GT,rpoB_G1350GTA,rpoB_G1350GTC,rpoB_G1350GTG,rpoB_G1350GTT,rpoB_GG1350G,rpoB_GGC1350G;PREDICT=F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF .:0,0:0,0:0,0:0,0:0,0:0,0:1,1:-278,-278:0
The allele of this variant is TCG>TTT so pandora should be able to thread reads through both of these variants, but doesn't seem to be able to....??
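A tiny worked example of why the two SNPs have to be threaded together. The codon offsets are inferred from the VARIDs above (codon 450 is TCG, with the C>T at the middle base and the G>T at the third base), which is an assumption of this sketch; the lookup table contains only the four codons needed here.

```python
# Only the codons needed for this example (standard genetic code)
CODONS = {"TCG": "S", "TTG": "L", "TCT": "S", "TTT": "F"}


def apply_snps(codon, snps):
    """Apply (offset, alt_base) substitutions to a codon."""
    bases = list(codon)
    for offset, alt in snps:
        bases[offset] = alt
    return "".join(bases)


ref = "TCG"
middle = (1, "T")  # C>T at the second codon position
third = (2, "T")   # G>T at the third codon position

only_middle = apply_snps(ref, [middle])
only_third = apply_snps(ref, [third])
both = apply_snps(ref, [middle, third])

for label, codon in [("middle only", only_middle),
                     ("third only", only_third),
                     ("both", both)]:
    print(f"{label}: {ref}>{codon} = {CODONS[ref]}450{CODONS[codon]}")
```

Taken one at a time the SNPs give S450L and a synonymous change; only the combined allele gives S450F, so genotyping the two bubbles independently cannot recover the right amino acid change.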
A similar thing happened for a few other FNs. I'm wondering if I should run on the full dataset and then manually add to the reference PRG some of the common variants that cause this problem? I was thinking the "correct" way to do this @iqbal-lab would be to look through the cryptic metadata sheets and add some samples that contain those variants that cause some of these problems?
Hi there, I'm a bit wiped out so will be brief
In terms of adding more variants to the graph; we can do this, but racon might mean you don't need to
I've just realised, we should probably put some kind of minimum depth filter on these results too. i.e. samples with less than d depth are excluded from the sensitivity/specificity plots.
Does everyone agree? If so, does anyone have a preference for what d should be? I arbitrarily thought of 15x? (This is separate from the depth analysis in mbhall88/drprg-paper#3)
Here is the depth distribution for the 400 Illumina test set
and the full nanopore
Additionally, it might be wise to have a contamination proportion filter? For instance, when I align the reads to the decontamination database, I calculate the fraction of reads that we keep (i.e. MTB), fraction of reads that align to a contaminant, and a fraction of reads unmapped. Again, arbitrarily was thinking exclude samples with more than 5% contamination? Too harsh?
This is the fraction of contamination for Illumina
and nanopore
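The two proposed filters could be sketched as below. The thresholds are the "arbitrary" 15x and 5% floated above; the sample names and the `SampleQC` fields are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    name: str
    depth: float          # depth of coverage after decontamination
    contamination: float  # fraction of reads mapping to a contaminant


def passes_qc(sample, min_depth=15.0, max_contamination=0.05):
    """Keep samples with enough depth and little enough contamination."""
    return sample.depth >= min_depth and sample.contamination <= max_contamination


# Hypothetical samples, for illustration only
samples = [
    SampleQC("sample_a", depth=62.0, contamination=0.01),
    SampleQC("sample_b", depth=9.5, contamination=0.00),
    SampleQC("sample_c", depth=40.0, contamination=0.12),
]

kept = [s.name for s in samples if passes_qc(s)]
print(kept)
```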
Both seem perfectly reasonable to me
So I have a working drprg branch adapted to use pandora with the racon denovo method (https://github.com/rmcolq/pandora/pull/299).
I've tested it out on two Illumina runs listed in https://github.com/mbhall88/drprg-paper/issues/2.
The first, ERR2510154, was the example VCF above. So, with the old pandora denovo process, at the allele for rpoB_S450F we had
rpoB 1449 fa44b92a C T . ld;lgc VC=SNP;GRAPHTYPE=SIMPLE;VARID=rpoB_ACTGTCGGCG1344A,rpoB_GTC1347G,rpoB_TC1348T,rpoB_S450X,rpoB_TCG1348T,rpoB_S450*,rpoB_C1349CA,rpoB_C1349CAA,rpoB_C1349CAC,rpoB_C1349CAG,rpoB_C1349CAT,rpoB_C1349CC,rpoB_C1349CCA,rpoB_C1349CCC,rpoB_C1349CCG,rpoB_C1349CCT,rpoB_C1349CG,rpoB_C1349CGA,rpoB_C1349CGC,rpoB_C1349CGG,rpoB_C1349CGT,rpoB_C1349CT,rpoB_C1349CTA,rpoB_C1349CTC,rpoB_C1349CTG,rpoB_C1349CTT,rpoB_CG1349C,rpoB_CGG1349C;PREDICT=F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF .:0,0:0,0:0,0:0,0:0,0:0,1:1,1:-278,-278:0
rpoB 1450 9fbca785 G T . ld;lgc VC=SNP;GRAPHTYPE=SIMPLE;VARID=rpoB_ACTGTCGGCG1344A,rpoB_S450X,rpoB_TCG1348T,rpoB_S450*,rpoB_CG1349C,rpoB_CGG1349C,rpoB_G1350GA,rpoB_G1350GAA,rpoB_G1350GAC,rpoB_G1350GAG,rpoB_G1350GAT,rpoB_G1350GC,rpoB_G1350GCA,rpoB_G1350GCC,rpoB_G1350GCG,rpoB_G1350GCT,rpoB_G1350GG,rpoB_G1350GGA,rpoB_G1350GGC,rpoB_G1350GGG,rpoB_G1350GGT,rpoB_G1350GT,rpoB_G1350GTA,rpoB_G1350GTC,rpoB_G1350GTG,rpoB_G1350GTT,rpoB_GG1350G,rpoB_GGC1350G;PREDICT=F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF .:0,0:0,0:0,0:0,0:0,0:0,0:1,1:-278,-278:0
with the new denovo process, we get
rpoB 1449 ba1603d4 CG TG,TT . PASS VC=PH_SNPs;GRAPHTYPE=SIMPLE;VARID=rpoB_ACTGTCGGCG1344A,rpoB_GTC1347G,rpoB_TC1348T,rpoB_S450X,rpoB_TCG1348T,rpoB_S450*,rpoB_C1349CA,rpoB_C1349CAA,rpoB_C1349CAC,rpoB_C1349CAG,rpoB_C1349CAT,rpoB_C1349CC,rpoB_C1349CCA,rpoB_C1349CCC,rpoB_C1349CCG,rpoB_C1349CCT,rpoB_C1349CG,rpoB_C1349CGA,rpoB_C1349CGC,rpoB_C1349CGG,rpoB_C1349CGT,rpoB_C1349CT,rpoB_C1349CTA,rpoB_C1349CTC,rpoB_C1349CTG,rpoB_C1349CTT,rpoB_CG1349C,rpoB_CGG1349C,rpoB_G1350GA,rpoB_G1350GAA,rpoB_G1350GAC,rpoB_G1350GAG,rpoB_G1350GAT,rpoB_G1350GC,rpoB_G1350GCA,rpoB_G1350GCC,rpoB_G1350GCG,rpoB_G1350GCT,rpoB_G1350GG,rpoB_G1350GGA,rpoB_G1350GGC,rpoB_G1350GGG,rpoB_G1350GGT,rpoB_G1350GT,rpoB_G1350GTA,rpoB_G1350GTC,rpoB_G1350GTG,rpoB_G1350GTT,rpoB_GG1350G,rpoB_GGC1350G;PREDICT=S,S,S,R,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF 2:0,0,51:0,0,41:0,0,61:0,0,48:0,0,308:0,1,251:1,1,0.166667:-703.676,-703.676,-35.8875:667.788
I guess another reason this might have been fixed is that instead of using make_prg update to add the denovo sequences into the PRG, we recreate the MSAs and rebuild the PRGs for those genes with novel variants. So this particular case could be a weakness of make_prg update as it just updated with the novel variant - rpoB 1450 G>T - without combining it with the previous position into a single allele.
The second run, ERR4828599, had both a RIF FN and an INH FN. The RIF FN was an interesting case where the isolate has both L449M and S450F. drprg/pandora previously failed to find a novel variant. With the new pandora denovo process, we found (and called) both of these variants. The INH FN was katG S315N, which is a rarer mutation at that locus - normally S315T. Previously both mykrobe and drprg had no depth in this area and drprg did not find a novel variant. With the new pandora, we do find and call this mutation.
I'm going to run a few more of the Illumina FNs, but this is very promising!
Okay, since the last update of results we have switched to using racon for denovo discovery and dropped the old nanopore data. I have also increased the number of Illumina samples to 8,587
Illumina
Note I am going to change the markers so you can see the error bars now that they are so small
Drug | Tool | FN(R) | FP(S) | Sensitivity (95% CI) | Specificity (95% CI) | MCC |
---|---|---|---|---|---|---|
Amikacin | drprg | 77(485) | 50(6958) | 84.1% (80.6-87.1%) | 99.3% (99.1-99.5%) | 0.857 |
Amikacin | mykrobe | 101(485) | 46(6958) | 79.2% (75.3-82.6%) | 99.3% (99.1-99.5%) | 0.831 |
Amikacin | tbprofiler | 62(485) | 59(6958) | 87.2% (83.9-89.9%) | 99.2% (98.9-99.3%) | 0.866 |
Capreomycin | drprg | 62(235) | 92(2449) | 73.6% (67.6-78.8%) | 96.2% (95.4-96.9%) | 0.662 |
Capreomycin | mykrobe | 78(235) | 85(2449) | 66.8% (60.6-72.5%) | 96.5% (95.7-97.2%) | 0.625 |
Capreomycin | tbprofiler | 54(235) | 96(2449) | 77.0% (71.2-81.9%) | 96.1% (95.2-96.8%) | 0.679 |
Delamanid | drprg | 111(116) | 1(8152) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.188 |
Delamanid | mykrobe | 111(116) | 1(8152) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.188 |
Delamanid | tbprofiler | 111(116) | 2(8152) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.173 |
Ethambutol | drprg | 146(1538) | 736(4936) | 90.5% (88.9-91.9%) | 85.1% (84.1-86.1%) | 0.685 |
Ethambutol | mykrobe | 149(1538) | 728(4936) | 90.3% (88.7-91.7%) | 85.3% (84.2-86.2%) | 0.686 |
Ethambutol | tbprofiler | 118(1538) | 765(4936) | 92.3% (90.9-93.6%) | 84.5% (83.5-85.5%) | 0.691 |
Ethionamide | drprg | 341(1104) | 372(6105) | 69.1% (66.3-71.8%) | 93.9% (93.3-94.5%) | 0.623 |
Ethionamide | mykrobe | 276(1104) | 395(6105) | 75.0% (72.4-77.5%) | 93.5% (92.9-94.1%) | 0.658 |
Ethionamide | tbprofiler | 272(1104) | 414(6105) | 75.4% (72.7-77.8%) | 93.2% (92.6-93.8%) | 0.653 |
Isoniazid | drprg | 362(3900) | 164(4194) | 90.7% (89.8-91.6%) | 96.1% (95.5-96.6%) | 0.871 |
Isoniazid | mykrobe | 366(3900) | 163(4194) | 90.6% (89.7-91.5%) | 96.1% (95.5-96.7%) | 0.87 |
Isoniazid | tbprofiler | 297(3900) | 181(4194) | 92.4% (91.5-93.2%) | 95.7% (95.0-96.3%) | 0.882 |
Kanamycin | drprg | 142(670) | 101(6975) | 78.8% (75.6-81.7%) | 98.6% (98.2-98.8%) | 0.796 |
Kanamycin | mykrobe | 166(670) | 96(6975) | 75.2% (71.8-78.3%) | 98.6% (98.3-98.9%) | 0.776 |
Kanamycin | tbprofiler | 122(670) | 107(6975) | 81.8% (78.7-84.5%) | 98.5% (98.1-98.7%) | 0.811 |
Levofloxacin | drprg | 105(1040) | 97(5454) | 89.9% (87.9-91.6%) | 98.2% (97.8-98.5%) | 0.884 |
Levofloxacin | mykrobe | 108(1040) | 97(5454) | 89.6% (87.6-91.3%) | 98.2% (97.8-98.5%) | 0.882 |
Levofloxacin | tbprofiler | 85(1040) | 109(5454) | 91.8% (90.0-93.3%) | 98.0% (97.6-98.3%) | 0.89 |
Linezolid | drprg | 49(65) | 4(6110) | 24.6% (15.8-36.3%) | 99.9% (99.8-100.0%) | 0.441 |
Linezolid | mykrobe | 49(65) | 4(6110) | 24.6% (15.8-36.3%) | 99.9% (99.8-100.0%) | 0.441 |
Linezolid | tbprofiler | 48(65) | 5(6110) | 26.2% (17.0-38.0%) | 99.9% (99.8-100.0%) | 0.447 |
Moxifloxacin | drprg | 60(603) | 464(5431) | 90.0% (87.4-92.2%) | 91.5% (90.7-92.2%) | 0.656 |
Moxifloxacin | mykrobe | 59(603) | 460(5431) | 90.2% (87.6-92.3%) | 91.5% (90.8-92.2%) | 0.658 |
Moxifloxacin | tbprofiler | 42(603) | 482(5431) | 93.0% (90.7-94.8%) | 91.1% (90.3-91.9%) | 0.668 |
Ofloxacin | drprg | 31(105) | 4(424) | 70.5% (61.2-78.4%) | 99.1% (97.6-99.6%) | 0.782 |
Ofloxacin | mykrobe | 32(105) | 4(424) | 69.5% (60.2-77.5%) | 99.1% (97.6-99.6%) | 0.776 |
Ofloxacin | tbprofiler | 26(105) | 6(424) | 75.2% (66.2-82.5%) | 98.6% (96.9-99.3%) | 0.802 |
Pyrazinamide | drprg | 75(341) | 47(822) | 78.0% (73.3-82.1%) | 94.3% (92.5-95.7%) | 0.742 |
Pyrazinamide | mykrobe | 73(341) | 45(822) | 78.6% (73.9-82.6%) | 94.5% (92.8-95.9%) | 0.751 |
Pyrazinamide | tbprofiler | 45(341) | 62(822) | 86.8% (82.8-90.0%) | 92.5% (90.4-94.1%) | 0.782 |
Rifampicin | drprg | 142(3222) | 166(4586) | 95.6% (94.8-96.2%) | 96.4% (95.8-96.9%) | 0.919 |
Rifampicin | mykrobe | 187(3222) | 165(4586) | 94.2% (93.3-95.0%) | 96.4% (95.8-96.9%) | 0.907 |
Rifampicin | tbprofiler | 102(3222) | 177(4586) | 96.8% (96.2-97.4%) | 96.1% (95.5-96.7%) | 0.927 |
Streptomycin | drprg | 278(1042) | 130(1205) | 73.3% (70.6-75.9%) | 89.2% (87.3-90.8%) | 0.637 |
Streptomycin | mykrobe | 295(1042) | 132(1205) | 71.7% (68.9-74.3%) | 89.0% (87.2-90.7%) | 0.621 |
Streptomycin | tbprofiler | 257(1042) | 136(1205) | 75.3% (72.6-77.9%) | 88.7% (86.8-90.4%) | 0.649 |

Nanopore

Drug | Tool | FN(R) | FP(S) | Sensitivity (95% CI) | Specificity (95% CI) | MCC |
---|---|---|---|---|---|---|
Amikacin | drprg | 0(11) | 3(78) | 100.0% (74.1-100.0%) | 96.2% (89.3-98.7%) | 0.869 |
Amikacin | mykrobe | 0(11) | 3(78) | 100.0% (74.1-100.0%) | 96.2% (89.3-98.7%) | 0.869 |
Amikacin | tbprofiler | 0(11) | 3(78) | 100.0% (74.1-100.0%) | 96.2% (89.3-98.7%) | 0.869 |
Capreomycin | drprg | 1(1) | 1(51) | 0.0% (0.0-79.3%) | 98.0% (89.7-99.7%) | -0.02 |
Capreomycin | mykrobe | 1(1) | 1(51) | 0.0% (0.0-79.3%) | 98.0% (89.7-99.7%) | -0.02 |
Capreomycin | tbprofiler | 1(1) | 1(51) | 0.0% (0.0-79.3%) | 98.0% (89.7-99.7%) | -0.02 |
Ethambutol | drprg | 4(14) | 15(77) | 71.4% (45.4-88.3%) | 80.5% (70.3-87.8%) | 0.42 |
Ethambutol | mykrobe | 4(14) | 15(77) | 71.4% (45.4-88.3%) | 80.5% (70.3-87.8%) | 0.42 |
Ethambutol | tbprofiler | 5(14) | 15(77) | 64.3% (38.8-83.7%) | 80.5% (70.3-87.8%) | 0.367 |
Ethionamide | drprg | 0(4) | 1(9) | 100.0% (51.0-100.0%) | 88.9% (56.5-98.0%) | 0.843 |
Ethionamide | mykrobe | 0(4) | 1(9) | 100.0% (51.0-100.0%) | 88.9% (56.5-98.0%) | 0.843 |
Ethionamide | tbprofiler | 0(4) | 1(9) | 100.0% (51.0-100.0%) | 88.9% (56.5-98.0%) | 0.843 |
Isoniazid | drprg | 9(51) | 4(48) | 82.4% (69.7-90.4%) | 91.7% (80.4-96.7%) | 0.742 |
Isoniazid | mykrobe | 9(51) | 4(48) | 82.4% (69.7-90.4%) | 91.7% (80.4-96.7%) | 0.742 |
Isoniazid | tbprofiler | 9(51) | 3(48) | 82.4% (69.7-90.4%) | 93.8% (83.2-97.9%) | 0.764 |
Kanamycin | drprg | 0(0) | 1(52) | - | 98.1% (89.9-99.7%) | - |
Kanamycin | mykrobe | 0(0) | 1(52) | - | 98.1% (89.9-99.7%) | - |
Kanamycin | tbprofiler | 0(0) | 1(52) | - | 98.1% (89.9-99.7%) | - |
Moxifloxacin | drprg | 0(0) | 1(1) | - | 0.0% (0.0-79.3%) | - |
Moxifloxacin | mykrobe | 0(0) | 1(1) | - | 0.0% (0.0-79.3%) | - |
Moxifloxacin | tbprofiler | 0(0) | 1(1) | - | 0.0% (0.0-79.3%) | - |
Ofloxacin | drprg | 0(10) | 4(77) | 100.0% (72.2-100.0%) | 94.8% (87.4-98.0%) | 0.823 |
Ofloxacin | mykrobe | 0(10) | 4(77) | 100.0% (72.2-100.0%) | 94.8% (87.4-98.0%) | 0.823 |
Ofloxacin | tbprofiler | 0(10) | 3(77) | 100.0% (72.2-100.0%) | 96.1% (89.2-98.7%) | 0.86 |
Pyrazinamide | drprg | 0(0) | 0(1) | - | 100.0% (20.7-100.0%) | - |
Pyrazinamide | mykrobe | 0(0) | 0(1) | - | 100.0% (20.7-100.0%) | - |
Pyrazinamide | tbprofiler | 0(0) | 0(1) | - | 100.0% (20.7-100.0%) | - |
Rifampicin | drprg | 5(48) | 1(44) | 89.6% (77.8-95.5%) | 97.7% (88.2-99.6%) | 0.873 |
Rifampicin | mykrobe | 5(48) | 1(44) | 89.6% (77.8-95.5%) | 97.7% (88.2-99.6%) | 0.873 |
Rifampicin | tbprofiler | 5(48) | 1(44) | 89.6% (77.8-95.5%) | 97.7% (88.2-99.6%) | 0.873 |
Streptomycin | drprg | 2(8) | 14(83) | 75.0% (40.9-92.9%) | 83.1% (73.7-89.7%) | 0.398 |
Streptomycin | mykrobe | 2(8) | 27(83) | 75.0% (40.9-92.9%) | 67.5% (56.8-76.6%) | 0.25 |
Streptomycin | tbprofiler | 2(8) | 12(83) | 75.0% (40.9-92.9%) | 85.5% (76.4-91.5%) | 0.43 |
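For reference, the table columns can be recomputed from the FN(R) and FP(S) counts. The sketch below uses a Wilson score interval for the 95% CI; that choice is an inference from the numbers (it reproduces the Illumina Amikacin drprg row), not something stated in the thread. Function names are made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    if n == 0:
        return float("nan"), float("nan")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; NaN when any marginal is empty."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else float("nan")


# Illumina, Amikacin, drprg: FN(R) = 77(485), FP(S) = 50(6958)
fn, n_resistant, fp, n_susceptible = 77, 485, 50, 6958
tp, tn = n_resistant - fn, n_susceptible - fp

sensitivity = tp / n_resistant
low, high = wilson_interval(tp, n_resistant)
score = mcc(tp, tn, fp, fn)
print(f"{sensitivity:.1%} ({low:.1%}-{high:.1%}), MCC {score:.3f}")
```

The same interval also explains the wide nanopore intervals: with a single resistant sample, the upper bound for 0 out of 1 is still about 79%.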
Looks pretty good I think
Yeah!
I disagree sadly haha. TBProfiler is beating us on a lot of drugs. Going to dig into why that is now
Looking again (now on laptop), if i was to summarise those results:
On illumina, tb-profiler often has the highest sensitivity. It does pay a very small price in specificity, but it's much less noticeable than the sensitivity increase. So i agree, good to look into that
On nanopore: sensitivity of all tools is essentially identical (except tb-profiler has a problem on EMB). Specificity is also essentially identical, although for two drugs (streptomycin and ofloxacin) tb-prof has a slightly increased specificity. I'm quite impressed/surprised how well all 3 do on the 4 drugs where any frameshift in a gene causes a resistant call. It matches what you found for Mykrobe in your Lancet Microbe paper @mbhall88 , but am delighted it's also true for DrPrg; also a bit surprised that tb-profiler does that well too given it uses bcftools. We didn't find we could call indels with this level of specificity. (I guess, just refusing to make indel calls with nanopore would give v high specificity?)
I've been looking through the variants where drprg is FN but either of the other tools is TP (on Illumina) to see what variants we have missed. (I'm not finished yet) but a lot of the tbprofiler TPs where we are FN are to do with minor alleles. By default, TBProfiler will call anything with a fraction of 0.1 or more. This brings up point 3 from https://github.com/mbhall88/drprg-paper/issues/2 again. We tell mykrobe to run in haploid mode and drprg only runs in haploid mode. The options forward I see are:
Option 2 is obviously the easiest and most likely to make us look better, but it sits somewhat uncomfortably with me as we are kind of skewing the results in our favour right?
I will keep working through these results next week for other drugs as there are also a few cases of weird indels which I will document when I have a better understanding of what's going on.
I think detecting minors could easily be done directly in drprg, no need to implement in Pandora. You get coverage info on the S and R alleles right? Just ask if the coverage on any R allele is >0.1 of the total
> 3. implement a diploid model in pandora (not sure how much work this would be? will alert Leandro to get his input too)
IDK either how much work this would be, because the only experience I have with genotyping models is in pandora, which has a haploid model. If implementing a diploid model is simply calling the two most likely alleles, then maybe a simple implementation of getting the most likely allele (what is currently implemented) and the second most likely allele (remove/ignore the most likely and rerun the genotyping algorithm) is not hard. This can be easily generalised to n-ploid... but I don't think it is as simple as this...
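To illustrate the naive "two most likely alleles" idea, here is a sketch that ranks alleles by the per-allele log-likelihoods already present in the VCF. It is not a diploid model and, as noted above, the real thing is probably not this simple. The observation that GT_CONF matches the gap between the best and second-best log-likelihood is read off the records in this thread, not taken from pandora's documentation.

```python
def top_two_alleles(log_likelihoods):
    """Return (best, second_best, gap) from per-allele log-likelihoods."""
    if len(log_likelihoods) < 2:
        raise ValueError("need at least two alleles")
    order = sorted(range(len(log_likelihoods)),
                   key=lambda i: log_likelihoods[i], reverse=True)
    best, second = order[0], order[1]
    return best, second, log_likelihoods[best] - log_likelihoods[second]


# LIKELIHOOD field of the first fabG1 record in this thread (GT=2, GT_CONF=132.448)
likelihoods = [-441.21, -406.243, -273.794, -577.451]
best, second, gap = top_two_alleles(likelihoods)
print(best, second, round(gap, 3))
```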
> I think detecting minors could easily be done directly in drprg, no need to implement in Pandora. You get coverage info on the S and R alleles right? Just ask if the coverage on any R allele is >0.1 of the total
True. I'll have to do some reimplementing though as I currently only pay attention to the called alleles. But it shouldn't take too long to get this working 🤞
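A minimal sketch of the coverage check suggested above, assuming we have one coverage value and one S/R prediction per allele at a site. The function and argument names are hypothetical and this is not how drprg actually implements it.

```python
def minor_resistance_alleles(coverages, predictions, min_fraction=0.1):
    """Return indices of resistant alleles carrying more than `min_fraction`
    of the total coverage at a site.

    coverages:   per-allele read coverage (index 0 is the reference allele)
    predictions: per-allele prediction, "S" or "R" (assumed encoding)
    """
    total = sum(coverages)
    if total == 0:
        # No reads at this site, so there is nothing to call.
        return []
    return [
        i
        for i, (cov, pred) in enumerate(zip(coverages, predictions))
        if pred == "R" and cov / total > min_fraction
    ]


# Reference has most of the reads, allele 1 is a resistant minor allele,
# allele 2 is resistant but only at noise level.
minor_calls = minor_resistance_alleles([90, 15, 2], ["S", "R", "R"])
```

The 0.1 default mirrors the fraction quoted for TBProfiler earlier in the thread; in practice the threshold would probably need tuning per technology.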
Hurrah! I would reread the section on minor alleles here: https://wellcomeopenresearch.org/articles/4-191. I just reread it and it was informative; it reminded me of the differences between drugs.
See https://github.com/mbhall88/drprg/issues/19#issuecomment-1345391531 for the latest results after adding minor allele calling
After updating pandora and make_prg, as well as implementing gene deletion detection, we have the following Illumina results
Drug | Tool | FN(R) | FP(S) | Sensitivity (95% CI) | Specificity (95% CI) | MCC |
---|---|---|---|---|---|---|
Amikacin | drprg | 68(485) | 57(6958) | 86.0% (82.6-88.8%) | 99.2% (98.9-99.4%) | 0.861 |
Amikacin | mykrobe | 93(485) | 51(6958) | 80.8% (77.1-84.1%) | 99.3% (99.0-99.4%) | 0.836 |
Amikacin | tbprofiler | 62(485) | 59(6958) | 87.2% (83.9-89.9%) | 99.2% (98.9-99.3%) | 0.866 |
Capreomycin | drprg | 57(235) | 95(2449) | 75.7% (69.9-80.8%) | 96.1% (95.3-96.8%) | 0.672 |
Capreomycin | mykrobe | 72(235) | 88(2449) | 69.4% (63.2-74.9%) | 96.4% (95.6-97.1%) | 0.638 |
Capreomycin | tbprofiler | 54(235) | 96(2449) | 77.0% (71.2-81.9%) | 96.1% (95.2-96.8%) | 0.679 |
Delamanid | drprg | 111(116) | 1(8152) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.188 |
Delamanid | mykrobe | 111(116) | 2(8152) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.173 |
Delamanid | tbprofiler | 111(116) | 2(8152) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.173 |
Ethambutol | drprg | 122(1538) | 750(4936) | 92.1% (90.6-93.3%) | 84.8% (83.8-85.8%) | 0.693 |
Ethambutol | mykrobe | 133(1538) | 747(4936) | 91.4% (89.8-92.7%) | 84.9% (83.8-85.8%) | 0.689 |
Ethambutol | tbprofiler | 118(1538) | 765(4936) | 92.3% (90.9-93.6%) | 84.5% (83.5-85.5%) | 0.691 |
Ethionamide | drprg | 325(1104) | 395(6105) | 70.6% (67.8-73.2%) | 93.5% (92.9-94.1%) | 0.625 |
Ethionamide | mykrobe | 265(1104) | 413(6105) | 76.0% (73.4-78.4%) | 93.2% (92.6-93.8%) | 0.658 |
Ethionamide | tbprofiler | 272(1104) | 414(6105) | 75.4% (72.7-77.8%) | 93.2% (92.6-93.8%) | 0.653 |
Isoniazid | drprg | 307(3900) | 173(4194) | 92.1% (91.2-92.9%) | 95.9% (95.2-96.4%) | 0.882 |
Isoniazid | mykrobe | 333(3900) | 170(4194) | 91.5% (90.5-92.3%) | 95.9% (95.3-96.5%) | 0.876 |
Isoniazid | tbprofiler | 297(3900) | 181(4194) | 92.4% (91.5-93.2%) | 95.7% (95.0-96.3%) | 0.882 |
Kanamycin | drprg | 128(670) | 107(6975) | 80.9% (77.7-83.7%) | 98.5% (98.1-98.7%) | 0.805 |
Kanamycin | mykrobe | 152(670) | 98(6975) | 77.3% (74.0-80.3%) | 98.6% (98.3-98.8%) | 0.789 |
Kanamycin | tbprofiler | 122(670) | 107(6975) | 81.8% (78.7-84.5%) | 98.5% (98.1-98.7%) | 0.811 |
Levofloxacin | drprg | 81(1040) | 103(5454) | 92.2% (90.4-93.7%) | 98.1% (97.7-98.4%) | 0.896 |
Levofloxacin | mykrobe | 88(1040) | 102(5454) | 91.5% (89.7-93.1%) | 98.1% (97.7-98.5%) | 0.892 |
Levofloxacin | tbprofiler | 85(1040) | 109(5454) | 91.8% (90.0-93.3%) | 98.0% (97.6-98.3%) | 0.89 |
Linezolid | drprg | 49(65) | 4(6110) | 24.6% (15.8-36.3%) | 99.9% (99.8-100.0%) | 0.441 |
Linezolid | mykrobe | 48(65) | 4(6110) | 26.2% (17.0-38.0%) | 99.9% (99.8-100.0%) | 0.457 |
Linezolid | tbprofiler | 48(65) | 5(6110) | 26.2% (17.0-38.0%) | 99.9% (99.8-100.0%) | 0.447 |
Moxifloxacin | drprg | 41(603) | 480(5431) | 93.2% (90.9-94.9%) | 91.2% (90.4-91.9%) | 0.669 |
Moxifloxacin | mykrobe | 44(603) | 473(5431) | 92.7% (90.3-94.5%) | 91.3% (90.5-92.0%) | 0.669 |
Moxifloxacin | tbprofiler | 42(603) | 482(5431) | 93.0% (90.7-94.8%) | 91.1% (90.3-91.9%) | 0.668 |
Ofloxacin | drprg | 24(105) | 5(424) | 77.1% (68.2-84.1%) | 98.8% (97.3-99.5%) | 0.821 |
Ofloxacin | mykrobe | 26(105) | 5(424) | 75.2% (66.2-82.5%) | 98.8% (97.3-99.5%) | 0.808 |
Ofloxacin | tbprofiler | 26(105) | 6(424) | 75.2% (66.2-82.5%) | 98.6% (96.9-99.3%) | 0.802 |
Pyrazinamide | drprg | 70(341) | 54(822) | 79.5% (74.9-83.4%) | 93.4% (91.5-94.9%) | 0.74 |
Pyrazinamide | mykrobe | 55(341) | 56(822) | 83.9% (79.6-87.4%) | 93.2% (91.3-94.7%) | 0.77 |
Pyrazinamide | tbprofiler | 45(341) | 62(822) | 86.8% (82.8-90.0%) | 92.5% (90.4-94.1%) | 0.782 |
Rifampicin | drprg | 138(3222) | 167(4586) | 95.7% (95.0-96.4%) | 96.4% (95.8-96.9%) | 0.92 |
Rifampicin | mykrobe | 164(3222) | 169(4586) | 94.9% (94.1-95.6%) | 96.3% (95.7-96.8%) | 0.912 |
Rifampicin | tbprofiler | 102(3222) | 177(4586) | 96.8% (96.2-97.4%) | 96.1% (95.5-96.7%) | 0.927 |
Streptomycin | drprg | 265(1042) | 133(1205) | 74.6% (71.8-77.1%) | 89.0% (87.1-90.6%) | 0.645 |
Streptomycin | mykrobe | 282(1042) | 135(1205) | 72.9% (70.2-75.5%) | 88.8% (86.9-90.5%) | 0.629 |
Streptomycin | tbprofiler | 257(1042) | 136(1205) | 75.3% (72.6-77.9%) | 88.7% (86.8-90.4%) | 0.649 |
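For anyone wanting to check the columns, here is a minimal sketch of how they can be derived from the FN(R) and FP(S) counts. The confidence intervals in the table look consistent with a Wilson score interval, but that is an assumption on my part, as are the function names.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    if n == 0:
        return (float("nan"), float("nan"))
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return (centre - half, centre + half)


def summarise(fn, n_resistant, fp, n_susceptible):
    """Sensitivity, specificity, their intervals and MCC from table counts."""
    tp = n_resistant - fn
    tn = n_susceptible - fp
    sens = tp / n_resistant if n_resistant else float("nan")
    spec = tn / n_susceptible if n_susceptible else float("nan")
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else float("nan")
    return {
        "sensitivity": sens,
        "sensitivity_ci": wilson_interval(tp, n_resistant),
        "specificity": spec,
        "specificity_ci": wilson_interval(tn, n_susceptible),
        "mcc": mcc,
    }


# Amikacin / drprg row: FN(R) = 68(485), FP(S) = 57(6958)
amikacin_drprg = summarise(68, 485, 57, 6958)
```

Rounded to the table's precision, this reproduces the amikacin/drprg row (86.0% sensitivity, 99.2% specificity, MCC 0.861).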
I do find the tbprofiler sensitivity suspicious, and suspect
Any drugs in particular?
Isoniazid and rifampicin really
I've been through all of the drprg PZA FNs that are called by at least one other tool.
There are two overarching problems drprg has
pncA 1 . GTCATGTTCGCGATCGTCGCGGCGTCATGGACCCTATATCTGTGGCTGCCGCGTCGGTAGGCAAACTGCCCGGGCAGTCGCCCGAACGTATGGTGGACGTATGCGGGCGTTGATCATCGTCGACGTGCAGAACGACTTCTGCGAGGGTGGCTCGCTGGCGGTAACCGGTGGCGCCGCGCTGGCCCGCGCCATCAGCGACTACCTGGCCGAAGCGGCGGACTACCATCACGTCGTGGCAACCAAGGACTTCCACATCGACCCGGGTGACCACTTCTCCGGCACACCGGACTATTCCT G,GTCATGTTCGCGATCGTCGCGGCGTCATGGACCCTATATCTGTGGCTGCCGCGTCGGTAGGCAAACTGCCCGGGCAGTCGCCCGAACGTATGGTGGACGTATGCGGGCGTTGATCATCGTCGACGTGCAGAACGACTGACTTCTGCGAGGGTGGCTCGCTGGCGGTAACCGGTGGCGCCGCGCTGGCCCGCGCCATCAGCGACTACCTGGCCGAAGCGGCGGACTACCATCACGTCGTGGCAACCAAGGACTTCCACATCGACCCGGGTGACCACTTCTCCGGCACACCGGACTATTCCT,GTCATGTTCGCGATCGTCGCGGCGTCATGGACCCTATATCTGTGGCTGCCGCGTCGGTAGGCAAACTGCCCGGGCAGTCGCCCGAACGTATGGTGGACGTATGCGGGCGTTGATCATCGTCGACGTGCAGAACGACTTCTGCGAGGGTGGCTCGCGGGCGGTAACCGGTGGCGCCGCGCTGGCCCGCGCCATCAGCGACTACCTGGCCGAAGCGGCGGACTACCATCACGTCGTGGCAACCAAGGACTTCCACATCGACCCGGGTGACCACTTCTCCGGCACACCGGACTATATCTT,GTCATGTTCGCGATCGTCGCGGCGTCATGGACCCTATATCTGTGGCTGCCGCGTCGGTAGGCAAACTGCCCGGGCAGTCGCCCGAACGTATGGTGGACGTATGCGGGCGTTGATCATCGTCGACGTGCAGAACGACTTCTGCGAGGGTGGCTCGCTGGCGGTAACCGGTGGCGCCGCGCCGGCCCGCGCCATCAGCGACTACCTGGCCGAAGCGGCGGACTACCATCACGTCGTGGCAACCAAGGACTTCCACATCGACCCGGGTGACCACTTCTCCGGCACACCGGACTATTCCT,GTCATGTTCGCGATCGTCGCGGCGTCATGGACCCTATATCTGTGGCTGCCGCGTCGGTAGGCAAACTGCCCGGGCAGTCGCCCGAACGTATGGTGGACGTATGCGGGCGTTGATCATCGTCGACGTGCAGAACGACTTCTGCGAGGGTGGCTCGCTGGCGGTAACCGGTGGCGCCGCGCTGGCCCGCGCCATCAGCGACTACCTGGCCGAAGCGGCGGACTACCATCACGTCGTGGCAACCAAGGACTTCCACATCGACCCGGGTGACCACCTCTCCGGCACACCGGACTATTCCT,GTCATGTTCGCGATCGTCGCGGCGTCATGGACCCTATATCTGTGGCTGCCGCGTCGGTAGGCAAACTGCCCGGGCAGTCGCCCGAACGTATGGTGGACGTATGCGGGCGTTGATCATCGTCGACGTGCAGAACGACTTCTGCGAGGGTGGCTCGCTGGCGGTAACCGGTGGCGCCGCGCTGGCCCGCGCCATCAGCGACTACCTGGCCGAAGCGGCGGACTACCATCACGTCGTGGCAACCAAGGACTTCCACATCGACCCGGGTGACCACTTCTCCGGCACACCGGACTATATCTT,GTCATGTTCGCGATCGTCGCGGCGTCATGGACCCTATATCTGTGGCTGCCGCGTCGGTAGGCAAACTGCCCGGGCAGTCGCCCGAACGTATGGTGGACGTATGCGGGCGTTGATCATCGTCGACGTGCAGAACGACTTCTGCGAGGGTGGCTCGCTGGCGGTAACCGGTGGCGCCGCGCTGGCCCGCGCCATCAGCGACTA
CCTGGCCGAAGCGGCGGACTACCATCACGTCGTGGCAACCAAGGACTTCCACATCGACCCGGGTGACGACTTCTCCGGCACACCGGACTATTCCT,GTCATGTTCGCGATCGTCGCGGCGTCATGGACCCTATATCTGTGGCTGCCGCGTCGGTAGGCAAACTGCCCGGGCAGTCGCCCGAACGTATGGTGGACGTATGCGGGCGTTGATCATCGTCGACGTGCAGAACGACTTCTGCGAGGGTGGCTCGCTGGCGGTAACCGGTGGCGCCGCGCTGGCCCGCGCCATCAGCGACTACCTGGCCGAAGCGGCGGACTACCATCACGTCGTGGCAACCAAGGACTTCCACATCGACCCGGGTGACTACTTCTCCGGCACACCGGACTATTCCT,GTCATGTTCGCGATCGTCGCGGCGTCATGGACCCTATATCTGTGGCTGCCGCGTCGGTAGGCAAACTGCCCGGGCAGTCGCCCGAACGTATGGTGGACGTATGCGGGCGTTGATCATCGTCGACGTGCCGAACGACTTCTGCGAGGGTGGCTCGCTGGCGGTAACCGGTGGCGCCGCGCTGGCCCGCGCCATCAGCGACTACCTGGCCGAAGCGGCGGACTACCATCACGTCGTGGCAACCAAGGACTTCCACATCGACCCGGGTGACCACTTCTCCGGCACACCGGACTATTCCT . . VC=INDEL;GRAPHTYPE=SIMPLE GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF .:0,0,0,0,0,0,0,0,0,0:0,0,0,0,0,0,0,0,0,0:0,0,0,0,0,0,0,0,0,0:0,0,0,0,0,0,0,0,0,0:0,0,0,0,0,0,0,0,0,0:0,0,0,0,0,0,0,0,0,0:1,1,1,1,1,1,1,1,1,1:-488,-488,-488,-488,-488,-488,-488,-488,-488,-488:0
One way around this could be to notice when we have more than n consecutive VCF entries with a failed/null call and just call resistant? Or, to be more precise, notice when we have a failed position(s) that spans the start codon and then call resistant if it is one of the genes where gene deletion causes resistance.
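A minimal sketch of both heuristics, assuming each VCF record has been reduced to a `(pos, ref_len, genotype)` tuple where `genotype` is `None` for a failed/null call. The names, the tuple layout, and the default run length are hypothetical, not drprg's implementation.

```python
def longest_null_run(genotypes):
    """Length of the longest run of consecutive null (None) calls."""
    longest = current = 0
    for gt in genotypes:
        current = current + 1 if gt is None else 0
        longest = max(longest, current)
    return longest


def many_consecutive_nulls(genotypes, n=3):
    """First heuristic: more than `n` consecutive records with a null call."""
    return longest_null_run(genotypes) > n


def null_call_overlaps_start_codon(records, start_codon_pos):
    """Second heuristic: a null call whose reference allele overlaps the
    three bases of the start codon.

    records:         iterable of (pos, ref_len, genotype), 1-based positions
    start_codon_pos: 1-based position of the first base of the start codon
    """
    codon_end = start_codon_pos + 2
    for pos, ref_len, genotype in records:
        record_end = pos + ref_len - 1
        if genotype is None and pos <= codon_end and record_end >= start_codon_pos:
            return True
    return False


# A single long null record starting at position 1, like the pncA example above.
deletion_like = null_call_overlaps_start_codon([(1, 300, None)], start_codon_pos=101)
```

Either check would only trigger a resistant call for genes where a gene deletion is known to confer resistance.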
These are all minor alleles where we don't call anything because we don't have anything in the graph to allow us to notice the minor alleles
Here are the mutations and the number of samples with FNs for those mutations
L430P: 2
Q432P: 2
D435V: 2
D435N: 1
H445Y: 1
H445N: 1
H445D: 3
S450L: 23
L452P: 5
The only way we can avoid this is adding some samples into the graph that contain all (most?) of these mutations.
Two solutions
This is v exciting and good news really, there is a lot of sensitivity gain to be had from the minor alleles and gene deletions
For point 1, no, that is not right. We don't have these variants in the graph, which is the problem. (Remember, our graph is not the panel, but the sparse population PRG built from randomly sampled cryptic samples.) And racon can't find them in these samples because they're only minor alleles. Racon will find the major allele, which is the reference.
Point 2 seems like it effectively does away with the need for pandora though; isn't it basically what tbprofiler does? It will also dramatically increase our runtime and memory usage, which at the moment is our biggest selling point really.
I need to think!
Follow up to 2. I'm not pushing for this solution, but just to say, we do this for covid (30 kb long) and use <500 MB RAM and 45 seconds for the whole process. I think performance is not a barrier. But there are other arguments not to do it.
After closing mbhall88/drprg#23 the current (Illumina) results are
Drug | Tool | FN(R) | FP(S) | Sensitivity (95% CI) | Specificity (95% CI) | MCC |
---|---|---|---|---|---|---|
Amikacin | drprg | 68(484) | 57(6958) | 86.0% (82.6-88.8%) | 99.2% (98.9-99.4%) | 0.86 |
Amikacin | mykrobe | 93(484) | 51(6958) | 80.8% (77.0-84.0%) | 99.3% (99.0-99.4%) | 0.835 |
Amikacin | tbprofiler | 62(484) | 59(6958) | 87.2% (83.9-89.9%) | 99.2% (98.9-99.3%) | 0.866 |
Capreomycin | drprg | 57(235) | 94(2448) | 75.7% (69.9-80.8%) | 96.2% (95.3-96.9%) | 0.673 |
Capreomycin | mykrobe | 72(235) | 87(2448) | 69.4% (63.2-74.9%) | 96.4% (95.6-97.1%) | 0.64 |
Capreomycin | tbprofiler | 54(235) | 95(2448) | 77.0% (71.2-81.9%) | 96.1% (95.3-96.8%) | 0.681 |
Delamanid | drprg | 111(116) | 5(8151) | 4.3% (1.9-9.7%) | 99.9% (99.9-100.0%) | 0.144 |
Delamanid | mykrobe | 111(116) | 2(8151) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.173 |
Delamanid | tbprofiler | 111(116) | 2(8151) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.173 |
Ethambutol | drprg | 121(1537) | 752(4935) | 92.1% (90.7-93.4%) | 84.8% (83.7-85.7%) | 0.693 |
Ethambutol | mykrobe | 133(1537) | 747(4935) | 91.3% (89.8-92.7%) | 84.9% (83.8-85.8%) | 0.688 |
Ethambutol | tbprofiler | 118(1537) | 765(4935) | 92.3% (90.9-93.6%) | 84.5% (83.5-85.5%) | 0.691 |
Ethionamide | drprg | 273(1103) | 417(6105) | 75.2% (72.6-77.7%) | 93.2% (92.5-93.8%) | 0.651 |
Ethionamide | mykrobe | 265(1103) | 413(6105) | 76.0% (73.4-78.4%) | 93.2% (92.6-93.8%) | 0.658 |
Ethionamide | tbprofiler | 272(1103) | 414(6105) | 75.3% (72.7-77.8%) | 93.2% (92.6-93.8%) | 0.653 |
Isoniazid | drprg | 307(3899) | 173(4193) | 92.1% (91.2-92.9%) | 95.9% (95.2-96.4%) | 0.882 |
Isoniazid | mykrobe | 333(3899) | 170(4193) | 91.5% (90.5-92.3%) | 95.9% (95.3-96.5%) | 0.876 |
Isoniazid | tbprofiler | 297(3899) | 181(4193) | 92.4% (91.5-93.2%) | 95.7% (95.0-96.3%) | 0.882 |
Kanamycin | drprg | 128(669) | 107(6975) | 80.9% (77.7-83.7%) | 98.5% (98.1-98.7%) | 0.805 |
Kanamycin | mykrobe | 152(669) | 98(6975) | 77.3% (74.0-80.3%) | 98.6% (98.3-98.8%) | 0.788 |
Kanamycin | tbprofiler | 122(669) | 107(6975) | 81.8% (78.7-84.5%) | 98.5% (98.1-98.7%) | 0.811 |
Levofloxacin | drprg | 81(1040) | 102(5454) | 92.2% (90.4-93.7%) | 98.1% (97.7-98.5%) | 0.896 |
Levofloxacin | mykrobe | 88(1040) | 102(5454) | 91.5% (89.7-93.1%) | 98.1% (97.7-98.5%) | 0.892 |
Levofloxacin | tbprofiler | 85(1040) | 109(5454) | 91.8% (90.0-93.3%) | 98.0% (97.6-98.3%) | 0.89 |
Linezolid | drprg | 48(65) | 4(6109) | 26.2% (17.0-38.0%) | 99.9% (99.8-100.0%) | 0.457 |
Linezolid | mykrobe | 48(65) | 4(6109) | 26.2% (17.0-38.0%) | 99.9% (99.8-100.0%) | 0.457 |
Linezolid | tbprofiler | 48(65) | 5(6109) | 26.2% (17.0-38.0%) | 99.9% (99.8-100.0%) | 0.447 |
Moxifloxacin | drprg | 41(603) | 478(5430) | 93.2% (90.9-94.9%) | 91.2% (90.4-91.9%) | 0.67 |
Moxifloxacin | mykrobe | 44(603) | 472(5430) | 92.7% (90.3-94.5%) | 91.3% (90.5-92.0%) | 0.669 |
Moxifloxacin | tbprofiler | 42(603) | 481(5430) | 93.0% (90.7-94.8%) | 91.1% (90.4-91.9%) | 0.668 |
Ofloxacin | drprg | 24(104) | 5(424) | 76.9% (68.0-84.0%) | 98.8% (97.3-99.5%) | 0.82 |
Ofloxacin | mykrobe | 26(104) | 5(424) | 75.0% (65.9-82.3%) | 98.8% (97.3-99.5%) | 0.807 |
Ofloxacin | tbprofiler | 26(104) | 6(424) | 75.0% (65.9-82.3%) | 98.6% (96.9-99.3%) | 0.8 |
Pyrazinamide | drprg | 68(341) | 53(820) | 80.1% (75.5-84.0%) | 93.5% (91.6-95.0%) | 0.746 |
Pyrazinamide | mykrobe | 55(341) | 56(820) | 83.9% (79.6-87.4%) | 93.2% (91.2-94.7%) | 0.77 |
Pyrazinamide | tbprofiler | 45(341) | 62(820) | 86.8% (82.8-90.0%) | 92.4% (90.4-94.1%) | 0.782 |
Rifampicin | drprg | 133(3221) | 166(4585) | 95.9% (95.1-96.5%) | 96.4% (95.8-96.9%) | 0.921 |
Rifampicin | mykrobe | 164(3221) | 169(4585) | 94.9% (94.1-95.6%) | 96.3% (95.7-96.8%) | 0.912 |
Rifampicin | tbprofiler | 102(3221) | 177(4585) | 96.8% (96.2-97.4%) | 96.1% (95.5-96.7%) | 0.927 |
Streptomycin | drprg | 266(1041) | 133(1205) | 74.4% (71.7-77.0%) | 89.0% (87.1-90.6%) | 0.644 |
Streptomycin | mykrobe | 282(1041) | 135(1205) | 72.9% (70.1-75.5%) | 88.8% (86.9-90.5%) | 0.629 |
Streptomycin | tbprofiler | 257(1041) | 136(1205) | 75.3% (72.6-77.8%) | 88.7% (86.8-90.4%) | 0.649 |
The nanopore results remain unchanged
After the updates in minor allele calling in https://github.com/mbhall88/drprg/issues/19#issuecomment-1371473290
Drug | Tool | FN(R) | FP(S) | Sensitivity (95% CI) | Specificity (95% CI) | MCC |
---|---|---|---|---|---|---|
Amikacin | drprg | 68(484) | 57(6958) | 86.0% (82.6-88.8%) | 99.2% (98.9-99.4%) | 0.86 |
Amikacin | mykrobe | 93(484) | 51(6958) | 80.8% (77.0-84.0%) | 99.3% (99.0-99.4%) | 0.835 |
Amikacin | tbprofiler | 62(484) | 59(6958) | 87.2% (83.9-89.9%) | 99.2% (98.9-99.3%) | 0.866 |
Capreomycin | drprg | 57(235) | 94(2448) | 75.7% (69.9-80.8%) | 96.2% (95.3-96.9%) | 0.673 |
Capreomycin | mykrobe | 72(235) | 87(2448) | 69.4% (63.2-74.9%) | 96.4% (95.6-97.1%) | 0.64 |
Capreomycin | tbprofiler | 54(235) | 95(2448) | 77.0% (71.2-81.9%) | 96.1% (95.3-96.8%) | 0.681 |
Delamanid | drprg | 111(116) | 5(8151) | 4.3% (1.9-9.7%) | 99.9% (99.9-100.0%) | 0.144 |
Delamanid | mykrobe | 111(116) | 2(8151) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.173 |
Delamanid | tbprofiler | 111(116) | 2(8151) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.173 |
Ethambutol | drprg | 121(1537) | 752(4935) | 92.1% (90.7-93.4%) | 84.8% (83.7-85.7%) | 0.693 |
Ethambutol | mykrobe | 133(1537) | 747(4935) | 91.3% (89.8-92.7%) | 84.9% (83.8-85.8%) | 0.688 |
Ethambutol | tbprofiler | 118(1537) | 765(4935) | 92.3% (90.9-93.6%) | 84.5% (83.5-85.5%) | 0.691 |
Ethionamide | drprg | 272(1103) | 420(6105) | 75.3% (72.7-77.8%) | 93.1% (92.5-93.7%) | 0.651 |
Ethionamide | mykrobe | 265(1103) | 413(6105) | 76.0% (73.4-78.4%) | 93.2% (92.6-93.8%) | 0.658 |
Ethionamide | tbprofiler | 272(1103) | 414(6105) | 75.3% (72.7-77.8%) | 93.2% (92.6-93.8%) | 0.653 |
Isoniazid | drprg | 307(3899) | 173(4193) | 92.1% (91.2-92.9%) | 95.9% (95.2-96.4%) | 0.882 |
Isoniazid | mykrobe | 333(3899) | 170(4193) | 91.5% (90.5-92.3%) | 95.9% (95.3-96.5%) | 0.876 |
Isoniazid | tbprofiler | 297(3899) | 181(4193) | 92.4% (91.5-93.2%) | 95.7% (95.0-96.3%) | 0.882 |
Kanamycin | drprg | 128(669) | 107(6975) | 80.9% (77.7-83.7%) | 98.5% (98.1-98.7%) | 0.805 |
Kanamycin | mykrobe | 152(669) | 98(6975) | 77.3% (74.0-80.3%) | 98.6% (98.3-98.8%) | 0.788 |
Kanamycin | tbprofiler | 122(669) | 107(6975) | 81.8% (78.7-84.5%) | 98.5% (98.1-98.7%) | 0.811 |
Levofloxacin | drprg | 79(1040) | 104(5454) | 92.4% (90.6-93.9%) | 98.1% (97.7-98.4%) | 0.896 |
Levofloxacin | mykrobe | 88(1040) | 102(5454) | 91.5% (89.7-93.1%) | 98.1% (97.7-98.5%) | 0.892 |
Levofloxacin | tbprofiler | 85(1040) | 109(5454) | 91.8% (90.0-93.3%) | 98.0% (97.6-98.3%) | 0.89 |
Linezolid | drprg | 48(65) | 4(6109) | 26.2% (17.0-38.0%) | 99.9% (99.8-100.0%) | 0.457 |
Linezolid | mykrobe | 48(65) | 4(6109) | 26.2% (17.0-38.0%) | 99.9% (99.8-100.0%) | 0.457 |
Linezolid | tbprofiler | 48(65) | 5(6109) | 26.2% (17.0-38.0%) | 99.9% (99.8-100.0%) | 0.447 |
Moxifloxacin | drprg | 40(603) | 478(5430) | 93.4% (91.1-95.1%) | 91.2% (90.4-91.9%) | 0.671 |
Moxifloxacin | mykrobe | 44(603) | 472(5430) | 92.7% (90.3-94.5%) | 91.3% (90.5-92.0%) | 0.669 |
Moxifloxacin | tbprofiler | 42(603) | 481(5430) | 93.0% (90.7-94.8%) | 91.1% (90.4-91.9%) | 0.668 |
Ofloxacin | drprg | 24(104) | 5(424) | 76.9% (68.0-84.0%) | 98.8% (97.3-99.5%) | 0.82 |
Ofloxacin | mykrobe | 26(104) | 5(424) | 75.0% (65.9-82.3%) | 98.8% (97.3-99.5%) | 0.807 |
Ofloxacin | tbprofiler | 26(104) | 6(424) | 75.0% (65.9-82.3%) | 98.6% (96.9-99.3%) | 0.8 |
Pyrazinamide | drprg | 67(341) | 54(820) | 80.4% (75.8-84.2%) | 93.4% (91.5-94.9%) | 0.746 |
Pyrazinamide | mykrobe | 55(341) | 56(820) | 83.9% (79.6-87.4%) | 93.2% (91.2-94.7%) | 0.77 |
Pyrazinamide | tbprofiler | 45(341) | 62(820) | 86.8% (82.8-90.0%) | 92.4% (90.4-94.1%) | 0.782 |
Rifampicin | drprg | 114(3221) | 168(4585) | 96.5% (95.8-97.0%) | 96.3% (95.8-96.8%) | 0.926 |
Rifampicin | mykrobe | 164(3221) | 169(4585) | 94.9% (94.1-95.6%) | 96.3% (95.7-96.8%) | 0.912 |
Rifampicin | tbprofiler | 102(3221) | 177(4585) | 96.8% (96.2-97.4%) | 96.1% (95.5-96.7%) | 0.927 |
Streptomycin | drprg | 267(1041) | 134(1205) | 74.4% (71.6-76.9%) | 88.9% (87.0-90.5%) | 0.643 |
Streptomycin | mykrobe | 282(1041) | 135(1205) | 72.9% (70.1-75.5%) | 88.8% (86.9-90.5%) | 0.629 |
Streptomycin | tbprofiler | 257(1041) | 136(1205) | 75.3% (72.6-77.8%) | 88.7% (86.8-90.4%) | 0.649 |
OK, so looking at those results now, we can definitely see a sensitivity improvement over Mykrobe with no precision loss. Compared with tbprofiler we are broadly the same - tbprofiler mostly has slightly better recall and slightly worse precision (except for fluoroquinolones). The biggest difference is 7% higher recall for tbprofiler for pyrazinamide. Fair summary?
Yep, fair summary. The work in mbhall88/drprg#24 should improve the PZA recall slightly too.
After the work in mbhall88/drprg#26 , we get the following Illumina results (nanopore is unchanged). Note: only ETO and PZA change from last results
Drug | Tool | FN(R) | FP(S) | Sensitivity (95% CI) | Specificity (95% CI) | MCC |
---|---|---|---|---|---|---|
Amikacin | drprg | 66(484) | 57(6958) | 86.4% (83.0-89.1%) | 99.2% (98.9-99.4%) | 0.863 |
Amikacin | mykrobe | 93(484) | 51(6958) | 80.8% (77.0-84.0%) | 99.3% (99.0-99.4%) | 0.835 |
Amikacin | tbprofiler | 62(484) | 59(6958) | 87.2% (83.9-89.9%) | 99.2% (98.9-99.3%) | 0.866 |
Capreomycin | drprg | 56(235) | 94(2448) | 76.2% (70.3-81.2%) | 96.2% (95.3-96.9%) | 0.676 |
Capreomycin | mykrobe | 72(235) | 87(2448) | 69.4% (63.2-74.9%) | 96.4% (95.6-97.1%) | 0.64 |
Capreomycin | tbprofiler | 54(235) | 95(2448) | 77.0% (71.2-81.9%) | 96.1% (95.3-96.8%) | 0.681 |
Delamanid | drprg | 111(116) | 4(8151) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.152 |
Delamanid | mykrobe | 111(116) | 2(8151) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.173 |
Delamanid | tbprofiler | 111(116) | 2(8151) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.173 |
Ethambutol | drprg | 120(1537) | 754(4935) | 92.2% (90.7-93.4%) | 84.7% (83.7-85.7%) | 0.693 |
Ethambutol | mykrobe | 133(1537) | 747(4935) | 91.3% (89.8-92.7%) | 84.9% (83.8-85.8%) | 0.688 |
Ethambutol | tbprofiler | 118(1537) | 765(4935) | 92.3% (90.9-93.6%) | 84.5% (83.5-85.5%) | 0.691 |
Ethionamide | drprg | 245(1103) | 418(6105) | 77.8% (75.2-80.1%) | 93.2% (92.5-93.8%) | 0.669 |
Ethionamide | mykrobe | 265(1103) | 413(6105) | 76.0% (73.4-78.4%) | 93.2% (92.6-93.8%) | 0.658 |
Ethionamide | tbprofiler | 272(1103) | 414(6105) | 75.3% (72.7-77.8%) | 93.2% (92.6-93.8%) | 0.653 |
Isoniazid | drprg | 305(3899) | 173(4193) | 92.2% (91.3-93.0%) | 95.9% (95.2-96.4%) | 0.882 |
Isoniazid | mykrobe | 333(3899) | 170(4193) | 91.5% (90.5-92.3%) | 95.9% (95.3-96.5%) | 0.876 |
Isoniazid | tbprofiler | 297(3899) | 181(4193) | 92.4% (91.5-93.2%) | 95.7% (95.0-96.3%) | 0.882 |
Kanamycin | drprg | 126(669) | 107(6975) | 81.2% (78.0-83.9%) | 98.5% (98.1-98.7%) | 0.807 |
Kanamycin | mykrobe | 152(669) | 98(6975) | 77.3% (74.0-80.3%) | 98.6% (98.3-98.8%) | 0.788 |
Kanamycin | tbprofiler | 122(669) | 107(6975) | 81.8% (78.7-84.5%) | 98.5% (98.1-98.7%) | 0.811 |
Levofloxacin | drprg | 80(1040) | 106(5454) | 92.3% (90.5-93.8%) | 98.1% (97.7-98.4%) | 0.895 |
Levofloxacin | mykrobe | 88(1040) | 102(5454) | 91.5% (89.7-93.1%) | 98.1% (97.7-98.5%) | 0.892 |
Levofloxacin | tbprofiler | 85(1040) | 109(5454) | 91.8% (90.0-93.3%) | 98.0% (97.6-98.3%) | 0.89 |
Linezolid | drprg | 48(65) | 4(6109) | 26.2% (17.0-38.0%) | 99.9% (99.8-100.0%) | 0.457 |
Linezolid | mykrobe | 48(65) | 4(6109) | 26.2% (17.0-38.0%) | 99.9% (99.8-100.0%) | 0.457 |
Linezolid | tbprofiler | 48(65) | 5(6109) | 26.2% (17.0-38.0%) | 99.9% (99.8-100.0%) | 0.447 |
Moxifloxacin | drprg | 39(603) | 477(5430) | 93.5% (91.3-95.2%) | 91.2% (90.4-91.9%) | 0.673 |
Moxifloxacin | mykrobe | 44(603) | 472(5430) | 92.7% (90.3-94.5%) | 91.3% (90.5-92.0%) | 0.669 |
Moxifloxacin | tbprofiler | 42(603) | 481(5430) | 93.0% (90.7-94.8%) | 91.1% (90.4-91.9%) | 0.668 |
Ofloxacin | drprg | 25(104) | 5(424) | 76.0% (66.9-83.2%) | 98.8% (97.3-99.5%) | 0.813 |
Ofloxacin | mykrobe | 26(104) | 5(424) | 75.0% (65.9-82.3%) | 98.8% (97.3-99.5%) | 0.807 |
Ofloxacin | tbprofiler | 26(104) | 6(424) | 75.0% (65.9-82.3%) | 98.6% (96.9-99.3%) | 0.8 |
Pyrazinamide | drprg | 57(341) | 55(820) | 83.3% (79.0-86.9%) | 93.3% (91.4-94.8%) | 0.767 |
Pyrazinamide | mykrobe | 55(341) | 56(820) | 83.9% (79.6-87.4%) | 93.2% (91.2-94.7%) | 0.77 |
Pyrazinamide | tbprofiler | 45(341) | 62(820) | 86.8% (82.8-90.0%) | 92.4% (90.4-94.1%) | 0.782 |
Rifampicin | drprg | 112(3221) | 168(4585) | 96.5% (95.8-97.1%) | 96.3% (95.8-96.8%) | 0.926 |
Rifampicin | mykrobe | 164(3221) | 169(4585) | 94.9% (94.1-95.6%) | 96.3% (95.7-96.8%) | 0.912 |
Rifampicin | tbprofiler | 102(3221) | 177(4585) | 96.8% (96.2-97.4%) | 96.1% (95.5-96.7%) | 0.927 |
Streptomycin | drprg | 259(1041) | 135(1205) | 75.1% (72.4-77.7%) | 88.8% (86.9-90.5%) | 0.648 |
Streptomycin | mykrobe | 282(1041) | 135(1205) | 72.9% (70.1-75.5%) | 88.8% (86.9-90.5%) | 0.629 |
Streptomycin | tbprofiler | 257(1041) | 136(1205) | 75.3% (72.6-77.8%) | 88.7% (86.8-90.4%) | 0.649 |
PZA still isn't great, but there are just so many different mutations with minor alleles that we don't have in the graph and hand-picking them all could lead to a complicated graph. Although I can try adding them if we really want to try boosting PZA sensitivity...
I think those results are much improved, but I am wondering what the pitch is for drprg. Illumina is better than Mykrobe and ~same as tbprofiler. Are the nanopore results really unchanged from before? Leandro's mapping fixes will help too.
> am wondering what the pitch is for drprg though
Yeah, this has been troubling me too...I mean we can notice gene deletions...We use a lot less resources....
> Are the nanopore results really unchanged from before?
Here are the current nanopore results
Drug | Tool | FN(R) | FP(S) | Sensitivity (95% CI) | Specificity (95% CI) | MCC |
---|---|---|---|---|---|---|
Amikacin | drprg | 0(11) | 3(78) | 100.0% (74.1-100.0%) | 96.2% (89.3-98.7%) | 0.869 |
Amikacin | mykrobe | 0(11) | 3(78) | 100.0% (74.1-100.0%) | 96.2% (89.3-98.7%) | 0.869 |
Amikacin | tbprofiler | 0(11) | 3(78) | 100.0% (74.1-100.0%) | 96.2% (89.3-98.7%) | 0.869 |
Capreomycin | drprg | 1(1) | 1(51) | 0.0% (0.0-79.3%) | 98.0% (89.7-99.7%) | -0.02 |
Capreomycin | mykrobe | 1(1) | 1(51) | 0.0% (0.0-79.3%) | 98.0% (89.7-99.7%) | -0.02 |
Capreomycin | tbprofiler | 1(1) | 1(51) | 0.0% (0.0-79.3%) | 98.0% (89.7-99.7%) | -0.02 |
Ethambutol | drprg | 4(14) | 15(77) | 71.4% (45.4-88.3%) | 80.5% (70.3-87.8%) | 0.42 |
Ethambutol | mykrobe | 4(14) | 15(77) | 71.4% (45.4-88.3%) | 80.5% (70.3-87.8%) | 0.42 |
Ethambutol | tbprofiler | 5(14) | 15(77) | 64.3% (38.8-83.7%) | 80.5% (70.3-87.8%) | 0.367 |
Ethionamide | drprg | 0(4) | 1(9) | 100.0% (51.0-100.0%) | 88.9% (56.5-98.0%) | 0.843 |
Ethionamide | mykrobe | 0(4) | 1(9) | 100.0% (51.0-100.0%) | 88.9% (56.5-98.0%) | 0.843 |
Ethionamide | tbprofiler | 0(4) | 1(9) | 100.0% (51.0-100.0%) | 88.9% (56.5-98.0%) | 0.843 |
Isoniazid | drprg | 9(51) | 5(48) | 82.4% (69.7-90.4%) | 89.6% (77.8-95.5%) | 0.72 |
Isoniazid | mykrobe | 9(51) | 4(48) | 82.4% (69.7-90.4%) | 91.7% (80.4-96.7%) | 0.742 |
Isoniazid | tbprofiler | 9(51) | 3(48) | 82.4% (69.7-90.4%) | 93.8% (83.2-97.9%) | 0.764 |
Kanamycin | drprg | 0(0) | 1(52) | - | 98.1% (89.9-99.7%) | - |
Kanamycin | mykrobe | 0(0) | 1(52) | - | 98.1% (89.9-99.7%) | - |
Kanamycin | tbprofiler | 0(0) | 1(52) | - | 98.1% (89.9-99.7%) | - |
Moxifloxacin | drprg | 0(0) | 1(1) | - | 0.0% (0.0-79.3%) | - |
Moxifloxacin | mykrobe | 0(0) | 1(1) | - | 0.0% (0.0-79.3%) | - |
Moxifloxacin | tbprofiler | 0(0) | 1(1) | - | 0.0% (0.0-79.3%) | - |
Ofloxacin | drprg | 0(10) | 4(77) | 100.0% (72.2-100.0%) | 94.8% (87.4-98.0%) | 0.823 |
Ofloxacin | mykrobe | 0(10) | 4(77) | 100.0% (72.2-100.0%) | 94.8% (87.4-98.0%) | 0.823 |
Ofloxacin | tbprofiler | 0(10) | 3(77) | 100.0% (72.2-100.0%) | 96.1% (89.2-98.7%) | 0.86 |
Pyrazinamide | drprg | 0(0) | 0(1) | - | 100.0% (20.7-100.0%) | - |
Pyrazinamide | mykrobe | 0(0) | 0(1) | - | 100.0% (20.7-100.0%) | - |
Pyrazinamide | tbprofiler | 0(0) | 0(1) | - | 100.0% (20.7-100.0%) | - |
Rifampicin | drprg | 5(48) | 1(44) | 89.6% (77.8-95.5%) | 97.7% (88.2-99.6%) | 0.873 |
Rifampicin | mykrobe | 5(48) | 1(44) | 89.6% (77.8-95.5%) | 97.7% (88.2-99.6%) | 0.873 |
Rifampicin | tbprofiler | 5(48) | 1(44) | 89.6% (77.8-95.5%) | 97.7% (88.2-99.6%) | 0.873 |
Streptomycin | drprg | 2(8) | 14(83) | 75.0% (40.9-92.9%) | 83.1% (73.7-89.7%) | 0.398 |
Streptomycin | mykrobe | 2(8) | 27(83) | 75.0% (40.9-92.9%) | 67.5% (56.8-76.6%) | 0.25 |
Streptomycin | tbprofiler | 2(8) | 12(83) | 75.0% (40.9-92.9%) | 85.5% (76.4-91.5%) | 0.43 |
Sample sizes are so small it makes it hard to get a clear picture for a lot of drugs.
Here are the Illumina results on the full dataset (45,193 samples)
Drug | Tool | FN(R) | FP(S) | Sensitivity (95% CI) | Specificity (95% CI) | MCC |
---|---|---|---|---|---|---|
Amikacin | drprg | 270(1864) | 225(18732) | 85.5% (83.8-87.0%) | 98.8% (98.6-98.9%) | 0.852 |
Amikacin | mykrobe | 358(1864) | 195(18732) | 80.8% (78.9-82.5%) | 99.0% (98.8-99.1%) | 0.831 |
Amikacin | tbprofiler | 269(1864) | 227(18732) | 85.6% (83.9-87.1%) | 98.8% (98.6-98.9%) | 0.852 |
Capreomycin | drprg | 293(1298) | 300(13034) | 77.4% (75.1-79.6%) | 97.7% (97.4-97.9%) | 0.749 |
Capreomycin | mykrobe | 367(1298) | 265(13034) | 71.7% (69.2-74.1%) | 98.0% (97.7-98.2%) | 0.723 |
Capreomycin | tbprofiler | 292(1298) | 305(13034) | 77.5% (75.2-79.7%) | 97.7% (97.4-97.9%) | 0.748 |
Delamanid | drprg | 111(116) | 4(8151) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.152 |
Delamanid | mykrobe | 111(116) | 2(8151) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.173 |
Delamanid | tbprofiler | 111(116) | 2(8151) | 4.3% (1.9-9.7%) | 100.0% (99.9-100.0%) | 0.173 |
Ethambutol | drprg | 484(5706) | 2287(26863) | 91.5% (90.8-92.2%) | 91.5% (91.1-91.8%) | 0.749 |
Ethambutol | mykrobe | 499(5706) | 2265(26863) | 91.3% (90.5-92.0%) | 91.6% (91.2-91.9%) | 0.749 |
Ethambutol | tbprofiler | 471(5706) | 2290(26863) | 91.7% (91.0-92.4%) | 91.5% (91.1-91.8%) | 0.751 |
Ethionamide | drprg | 672(2853) | 992(11016) | 76.4% (74.9-78.0%) | 91.0% (90.4-91.5%) | 0.649 |
Ethionamide | mykrobe | 772(2853) | 960(11016) | 72.9% (71.3-74.5%) | 91.3% (90.7-91.8%) | 0.627 |
Ethionamide | tbprofiler | 787(2853) | 964(11016) | 72.4% (70.7-74.0%) | 91.2% (90.7-91.8%) | 0.623 |
Isoniazid | drprg | 1016(14531) | 593(25764) | 93.0% (92.6-93.4%) | 97.7% (97.5-97.9%) | 0.913 |
Isoniazid | mykrobe | 1054(14531) | 560(25764) | 92.7% (92.3-93.2%) | 97.8% (97.6-98.0%) | 0.913 |
Isoniazid | tbprofiler | 987(14531) | 648(25764) | 93.2% (92.8-93.6%) | 97.5% (97.3-97.7%) | 0.912 |
Kanamycin | drprg | 359(2205) | 316(17934) | 83.7% (82.1-85.2%) | 98.2% (98.0-98.4%) | 0.827 |
Kanamycin | mykrobe | 437(2205) | 300(17934) | 80.2% (78.5-81.8%) | 98.3% (98.1-98.5%) | 0.808 |
Kanamycin | tbprofiler | 349(2205) | 322(17934) | 84.2% (82.6-85.6%) | 98.2% (98.0-98.4%) | 0.828 |
Levofloxacin | drprg | 272(3102) | 355(14867) | 91.2% (90.2-92.2%) | 97.6% (97.4-97.8%) | 0.879 |
Levofloxacin | mykrobe | 299(3102) | 330(14867) | 90.4% (89.3-91.4%) | 97.8% (97.5-98.0%) | 0.878 |
Levofloxacin | tbprofiler | 276(3102) | 356(14867) | 91.1% (90.0-92.1%) | 97.6% (97.3-97.8%) | 0.878 |
Linezolid | drprg | 104(152) | 30(10911) | 31.6% (24.7-39.3%) | 99.7% (99.6-99.8%) | 0.436 |
Linezolid | mykrobe | 105(152) | 29(10911) | 30.9% (24.1-38.7%) | 99.7% (99.6-99.8%) | 0.432 |
Linezolid | tbprofiler | 104(152) | 31(10911) | 31.6% (24.7-39.3%) | 99.7% (99.6-99.8%) | 0.433 |
Moxifloxacin | drprg | 178(2255) | 1133(14696) | 92.1% (90.9-93.1%) | 92.3% (91.8-92.7%) | 0.732 |
Moxifloxacin | mykrobe | 207(2255) | 1113(14696) | 90.8% (89.6-91.9%) | 92.4% (92.0-92.8%) | 0.726 |
Moxifloxacin | tbprofiler | 182(2255) | 1141(14696) | 91.9% (90.7-93.0%) | 92.2% (91.8-92.7%) | 0.729 |
Ofloxacin | drprg | 166(778) | 68(6007) | 78.7% (75.6-81.4%) | 98.9% (98.6-99.1%) | 0.823 |
Ofloxacin | mykrobe | 147(778) | 62(6007) | 81.1% (78.2-83.7%) | 99.0% (98.7-99.2%) | 0.842 |
Ofloxacin | tbprofiler | 138(778) | 65(6007) | 82.3% (79.4-84.8%) | 98.9% (98.6-99.2%) | 0.848 |
Pyrazinamide | drprg | 786(3682) | 500(17748) | 78.7% (77.3-79.9%) | 97.2% (96.9-97.4%) | 0.783 |
Pyrazinamide | mykrobe | 776(3682) | 444(17748) | 78.9% (77.6-80.2%) | 97.5% (97.3-97.7%) | 0.794 |
Pyrazinamide | tbprofiler | 715(3682) | 502(17748) | 80.6% (79.3-81.8%) | 97.2% (96.9-97.4%) | 0.796 |
Rifampicin | drprg | 576(11766) | 593(28292) | 95.1% (94.7-95.5%) | 97.9% (97.7-98.1%) | 0.93 |
Rifampicin | mykrobe | 523(11766) | 604(28292) | 95.6% (95.2-95.9%) | 97.9% (97.7-98.0%) | 0.932 |
Rifampicin | tbprofiler | 370(11766) | 788(28292) | 96.9% (96.5-97.2%) | 97.2% (97.0-97.4%) | 0.931 |
Streptomycin | drprg | 784(5362) | 760(10179) | 85.4% (84.4-86.3%) | 92.5% (92.0-93.0%) | 0.78 |
Streptomycin | mykrobe | 903(5362) | 677(10179) | 83.2% (82.1-84.1%) | 93.3% (92.8-93.8%) | 0.773 |
Streptomycin | tbprofiler | 778(5362) | 662(10179) | 85.5% (84.5-86.4%) | 93.5% (93.0-94.0%) | 0.794 |
I am currently working through the INH FNs and have learned a lot and fixed some bugs. Most important result to understand here though will be the RIF sensitivity which is significantly lower than tb-profiler
I think I might have gotten to the bottom of the RIF sensitivity issue (also impacts a decent amount of INH FNs).
tl;dr we need a smaller minimum cluster size for (some) Illumina reads in pandora.
Cluster size dictates whether we recognise a read as "hitting" a locus. The default is 10. But I was finding a lot of FNs where we just have these big random stretches of zero depth - generally in and around the RRDR. When I mapped these reads to H37Rv with minimap2, it showed that we should definitely have depth over the RRDR and its surrounding regions. Turns out most of them are unmapped in the pandora SAM file. In the end, most of these reads were getting ~4-6 hits, so they were being marked as unmapped because they're below the default of 10. I have also noticed a lot of the samples with this issue are Illumina HiSeq 2000 75bp reads. This relates back to https://github.com/mbhall88/drprg/issues/12#issuecomment-1244890728.
I've run on a few samples with the minimum cluster size set to 4 and it seems to have resolved the issue for those samples. So I'm going to rerun all samples and reassess the results after that 🤞
Also relates to long reads that overlap a prg only at the end.
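As a toy illustration of why the threshold matters, here is a sketch that treats a read as mapped when its number of hits to a locus reaches the minimum cluster size. This is a simplification I am assuming for illustration (pandora's clustering of minimizer hits is more involved, and the exact comparison it uses is not something I have checked); the function name is hypothetical.

```python
def is_mapped(n_hits, min_cluster_size=10):
    """Toy rule: a read counts as hitting a locus only if it has at least
    `min_cluster_size` hits (assumed comparison, for illustration only)."""
    return n_hits >= min_cluster_size


# Short Illumina reads over the RRDR were getting roughly 4-6 hits each.
hits_per_read = [4, 5, 6, 12]

mapped_default = [h for h in hits_per_read if is_mapped(h)]
mapped_relaxed = [h for h in hits_per_read if is_mapped(h, min_cluster_size=4)]
```

Under the default of 10 only the last read survives, whereas a minimum of 4 keeps all four, which matches the zero-depth stretches disappearing after the change.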
This issue will document the rolling results.
The first sneak peek is all 437 Nanopore isolates and 400 Illumina isolates (selected at random).
Nanopore
Next avenues of investigation:
Positives:
Illumina
Next avenues of investigation:
I'll probably wait until the Nanopore stuff is debugged and then run on a larger sample of data before debugging Illumina