Closed by mbhall88 1 year ago
This seems to have worked reasonably well
Tool | Drug | ΔFN | ΔFP |
---|---|---|---|
drprg | Amikacin | 0 | 0 |
drprg | Capreomycin | 0 | -1 |
drprg | Delamanid | 0 | 5 |
drprg | Ethambutol | -1 | 4 |
drprg | Ethionamide | -52 | 22 |
drprg | Isoniazid | 0 | 0 |
drprg | Kanamycin | 0 | 0 |
drprg | Levofloxacin | 0 | 0 |
drprg | Linezolid | -1 | 0 |
drprg | Moxifloxacin | 0 | -1 |
drprg | Ofloxacin | 0 | 0 |
drprg | Pyrazinamide | -2 | -1 |
drprg | Rifampicin | -5 | 0 |
drprg | Streptomycin | 0 | 0 |
There are quite a few ETO FPs though, which I will take a look at before moving on.
So all of those "new" ETO FPs are strong calls for a mutation I recently added (fabG1 L203L) which mykrobe and tb-profiler also call.
This sounds great!
The delamanid FPs are caused by us calling minor alleles for ddn L49P. One of these is backed up by the other callers, one seems like a decent minor call that the other callers do not make, and the rest are very low depth. This has made me realise I am not applying the same variant filters to the minor allele calls as I do to the normal major allele calls. For instance, some of these delamanid minor calls only have depth 2x on the minor allele, so they should be filtered out.
So, stepping back, what are the rules we want to apply to call a minor? Above x% of the total reads at that variant (either allele) and above some min absolute number, right? I'd think no other filters?
The extra EMB FPs have the same cause as the delamanid ones.
I think we can just use the same filters for minors as we do for majors.
We also have an FRS (fraction of read support) filter for majors, which we obviously don't want for minors.
The model behind GT conf is only meaningful for majors, so I'd bin it for minors.
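The rule discussed above (a minimum fraction of the total reads at the site plus a minimum absolute depth on the minor allele) could be sketched roughly as follows. This is only an illustration, not drprg's implementation: the names `MinorFilter` and `passes_minor_filter` are hypothetical, and the threshold values are placeholders rather than drprg defaults. FRS and GT conf are deliberately left out, as suggested in the thread; any other major-allele filters would be applied alongside this check.

```python
from dataclasses import dataclass


@dataclass
class MinorFilter:
    """Hypothetical thresholds; placeholder values, not drprg defaults."""

    min_fraction: float = 0.1  # the "x%" of total reads at the site (both alleles)
    min_depth: int = 3  # minimum absolute number of reads on the minor allele


def passes_minor_filter(minor_depth: int, total_depth: int, flt: MinorFilter) -> bool:
    """Return True if the minor allele has enough relative and absolute support."""
    if total_depth <= 0:
        # No reads at the site, so there is nothing to call.
        return False
    fraction = minor_depth / total_depth
    return minor_depth >= flt.min_depth and fraction >= flt.min_fraction
```

With these placeholder thresholds, a minor call with only 2x depth on the minor allele, like some of the delamanid ones, is rejected regardless of the total depth at the site.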
Adding in the filtering of minors gives this diff relative to the results above:
Tool | Drug | ΔFN | ΔFP |
---|---|---|---|
drprg | Amikacin | 0 | 0 |
drprg | Capreomycin | 0 | 0 |
drprg | Delamanid | 0 | -1 |
drprg | Ethambutol | 0 | -2 |
drprg | Ethionamide | 0 | 0 |
drprg | Isoniazid | 0 | 0 |
drprg | Kanamycin | 0 | 0 |
drprg | Levofloxacin | 0 | -1 |
drprg | Linezolid | 0 | 0 |
drprg | Moxifloxacin | 0 | -1 |
drprg | Ofloxacin | 0 | 0 |
drprg | Pyrazinamide | 0 | 0 |
drprg | Rifampicin | 0 | -1 |
drprg | Streptomycin | 1 | 0 |
So the overall diff for this overarching issue is:
Tool | Drug | ΔFN | ΔFP |
---|---|---|---|
drprg | Amikacin | 0 | 0 |
drprg | Capreomycin | 0 | -1 |
drprg | Delamanid | 0 | 4 |
drprg | Ethambutol | -1 | 2 |
drprg | Ethionamide | -52 | 22 |
drprg | Isoniazid | 0 | 0 |
drprg | Kanamycin | 0 | 0 |
drprg | Levofloxacin | 0 | -1 |
drprg | Linezolid | -1 | 0 |
drprg | Moxifloxacin | 0 | -2 |
drprg | Ofloxacin | 0 | 0 |
drprg | Pyrazinamide | -2 | -1 |
drprg | Rifampicin | -5 | -1 |
drprg | Streptomycin | 1 | 0 |
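As a sanity check, the overall table should be the per-drug sum of the two earlier diffs (the initial change and the minor-filtering change). A small sketch of that arithmetic, with the (ΔFN, ΔFP) values copied from the tables in this thread; the variable and function names are just for illustration:

```python
# (ΔFN, ΔFP) per drug, copied from the first diff table in this thread.
initial = {
    "Amikacin": (0, 0), "Capreomycin": (0, -1), "Delamanid": (0, 5),
    "Ethambutol": (-1, 4), "Ethionamide": (-52, 22), "Isoniazid": (0, 0),
    "Kanamycin": (0, 0), "Levofloxacin": (0, 0), "Linezolid": (-1, 0),
    "Moxifloxacin": (0, -1), "Ofloxacin": (0, 0), "Pyrazinamide": (-2, -1),
    "Rifampicin": (-5, 0), "Streptomycin": (0, 0),
}

# (ΔFN, ΔFP) per drug, copied from the diff after filtering minor calls.
minor_filtering = {
    "Amikacin": (0, 0), "Capreomycin": (0, 0), "Delamanid": (0, -1),
    "Ethambutol": (0, -2), "Ethionamide": (0, 0), "Isoniazid": (0, 0),
    "Kanamycin": (0, 0), "Levofloxacin": (0, -1), "Linezolid": (0, 0),
    "Moxifloxacin": (0, -1), "Ofloxacin": (0, 0), "Pyrazinamide": (0, 0),
    "Rifampicin": (0, -1), "Streptomycin": (1, 0),
}


def combine(first, second):
    """Add two per-drug (ΔFN, ΔFP) diffs element-wise."""
    return {
        drug: (first[drug][0] + second[drug][0], first[drug][1] + second[drug][1])
        for drug in first
    }


overall = combine(initial, minor_filtering)
for drug, (d_fn, d_fp) in overall.items():
    print(f"drprg | {drug} | {d_fn} | {d_fp} |")
```

The printed rows match the overall table above, e.g. Delamanid ends at ΔFP 4 (5 from the initial change, -1 from filtering minors).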
See https://github.com/mbhall88/drprg-paper/issues/2 for the motivation.
There are some common mutations which do not exist in the reference graph. As such, when there is a minor allele in a sample for one of these mutations, we do not detect it. This is because `discover` only finds major alleles, and to detect minor alleles, the allele must exist in the graph in the first place.