Closed mahafarhat closed 3 years ago
Hi @doctormo - I had shared the thresholds in Slack - do you want me to upload them here, too, for your easier access?
Yes please
Maha
On Mar 26, 2021, at 8:59 AM, MGroschel @.***> wrote:
Hi @doctormo - I had shared the thresholds in Slack - do you want me to upload them here, too, for your easier access?
GenTB-RF - equal or above the threshold means RESISTANT
0.082_ethambutol_threshold
0.22_isoniazid_threshold
0.001_pyrazinamide_threshold
0.002_rifampicin_threshold
0.6_amikacin_threshold
0.25_capreomycin_threshold
0.42_ciprofloxacin_threshold
0.32_ethionamide_threshold
0.63_kanamycin_threshold
0.41000000000000003_levofloxacin_threshold
0.33_ofloxacin_threshold
0.001_para-aminosalicylic_acid_threshold
0.047_streptomycin_threshold
WDNN - equal or below the threshold means RESISTANT
0.548_ethambutol_threshold
0.608_isoniazid_threshold
0.47600000000000003_pyrazinamide_threshold
0.59_rifampicin_threshold
0.45_amikacin_threshold
0.508_capreomycin_threshold
0.41600000000000004_kanamycin_threshold
0.452_ofloxacin_threshold
0.526_streptomycin_threshold
0.625_ciprofloxacin_threshold
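For reference, the two rules above could be applied in code roughly as follows. This is only a sketch: the threshold values are copied from the lists above, but the dictionary names, the `is_resistant` helper, and the `"RF"`/`"WDNN"` model labels are hypothetical and not taken from the GenTB code base.

```python
# Thresholds copied verbatim from the lists in this thread.
# All names below are illustrative assumptions, not the pipeline's identifiers.
RF_THRESHOLDS = {
    "ethambutol": 0.082,
    "isoniazid": 0.22,
    "pyrazinamide": 0.001,
    "rifampicin": 0.002,
    "amikacin": 0.6,
    "capreomycin": 0.25,
    "ciprofloxacin": 0.42,
    "ethionamide": 0.32,
    "kanamycin": 0.63,
    "levofloxacin": 0.41000000000000003,
    "ofloxacin": 0.33,
    "para-aminosalicylic_acid": 0.001,
    "streptomycin": 0.047,
}

WDNN_THRESHOLDS = {
    "ethambutol": 0.548,
    "isoniazid": 0.608,
    "pyrazinamide": 0.47600000000000003,
    "rifampicin": 0.59,
    "amikacin": 0.45,
    "capreomycin": 0.508,
    "kanamycin": 0.41600000000000004,
    "ofloxacin": 0.452,
    "streptomycin": 0.526,
    "ciprofloxacin": 0.625,
}


def is_resistant(model: str, drug: str, probability: float) -> bool:
    """Return True if the value counts as RESISTANT under the rules above."""
    if model == "RF":
        # GenTB-RF: equal or above the threshold means resistant.
        return probability >= RF_THRESHOLDS[drug]
    if model == "WDNN":
        # WDNN: equal or below the threshold means resistant.
        return probability <= WDNN_THRESHOLDS[drug]
    raise ValueError(f"unknown model: {model!r}")
```

Note that the comparison direction is opposite for the two models, so a single shared cutoff function without a model argument would misclassify one of them.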
@mgro actually please apply the threshold internally in the json generating step.
@doctormo we are planning to change the json output as follows
[ "/n/groups/gentb_www/predictData/tbdata_00000761/Italian1", "inh", "0.33", "4.95", "8.76" ],
to
[ "/n/groups/gentb_www/predictData/tbdata_00000761/Italian1", "inh", "0.33", "R", "4.95", "8.76" ],
where the designation "R" is added because the probability (0.33) is above the INH threshold (0.22)
This format change is frighteningly inconsistent. I can make it work, but adding things in the middle can cause issues, especially if you remove items later in a new format. Don't forget that these formats aren't versioned or schema'd, so they can't be controlled in code without checking some boundary (like the length of the given list).
@doctormo thinking more about this, can we make this change instead:
[[ "/n/groups/gentb_www/predictData/tbdata_00000761/Italian1", "inh", "1", "0.33", "4.95", "8.76" ], ...]
Then you would display DR prediction = 1, DR probability = 0.33, FP = 4.95, FN = 8.76 upon hover?
Alternatively we can change to
[[ "/n/groups/gentb_www/predictData/tbdata_00000761/Italian1", "inh", ["1", "0.33"], "4.95", "8.76" ], ...]
A third alternative is
[[ "/n/groups/gentb_www/predictData/tbdata_00000761/Italian1", "inh", "1", "4.95", "8.76" ], ...]
and include a new json list with the probabilities for each drug-strain?
If you are worried about backward compatibility, we can email the users saying we are launching a new version and hence have to purge old predictions.
So we've settled on this 👍 [[ "/n/groups/gentb_www/predictData/tbdata_00000761/Italian1", "inh", "1", "4.95", "8.76", "0.33" ], ...]
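A minimal sketch of how a row in the agreed layout could be produced in the json generating step, using the INH example from this thread. The `make_row` helper and its parameter names are hypothetical, and writing "0" for a susceptible call is an assumption (the thread only shows "1"). The comparison uses the GenTB-RF rule (equal or above the threshold means resistant).

```python
import json

# Isoniazid threshold from the GenTB-RF list in this thread.
INH_RF_THRESHOLD = 0.22


def make_row(sample_path, drug, probability, fp, fn, threshold):
    """Build one drug-list: [path, drug, call, FP, FN, probability].

    Hypothetical helper. "1" marks a resistant call; "0" for susceptible
    is an assumption, since the thread only shows the resistant case.
    """
    call = "1" if probability >= threshold else "0"
    # All values are written as strings, matching the examples above.
    return [sample_path, drug, call, str(fp), str(fn), str(probability)]


row = make_row(
    "/n/groups/gentb_www/predictData/tbdata_00000761/Italian1",
    "inh",
    0.33,
    4.95,
    8.76,
    INH_RF_THRESHOLD,
)
print(json.dumps([row]))
```

Appending the probability at the end, rather than inserting the call in the middle and shifting FP and FN, keeps the positions of the existing trailing fields stable relative to each other.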
Hi @doctormo, the *.matrix.json files from the 2.2 and WDNN pipelines are now written in the format agreed above, i.e., including the binary resistance call in the third position of each drug-list.
@mahafarhat could you do a last sanity check on the 2.2 pipeline? I think it makes sense to make 2.2 the default at the moment when Martin changes the heatmap generation step to take in the new format (unless the heatmap script can handle both old and new drug-list formats?)
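On the question of handling both formats: since the lists are not versioned, one option is to branch on the list length, as suggested earlier in the thread. The sketch below assumes the old layout is [path, drug, probability, FP, FN] and the new one is [path, drug, call, FP, FN, probability]; `parse_drug_list` and the returned keys are hypothetical names, not the heatmap script's actual code.

```python
def parse_drug_list(row):
    """Normalise an old (5-item) or new (6-item) drug-list into a dict.

    Hypothetical helper. The list length is the only available boundary
    for telling the two unversioned formats apart.
    """
    if len(row) == 5:
        # Old format: no resistance call is stored.
        path, drug, probability, fp, fn = row
        call = None
    elif len(row) == 6:
        # New format: call in third position, probability appended last.
        path, drug, call, fp, fn, probability = row
    else:
        raise ValueError(f"unexpected drug-list length: {len(row)}")
    return {
        "path": path,
        "drug": drug,
        "call": call,
        "probability": float(probability),
        "fp": float(fp),
        "fn": float(fn),
    }


old_row = ["/n/groups/gentb_www/predictData/tbdata_00000761/Italian1",
           "inh", "0.33", "4.95", "8.76"]
new_row = ["/n/groups/gentb_www/predictData/tbdata_00000761/Italian1",
           "inh", "1", "4.95", "8.76", "0.33"]

old_parsed = parse_drug_list(old_row)
new_parsed = parse_drug_list(new_row)
```

For old-format rows the call comes back as `None`, so the caller would still need the threshold list to colour the cell.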
@mgro will supply @doctormo with the threshold lists for RF and WDNN separately. The probability will only be given upon hover; otherwise we will add a legend saying R = red, S = blue.