Manuscript checklist

[x] Renumber figures
[x] Add histograms to clockplot
[x] Update figures with advice received from others
[x] Move Methods figures to supplement
[x] Add histograms for the parameters reported in the discussion (med_cor, med_SNR, peak_area (and height?), isotope metrics, pooled_RSD, xcms_SNR, mean_mz(?)) to supplement
[x] Renumber supplemental figures

[x] Add results about metric strengths (supplemental table? histograms/logistic curves?)
- [x] Figure out why the full logistic model likes mean_mz so much
- [x] Include Gini coefficient results from the RF model
[x] Add confusion matrices to supplement (for full model, xcms model, min model, regularized models, and RF model)
[x] Add CV to features_extracted and rerun (should only affect XCMS and Full models, not raw data)
- [x] Figure out why we're getting all NAs for pred_prob in the Pttime dataset now
- [x] Check text for new values/results now that we have CV numbers (CV might be the best XCMS-only parameter now?)
[x] Add results from GMM model?
[x] Decide whether the model is performing as described in the Discussion (FDR/GPF roughly equal to the CNNs) or as in the Results (e.g. Fig 1 values)

[x] Add discussion about performance relative to other refs (esp. the CNNs)
- [x] Add the commentary about precision/recall vs FDR/GPF and how we chose the harder problem
[x] Add discussion about metric strengths
- [x] Discuss impact of filling in NA values and implications for logistic curves
- [x] Discuss how some metrics are one-sided (e.g. a high Smp:Blk indicates a good peak, but a low Smp:Blk means nothing because good peaks can still be present in the blank)
[x] Add note about planning to implement the peak shape and SNR metrics in XCMS by default

wkumler / MS_metrics