[x] Decide what to do with "Stans only" - keep in or drop?
[x] Homogenize use of MFs instead of "peaks"
[x] Homogenize "All" vs "full" and "raw" vs "two-param" vs "minimal" etc.
[x] Homogenize "Good" vs Good vs good
[x] Change "molecular feature" to "mass feature"
Figures
[x] Renumber figures
[x] Add histograms to clockplot
[x] Update figures with advice received from others
[x] Move Methods figures to supplement
[x] Add histograms for the parameters reported in the discussion (med_cor, med_SNR, peak_area (and height?), isotope metrics, pooled_RSD, xcms_SNR, mean_mz(?)) to supplement
[x] Renumber supplemental figures
Results
[x] Add results about metric strengths (supplemental table? histograms/logistic curves?)
[x] Figure out why the full logistic model likes mean_mz so much
[x] Include Gini coefficient results from the RF model
[x] Add confusion matrices to supplement (for full model, xcms model, min model, regularized models, and RF model)
[x] Add CV to features_extracted and rerun (should only affect XCMS and Full models, not raw data)
[x] Figure out why we're getting all NAs for pred_prob in the Pttime dataset now
[x] Check text for new values/results now that we have CV numbers (might be best XCMS-only param now?)
[x] Add results from GMM model?
[x] Decide whether the model is performing as in the Discussion (FDR/GPF ~ equal to CNNs) or in the Results (e.g. Fig 1 values)
Discussion
[x] Add discussion about performance relative to other refs (esp. the CNNs)
[x] Add the commentary about precision/recall vs FDR/GPF and how we chose the harder problem
[x] Add discussion about metric strengths
[x] Discuss impact of filling in NA values and implications for logistic curves
[x] Discuss how some metrics are one-sided (i.e. a high Smp:Blk indicates a good peak, but a low Smp:Blk means nothing because good peaks can still be present in the blank)
[x] Add note about planning to implement the peak shape and SNR metrics in XCMS by default