Open wkumler opened 9 months ago
One big perk of this method is that it would identify and remove a bunch of the "noise" data points that are singular points, instead of having to assign each of them to an m/z group. `min_group_size` already kinda does this, but not very well(?)
Realized today that m/z group construction could be done with a 1D density-based clustering algorithm like DBSCAN or OPTICS. The perk would be that the "hard" m/z window currently used by `mz_group` would be relaxed and could instead be determined in a more data-driven way.

There's a paper about this exact idea: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3982975/ and they talk about reducing the computational constraints through some clever preprocessing, which is necessary because the current implementation takes a long while for just 6 files.
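For anyone who hasn't seen density-based clustering in one dimension, here's a rough, language-agnostic sketch in plain Python. It is not the package's code and not the paper's method: the function name `dbscan_1d`, the m/z values, and the `eps` / `min_samples` settings are all made up for illustration, and `eps` is an absolute m/z tolerance here, whereas a real implementation would probably want something ppm-based. Sorting the values first means neighbors can be found with binary search rather than all pairwise distances.

```python
from bisect import bisect_left, bisect_right


def dbscan_1d(mz_values, eps, min_samples):
    """Return one label per input value: 0, 1, ... for clusters, -1 for noise."""
    order = sorted(range(len(mz_values)), key=lambda i: mz_values[i])
    xs = [mz_values[i] for i in order]
    n = len(xs)

    # A core point has at least min_samples points (itself included) within eps.
    is_core = []
    for x in xs:
        lo = bisect_left(xs, x - eps)
        hi = bisect_right(xs, x + eps)
        is_core.append(hi - lo >= min_samples)

    # Chain consecutive core points: a gap larger than eps starts a new cluster.
    sorted_labels = [-1] * n
    cluster = -1
    prev_core = None
    for i in range(n):
        if not is_core[i]:
            continue
        if prev_core is None or xs[i] - xs[prev_core] > eps:
            cluster += 1
        sorted_labels[i] = cluster
        prev_core = i

    # Border points join the nearest core point within eps; the rest stay noise.
    core_idx = [i for i in range(n) if is_core[i]]
    core_xs = [xs[i] for i in core_idx]
    for i in range(n):
        if is_core[i]:
            continue
        j = bisect_left(core_xs, xs[i])
        candidates = core_idx[max(j - 1, 0): j + 1]
        best = min(candidates, key=lambda c: abs(xs[c] - xs[i]), default=None)
        if best is not None and abs(xs[best] - xs[i]) <= eps:
            sorted_labels[i] = sorted_labels[best]

    # Map labels back to the original input order.
    labels = [-1] * n
    for pos, i in enumerate(order):
        labels[i] = sorted_labels[pos]
    return labels


# Hypothetical m/z values: two dense groups plus one isolated point.
mz = [118.0862, 118.0863, 118.0864, 118.0865, 118.0950,
      119.0350, 119.0351, 119.0352]
labels = dbscan_1d(mz, eps=0.0005, min_samples=3)
print(labels)
```

The isolated point at 118.0950 gets the label -1 instead of being forced into a group, which is the noise-removal behavior described above. Note that `min_samples` plays a similar role to a minimum group size, but it's applied to local density rather than to a fixed window.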
Quick proof-of-concept: