RECETOX / recetox-aplcms

This is a custom fork of apLCMS containing adaptations for large scale studies.
GNU General Public License v2.0

implement new way of clustering #198

Closed hechth closed 1 year ago

hechth commented 1 year ago

As this history shows, the clustering step fails to group corresponding peaks together, even when given larger tolerances: https://umsa.cerit-sc.cz/u/hechth/h/20230412-rcx-aplcms-qc-new-parameters (see also the screenshot below).

Corresponding peaks get assigned to different clusters, so the aligned peak table contains significantly fewer peaks than the individual peak tables, even though the peaks detected in comparable files are themselves quite comparable. The clustering algorithm has to be revisited, or a new one implemented, to determine which features are shared across samples.

[screenshot]

hechth commented 1 year ago

Another option is to first refactor the compute_clusters function and check how the clustering changes with its parameterization. Also, the tolerance estimation should be extracted into its own functions.
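To make the refactoring idea concrete, here is a rough sketch of what separating tolerance estimation from clustering could look like, together with a small parameter sweep. It is written in Python purely for illustration and is not the package's R code: `estimate_mz_tolerance`, `cluster_by_tolerance` and `sweep_tolerances` are hypothetical names, the gap-based grouping is a simple stand-in rather than the algorithm used by compute_clusters, and the m/z values are made-up toy data.

```python
from typing import Dict, List, Sequence


def estimate_mz_tolerance(mz_values: Sequence[float], quantile: float = 0.5) -> float:
    """Hypothetical tolerance estimate: a quantile of the gaps between
    neighbouring sorted m/z values. Stand-in only, not the package's method."""
    ordered = sorted(mz_values)
    gaps = sorted(b - a for a, b in zip(ordered, ordered[1:]))
    if not gaps:
        return 0.0
    index = min(int(quantile * len(gaps)), len(gaps) - 1)
    return gaps[index]


def cluster_by_tolerance(mz_values: Sequence[float], tolerance: float) -> List[int]:
    """Gap-based 1D clustering: walk the values in sorted order and start a
    new cluster whenever the gap to the previous value exceeds the tolerance.
    Labels are returned in the original input order."""
    order = sorted(range(len(mz_values)), key=lambda i: mz_values[i])
    labels = [0] * len(mz_values)
    current = 0
    for rank, idx in enumerate(order):
        if rank > 0 and mz_values[idx] - mz_values[order[rank - 1]] > tolerance:
            current += 1
        labels[idx] = current
    return labels


def sweep_tolerances(mz_values: Sequence[float], tolerances: Sequence[float]) -> Dict[float, int]:
    """Report how many clusters each candidate tolerance produces."""
    return {tol: len(set(cluster_by_tolerance(mz_values, tol))) for tol in tolerances}


# Toy data (made up): two features, each observed in three samples.
toy_mz = [100.000, 250.100, 100.002, 250.103, 100.004, 250.101]
estimated = estimate_mz_tolerance(toy_mz)
labels = cluster_by_tolerance(toy_mz, 0.005)
sweep = sweep_tolerances(toy_mz, [0.0005, 0.005])
print(estimated, labels, sweep)
```

With the two steps split like this, the tolerance can be estimated once, inspected, and then varied independently of the grouping logic, which would make it easier to see whether peaks are being split by the tolerance or by the clustering itself.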