The peaks are simply assigned to different clusters, so the aligned peak table contains significantly fewer peaks than the individual peak tables, even though the peaks detected in comparable files are themselves quite comparable. The clustering algorithm has to be revisited, or a new one implemented, to determine which features are shared across samples.
Another option is to first refactor the compute_clusters function and check how the clustering changes with its parameterization. The tolerance estimation should also be moved into its own functions.
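To make the parameterization check concrete, here is a minimal Python sketch of a tolerance sweep. It is not the compute_clusters implementation: it swaps in a simple gap-based grouping (first on m/z with a relative ppm tolerance, then on retention time with an absolute tolerance) as a stand-in. All names (`group_by_gap`, `cluster_peaks`, `count_shared`), the toy peak list and the tolerance values are hypothetical and only illustrate how the number of shared features could be tracked as the tolerances change.

```python
def group_by_gap(peaks, key, max_gap):
    """Sort peaks by `key` and start a new group wherever the gap to the
    previous peak exceeds max_gap(previous value)."""
    if not peaks:
        return []
    ordered = sorted(peaks, key=lambda p: p[key])
    groups, current = [], [ordered[0]]
    for prev, peak in zip(ordered, ordered[1:]):
        if peak[key] - prev[key] > max_gap(prev[key]):
            groups.append(current)
            current = []
        current.append(peak)
    groups.append(current)
    return groups


def cluster_peaks(peaks, mz_tol_ppm, rt_tol):
    """Group on m/z (relative tolerance in ppm), then split each m/z group
    on retention time (absolute tolerance). Simplified stand-in only."""
    clusters = []
    for mz_group in group_by_gap(peaks, "mz", lambda mz: mz * mz_tol_ppm * 1e-6):
        clusters.extend(group_by_gap(mz_group, "rt", lambda _rt: rt_tol))
    return clusters


def count_shared(clusters, min_samples):
    """Count clusters that contain peaks from at least `min_samples` samples."""
    return sum(1 for c in clusters if len({p["sample"] for p in c}) >= min_samples)


# Toy data (made up): two features that are present in several samples.
peaks = [
    {"sample": "A", "mz": 200.1000, "rt": 60.0},
    {"sample": "B", "mz": 200.1006, "rt": 61.5},
    {"sample": "C", "mz": 200.1012, "rt": 63.0},
    {"sample": "A", "mz": 350.2000, "rt": 120.0},
    {"sample": "B", "mz": 350.2007, "rt": 120.4},
]

# Sweep over hypothetical (m/z ppm, rt) tolerance pairs.
for mz_tol_ppm, rt_tol in [(1, 2.0), (5, 1.0), (5, 2.0)]:
    clusters = cluster_peaks(peaks, mz_tol_ppm, rt_tol)
    shared = count_shared(clusters, min_samples=2)
    print(f"mz_tol={mz_tol_ppm} ppm, rt_tol={rt_tol}: "
          f"{len(clusters)} clusters, {shared} shared across >= 2 samples")
```

With this toy data the number of shared features rises from 0 to 2 as the tolerances widen. Running the same kind of sweep against the real clustering step would show whether larger tolerances actually merge corresponding peaks or leave them in separate clusters.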
As this history shows, the clustering step fails to group corresponding peaks together, even when given larger tolerances: https://umsa.cerit-sc.cz/u/hechth/h/20230412-rcx-aplcms-qc-new-parameters (see also the screenshot below).