rformassspectrometry / MetaboAnnotation

High level functionality to support and simplify metabolomics data annotation.
https://rformassspectrometry.github.io/MetaboAnnotation/

Extend filterMatches framework #81

Closed jorainer closed 2 years ago

jorainer commented 2 years ago

Currently, filterMatches filters a Matched object by keeping only selected elements, which are specified through additional parameters. I would suggest the following extension:

This should also allow us to perform other types of filtering on a Matched object. Example: keep only the best match for each query that matches multiple targets. The param name could maybe be TopRankedMatchesParam. The algorithm could rank the matches of each query based on their score(s) (in decreasing or increasing order) and keep only the first n (with n = 1 by default). For matches with rt and m/z we could rank m/z and rt separately and then perform the final ranking on the product of the m/z and rt ranks.
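To make the ranking idea concrete, here is a rough, language-agnostic sketch in Python. It is not MetaboAnnotation's R API: the function names `rank` and `top_ranked_matches`, the dictionary layout, and the score keys `mz_diff` and `rt_diff` are all made up for illustration. It assumes each match carries absolute m/z and rt differences, that smaller is better by default, and that ties get the lowest rank of their group.

```python
from collections import defaultdict


def rank(values, decreasing=False):
    """Return 1-based ranks; tied values share the lowest rank of their group."""
    ordered = sorted(values, reverse=decreasing)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        first_position.setdefault(value, position)
    return [first_position[value] for value in values]


def top_ranked_matches(matches, n=1, score_keys=("mz_diff", "rt_diff"),
                       decreasing=False):
    """Keep the n best matches per query (hypothetical helper).

    Each score is ranked separately within a query; the final ranking
    uses the product of the per-score ranks, as proposed above.
    """
    by_query = defaultdict(list)
    for match in matches:
        by_query[match["query"]].append(match)

    kept = []
    for group in by_query.values():
        combined = [1] * len(group)
        for key in score_keys:
            ranks = rank([m[key] for m in group], decreasing)
            combined = [c * r for c, r in zip(combined, ranks)]
        # sorted() is stable, so ties keep their original order
        order = sorted(range(len(group)), key=lambda i: combined[i])
        kept.extend(group[i] for i in order[:n])
    return kept


# Made-up example data: q1 matches three targets, q2 matches one.
matches = [
    {"query": "q1", "target": "t1", "mz_diff": 0.002, "rt_diff": 5.0},
    {"query": "q1", "target": "t2", "mz_diff": 0.001, "rt_diff": 1.0},
    {"query": "q1", "target": "t3", "mz_diff": 0.003, "rt_diff": 0.5},
    {"query": "q2", "target": "t4", "mz_diff": 0.004, "rt_diff": 2.0},
]

print([m["target"] for m in top_ranked_matches(matches)])  # ['t2', 't4']
```

For q1 the m/z ranks are 2, 1, 3 and the rt ranks are 3, 2, 1, so the rank products are 6, 2, 3 and t2 is kept. With n = 2, t3 would be kept as well. How ties should be broken is left open here and would need to be decided for a real implementation.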

Any thoughts @michaelwitting @andreavicini ? Also open for suggestions of better param names.

andreavicini commented 2 years ago

I like the idea! (and could implement that if you are not already planning to do it)

jorainer commented 2 years ago

I wanted to assign you to that issue anyway @andreavicini ;)

But also think a bit about the naming of the parameter classes. Maybe you can find better ones?

For the SelectedMatchesParam, I think it would be better to call it SelectMatchesParam (because the user manually selects the matches to keep). For the other one, I would use TopRankedMatchesParam (because the user wants to get the top-ranked match(es) for queries with multiple matches). But I'm not super happy with these names.