Closed jorainer closed 1 year ago
I should have more or less implemented the functionality above, but I have a doubt. It seems to me that for the use cases above it might not be necessary to split the `@matches` `data.frame` by `$query_idx`; it would be sufficient to apply `FUN` to the whole `@matches`. Is that right? But I suppose splitting would open up more possibilities?

Sorry, maybe I was not clear. The idea was to split the `@matches` by `$query_idx` and then loop over this list, passing the whole `@query` and `@target` to `FUN`, but only the current subset of the `matches`. This would allow applying any function to the matching result of one query, either to subset and filter the `matches` and return only one match (e.g. the one with the highest score) or to do other things.
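
As a rough illustration of the per-query loop described here, the following base R sketch splits a toy matches table by `query_idx` and keeps the highest-scoring match for each query. The data values, the `score` column name and the function name `keepBest` are assumptions made up for this example; plain `data.frame`s stand in for the real `@query`, `@target` and `@matches` slots.

```r
# Toy stand-ins for the @query, @target and @matches slots (made-up values).
query <- data.frame(id = c("q1", "q2", "q3"))
target <- data.frame(id = c("t1", "t2", "t3", "t4"))
matches <- data.frame(
    query_idx = c(1L, 1L, 2L, 3L, 3L, 3L),
    target_idx = c(1L, 2L, 2L, 1L, 3L, 4L),
    score = c(0.9, 0.4, 0.7, 0.2, 0.8, 0.5)
)

# Hypothetical FUN: receives the matches of ONE query plus the full query and
# target objects, and keeps only the row with the highest score.
keepBest <- function(match, query, target, ...) {
    match[which.max(match$score), , drop = FALSE]
}

# Split by query index, apply FUN to each subset, then bind the pieces again.
per_query <- split(matches, matches$query_idx)
res <- lapply(per_query, keepBest, query = query, target = target)
best <- do.call(rbind, res)
rownames(best) <- NULL
best
```

Because `FUN` only ever sees the rows of one query, "keep the best match per query" needs no grouping logic inside the function itself, which is what applying `FUN` to the whole `@matches` would require.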

It would be helpful to have a function that iterates over a `Matched` object and allows applying a user-provided function to the matches of each query. The function `matchApply` should take a `Matched` object as input and should return a `Matched` object, possibly with a changed or reduced `@matches` slot. The idea is that a user might want to, e.g., restrict the matches found for each query based on some custom, user-provided criteria.

The definition of the function could be:

`FUN` being a user-defined function that must take the input arguments `match, query, target, ...` (`match` being a `data.frame` with the `@matches` for one query, `query` the `@query` slot, `target` the `@target` slot and `...` optional additional arguments). The function must return a `data.frame` with (at least) the same columns as `match`, but potentially a different number of rows. `matchApply` would basically split the `@matches` `data.frame` by `$query_idx` and `lapply` over this list applying `FUN`. The result would then be `rbind`ed again and replace the `@matches` of the `Matched` object.

Maybe we could even be more flexible and not enforce returning a `Matched` object, but have e.g. a parameter `returnMatched` that, if set to `FALSE`, simply returns the result from the `lapply` without further processing into a `Matched` result object.

One use case could be the following:

Given a `Matched` object `mtch` with results from a `matchValues` function in which a more relaxed matching was performed (e.g. a large `tolerance`): iterate over all matches and keep only those with a score (difference in m/z) smaller than a stricter value.

Maybe a more reasonable use case could be: have a `MatchedSpectra` object with results from a query against a full database. The user has a set of compounds and is sure that only these could be measured in the analysed sample. So, iterate over the matches of each query and keep only those against target spectra of a certain compound.

Happy to discuss that @andreavicini if something is not clear.
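
To make the proposal concrete, here is a minimal sketch in R of how `matchApply` could behave, together with the two use cases. It is not the package implementation: a plain list with elements `query`, `target` and `matches` stands in for the `Matched` object, and the helper names `strictScore` and `expectedOnly`, their `cutoff` and `compounds` arguments, the `compound` column and all data values are assumptions for illustration only.

```r
# Sketch only: `object` is a plain list standing in for a Matched object.
matchApply <- function(object, FUN, ..., returnMatched = TRUE) {
    FUN <- match.fun(FUN)
    m <- object$matches
    # Split matches by query and call FUN on each subset, always passing
    # the complete query and target.
    res <- lapply(split(m, m$query_idx), function(z) {
        out <- FUN(z, query = object$query, target = object$target, ...)
        if (returnMatched &&
            !(is.data.frame(out) && all(colnames(z) %in% colnames(out))))
            stop("FUN must return a data.frame with the columns of 'match'")
        out
    })
    if (!returnMatched)
        return(res)              # raw per-query results, no re-assembly
    if (length(res)) {
        res <- do.call(rbind, res)
        rownames(res) <- NULL
        object$matches <- res
    }
    object
}

# Toy data (made up): `score` holds the m/z difference between query and target.
mtch <- list(
    query = data.frame(mz = c(100.01, 200.02)),
    target = data.frame(mz = c(100.00, 100.05, 200.00, 200.10),
                        compound = c("A", "B", "C", "D")),
    matches = data.frame(query_idx = c(1L, 1L, 2L, 2L),
                         target_idx = c(1L, 2L, 3L, 4L),
                         score = c(0.01, -0.04, 0.02, -0.08))
)

# Use case 1: keep matches whose absolute m/z difference is below `cutoff`.
strictScore <- function(match, query, target, cutoff = 0.03, ...) {
    match[abs(match$score) < cutoff, , drop = FALSE]
}
strict <- matchApply(mtch, strictScore, cutoff = 0.03)

# Use case 2: keep matches against targets of the expected compounds only.
expectedOnly <- function(match, query, target, compounds, ...) {
    match[target$compound[match$target_idx] %in% compounds, , drop = FALSE]
}
expected <- matchApply(mtch, expectedOnly, compounds = c("A", "D"))

# returnMatched = FALSE: FUN may return anything, e.g. a match count per query.
counts <- matchApply(mtch, function(match, ...) nrow(match),
                     returnMatched = FALSE)
```

With `returnMatched = FALSE` the column check is skipped on purpose, since the per-query results are handed back as a list and are never bound back into `matches`.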