Closed philouail closed 8 months ago
To me this function is great and also the filterSimilarities
- that I would rather suggest to call filterValues
would make a lot of sense to me. @lgatto what's your opinion on that?
We have already a use case in which we want to extract/filter spectra with their precursor m/z and retention time within respective ranges (from MS1 data).
Thanks @jorainer ! I have adapted the code to your suggestions.
Thanks for the review @lgatto ! Regarding the name, if I understand you you would like to name the function to something like filterSpectraDataRanges
? Or could you provide a possible example for the name?
I would not use spectraData
in the name of the function - I would understand that I would filter/subset the spectraData
(i.e. remove certain columns of spectraData
) instead of filtering the Spectra
.
To me filterRanges
would make sense, because we want to filter the Spectra
using values in one or more spectraVariables
(hence the name of that parameter being spectraVariables
). Also, for the other method we would use filterValues
. And yes, the idea was in fact that the method should be similar to e.g. filterPrecursorMzRange
or filterPrecursorMzValue
, so the user would find/understand the method right away. In facto filterPrecursorMzRange
would be the same as calling filterRanges(, spectraVariable = "precursorMz", ...)
.
We also thought of a generic filterSpectra
method and a RangeFilter
parameter - that would be another possibility, but then we would add a new S4 object (RangeFilter
) and hence increase the complexity of Spectra
...
OK for filterRanges()
, you convinced me. Please make sure in the documentation that filterPrecursorMzRange()
and filterRanges(, spectraVariable = "precursorMz", ...)
are equivalent.
Thanks @lgatto for the feedback ! I will adjust the documentation and the code accordingly. As well as adding the filterValues()
function and a explanation in the vignette. I will try to make it as clear as possible how they can be used :) Putting it in draft for now while i prepare it !
So here is everything updated. I moved the main code to the MsBackend.R to have a similar implementation as the other filtering functions.
Some things i am not 100% sure about:
tolerance
be by default Inf
? In mclosest()
it is but filterPrecursorMzValue()
is set as 0. values
name ? there is already a value
one, just want to be sure it's not confusing.Also curiosity question but what is the different between filterPrecrusorMz
, filterPrecursorMzValue
, and filterPrecursorMzRange
.
I will write the vignette now while i wait for feedback :) Thanks again for the help !
Regarding your last question: filterPrecursorMz
and filterPrecursorMzValues
are the same (i.e. match/filter based on similarity of the precursor m/z). Both (should?) support to match against multiple target m/z values (e.g. filterPrecursorMzValues(sps, mz = c(123.3, 234.2))
to keep spectra with a precursor m/z that matches any of the provided values). The filterPrecursorMzRange
allows to define a broader range (i.e. precursor m/z within a specified range).
For the default of tolerance
. For filterValues
I would use 0
as default. also for other methods (filterPrecursorMzValues
) we have 0
. The mclosest
(and closest
) are a bit different than these filter functions: from a filter function I would expect to subset the data based on a (stringent) condition. From closest
mclosest
I would just like to get the most similar hits. The tolerance
there is essentially a way to define that something is too different to be considered close. Hope this does make sense?
Thanks @jorainer for the idea, I'll update the code for the comments you made and implement the extra parameter !
Thank you @philouail for the great contribution!
Hi, I wanted to suggest the implementation of this function
filterRanges
, which allows the user to filter aspectra
object based on any variables of itsspectraData
suing a range of values that the user inputs . I believe this would be great for multiple reasons:Welcoming any feedback on the code and PR :)
On the other hand, I would also like to suggest implementing a similar function such as
filterSimilarities
(not sure about the name), which would be implemented in a similar way for filtering based on similarity to specific values for variables within thespectraData
of the object rather than within a range. Thus, it would have the same advantages as thefilterRanges
function. It would prevent variable-specific functions and make the code easier to combine multiple filters.I can however see the counter argument that this might not be as user-friendly as variable-specific functions. I would love to get your opinion @jorainer and @lgatto on this. If you are interested i can come up with a more detailed proposition of implementation.