rformassspectrometry / Spectra

Low level infrastructure to handle MS spectra
https://rformassspectrometry.github.io/Spectra/

Function to find/visualise contaminants #260

Open lgatto opened 1 year ago

lgatto commented 1 year ago

In PR #259, I included a message about mz being sorted. As pointed out by @sgibb, we haven't done so in other instances, and I suppose it doesn't bring much anyway, given that a logical vector is always returned, irrespective of the number of mz values passed.

I thought it would be useful to have another function (or to update that one) to check for one or multiple contaminants in a Spectra object, something like this:

[Image: Rplot]

I imagine a function contaminantMz(x, mz = c(...)) that returns a logical matrix of dimensions length(x) by length(mz).
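To make the shape of that return value concrete, here is a minimal base R sketch. It is only an illustration under stated assumptions: x is a plain list of numeric m/z vectors standing in for a Spectra object, the tolerance argument and its default are made up for the example, and nothing here reflects how the package would actually implement it.

```r
# Hypothetical sketch, not part of Spectra: `x` is a plain list of numeric
# m/z vectors (one per spectrum), `tolerance` is an assumed absolute tolerance.
contaminantMz <- function(x, mz, tolerance = 0.01) {
    mz <- sort(mz)
    res <- vapply(x, function(peaks) {
        # For each target m/z: is any peak of this spectrum within tolerance?
        vapply(mz, function(m) any(abs(peaks - m) <= tolerance), logical(1))
    }, logical(length(mz)))
    # vapply() yields length(mz) x length(x) (or a plain vector for a single
    # mz), so force the matrix shape and transpose to get spectra in rows.
    res <- t(matrix(res, nrow = length(mz), ncol = length(x)))
    # Naming the columns by the sorted mz makes the ordering explicit.
    colnames(res) <- as.character(mz)
    res
}

spectra <- list(c(100.1, 200.2, 300.3), c(150.5, 200.2), numeric(0))
contaminantMz(spectra, mz = c(300.3, 200.2))
```

The result has one row per spectrum and one column per mz, with the columns in sorted mz order regardless of the order in which the values were passed.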

Referring back to this comment: in this case, I thought it would be worth mentioning that the mz values would be sorted if they weren't already. But even here, if we name the columns after the mz values, it might not even be needed.

lgatto commented 1 year ago

Here are two suggestions to implement what is needed for the above feature using containsMz():

Add a simplify argument

The default would be simplify = TRUE, which would retain the current behaviour, while simplify = FALSE would return a logical matrix of dimensions length(object) by length(mz), with colnames equal to the (sorted) as.character(mz), indicating whether the given mz is contained in the respective spectrum.
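A rough sketch of how the simplify switch could behave, again in base R on a plain list of m/z vectors rather than on a Spectra object. The function name containsMzSketch and the tolerance argument are hypothetical, and the assumption that the simplified result collapses each row with any or all (according to which) is mine, not something taken from the package code.

```r
# Hypothetical stand-in for containsMz(); `x` is a list of numeric m/z vectors.
containsMzSketch <- function(x, mz, tolerance = 0.01,
                             which = c("any", "all"), simplify = TRUE) {
    which <- match.arg(which)
    mz <- sort(mz)
    # One row per spectrum, one column per (sorted) mz.
    hits <- matrix(FALSE, nrow = length(x), ncol = length(mz),
                   dimnames = list(NULL, as.character(mz)))
    for (i in seq_along(x))
        for (j in seq_along(mz))
            hits[i, j] <- any(abs(x[[i]] - mz[j]) <= tolerance)
    if (!simplify)
        return(hits)
    # simplify = TRUE: collapse each row to a single logical (assumed to
    # mirror the existing any/all semantics of `which`).
    apply(hits, 1, if (which == "any") any else all)
}

spectra <- list(c(100.1, 200.2, 300.3), c(150.5, 200.2), numeric(0))
containsMzSketch(spectra, c(300.3, 200.2))                   # logical vector
containsMzSketch(spectra, c(300.3, 200.2), simplify = FALSE) # logical matrix
```

With this layout the simplified vector is just a row-wise reduction of the matrix, so both return types come from the same computation.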

Add a what argument

This new argument could take one of the values "any" or "all", returning respectively the default vector and the matrix of logicals described above. It could possibly also take a "which" value to return the index of the contained (sorted) mz.

This would actually be very confusing with the current which argument, so I suggest ignoring this suggestion. Or we repurpose which.

@sgibb @jorainer, what do you think?

jorainer commented 1 year ago

I would opt for adding a parameter simplify = FALSE to containsMz that does exactly what you propose.