lgatto / Spectrum

Spectrum Infrastructure for Mass Spectrometry Data

Suggestion - restrict modifying peaks #5

Open meowcat opened 5 years ago

meowcat commented 5 years ago

As we have discussed already, modifying mz and intensity can create issues with mapping spectral annotations. There are two options for mapping annotations - by peak indices or by m/z; the former will fail when subsetting (e.g. filtering) the peaks, the latter will fail when changing masses e.g. recalibrating.
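To make the two failure modes concrete, here is a minimal sketch in base R. It uses a plain data.frame rather than the actual Spectrum class, and the annotation lists (`ann_by_index`, `ann_by_mz`) are made up for illustration.

```r
# Toy spectrum as a plain data.frame; this is not the Spectrum class itself.
peaks <- data.frame(mz = c(300, 400, 500), intensity = c(100, 50, 60))

# Hypothetical annotation for the peak at m/z 500, stored in two ways.
ann_by_index <- list(index = 3L, label = "fragment A")
ann_by_mz    <- list(mz = 500, label = "fragment A")

# Subsetting: drop the lowest-intensity peak (m/z 400).
filtered <- peaks[peaks$intensity > 55, ]
# Position 3 no longer exists, so the index-based annotation dangles.
index_ok_after_filter <- ann_by_index$index <= nrow(filtered)  # FALSE
# The m/z-based annotation still finds its peak.
mz_ok_after_filter <- ann_by_mz$mz %in% filtered$mz            # TRUE

# Recalibration: shift every mass by +1.
recal <- transform(peaks, mz = mz + 1)
# The index-based annotation still points at the right row.
index_ok_after_recal <- recal$mz[ann_by_index$index] == 501    # TRUE
# The m/z-based annotation no longer matches anything.
mz_ok_after_recal <- ann_by_mz$mz %in% recal$mz                # FALSE
```

Either key survives one of the two operations and breaks on the other, which is why the restrictions below try to separate "change values" from "change the set of peaks".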

We could get around this if we impose the following restrictions:

E.g.

sp <- Spectrum(mz=c(300,400,500), intensity=c(100,50,60))
sp$mz <- c(301,401,501) # OK
sp[c("mz", "intensity")] <- list(c(301,401), c(100, 50)) # forbidden
sp$peakIndex <- sp$peakIndex[c(1, 2)] # will subset to the first two peaks
# this can also be used for reordering etc.,
# in any case this can trigger reindexing the annotations.

There is an issue with this, which is that I currently don't know how we could add peaks to the spectrum. This is an operation I definitely do. A solution would be to allow some kind of specBind or merge operation.

jorainer commented 5 years ago

I would opt for peak indices (or even peak IDs/names). Using numeric m/z to identify peaks can be problematic, because comparison of double data types depends on the precision of the system/representation of double.
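A short illustration of the double-comparison problem in base R. The `matchMz` helper and its default tolerance are assumptions made up for this sketch, not part of the package; they show what m/z-based lookup would need in order to be robust.

```r
# Classic example: both sides print as 0.3 but differ in the last bits.
exact_equal <- (0.1 + 0.2) == 0.3               # FALSE
# Comparing within a tolerance treats them as the same value.
tol_equal <- isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE

# Hypothetical helper: positions of peaks within an absolute tolerance
# of a query m/z. The tolerance value is an arbitrary choice.
matchMz <- function(query, mz, tol = 0.005) {
  which(abs(mz - query) <= tol)
}

hit <- matchMz(400.001, c(300, 400, 500))       # 2
```

Any m/z-keyed mapping would have to carry such a tolerance around, whereas an index or ID is compared exactly.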

To add peaks to a spectrum I would define a dedicated addPeaks method. And I am wondering if you would really allow $mz <- assignments, or if you would throw an error when that is called. While $mz <- and $intensity <- sound nice to have, providing a dedicated API with methods such as addPeaks, filterPeaks etc. might make it easier for the user to understand what is going on - but that's just my opinion.
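As a rough sketch of what such an API could look like, the following uses a plain list with stable peak IDs and an annotation table keyed by those IDs. Only the names addPeaks and filterPeaks come from this thread; `newSpectrum`, `annotatePeak`, the `nextId` field and the whole data layout are assumptions for illustration, not the package's design.

```r
# Toy constructor: a peak table with stable IDs plus an annotation table
# keyed by those IDs. Not the real Spectrum class.
newSpectrum <- function(mz, intensity) {
  stopifnot(length(mz) == length(intensity))
  list(
    peaks = data.frame(id = seq_along(mz), mz = mz, intensity = intensity),
    annotations = data.frame(id = integer(0), label = character(0)),
    nextId = length(mz) + 1L
  )
}

# Attach a label to an existing peak ID.
annotatePeak <- function(sp, id, label) {
  stopifnot(id %in% sp$peaks$id)
  sp$annotations <- rbind(sp$annotations, data.frame(id = id, label = label))
  sp
}

# New peaks get fresh IDs, so existing annotations are untouched.
addPeaks <- function(sp, mz, intensity) {
  stopifnot(length(mz) == length(intensity))
  ids <- sp$nextId + seq_along(mz) - 1L
  sp$peaks <- rbind(sp$peaks,
                    data.frame(id = ids, mz = mz, intensity = intensity))
  sp$nextId <- sp$nextId + length(mz)
  sp
}

# Removing peaks also drops annotations that referred to them.
filterPeaks <- function(sp, keep) {
  stopifnot(is.logical(keep), length(keep) == nrow(sp$peaks))
  sp$peaks <- sp$peaks[keep, , drop = FALSE]
  kept <- sp$annotations$id %in% sp$peaks$id
  sp$annotations <- sp$annotations[kept, , drop = FALSE]
  sp
}

sp <- newSpectrum(mz = c(300, 400, 500), intensity = c(100, 50, 60))
sp <- annotatePeak(sp, id = 3L, label = "fragment A")
sp <- addPeaks(sp, mz = 600, intensity = 20)
sp <- filterPeaks(sp, sp$peaks$intensity > 55)
```

After these calls the peaks with IDs 1 and 3 remain and the annotation on ID 3 is still attached, because every operation that changes the set of peaks goes through a function that keeps the annotation table consistent.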

lgatto commented 5 years ago

Definitely indices, to avoid float comparisons.

As @jorainer says, an API becomes mandatory if we need to maintain object validity depending on operations.

Hence my suggestion to keep things as simple as possible now, and compute values on-the-fly. If we hit a bottleneck, we'll reconsider. Premature optimisation is the root of all evil. And so is feature creep.

meowcat commented 5 years ago

> And I am wondering if you would really allow $mz <- assignments, or if you would throw an error when that is called.

My suggestion here was to allow $<- for modifying m/z or intensity (say, recalibrating) but not for adding or removing peaks.
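One way this restriction could be expressed is an S3 replacement method that accepts new values only when the number of peaks is unchanged. The class name `ToySpectrum` and its list layout are hypothetical and chosen only for this sketch; the actual Spectrum class may be built differently.

```r
# Hypothetical S3 class "ToySpectrum"; the real Spectrum class may differ.
ToySpectrum <- function(mz, intensity) {
  stopifnot(length(mz) == length(intensity))
  structure(list(mz = mz, intensity = intensity), class = "ToySpectrum")
}

# Replacement method: values may change, the number of peaks may not.
`$<-.ToySpectrum` <- function(x, name, value) {
  if (!name %in% c("mz", "intensity"))
    stop("only 'mz' and 'intensity' can be assigned")
  x <- unclass(x)  # work on the bare list to avoid re-dispatching here
  n <- length(x$mz)
  if (length(value) != n)
    stop("'", name, "' must keep its length of ", n,
         "; adding or removing peaks is not allowed via `$<-`")
  x[[name]] <- value
  structure(x, class = "ToySpectrum")
}

sp <- ToySpectrum(mz = c(300, 400, 500), intensity = c(100, 50, 60))
sp$mz <- sp$mz + 1                               # OK: still three peaks
res <- try(sp$mz <- c(301, 401), silent = TRUE)  # error: would drop a peak
```

Because the length never changes through `$<-`, index-based annotations stay valid after a recalibration, and anything that changes the set of peaks has to go through a separate operation.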

> feature creep

You are completely right. We should keep it simple.