rformassspectrometry / MetaboAnnotation

High level functionality to support and simplify metabolomics data annotation.
https://rformassspectrometry.github.io/MetaboAnnotation/

Matching against several annotation resources #88

Open jorainer opened 1 year ago

jorainer commented 1 year ago

Nir (@AharoniLab) had the excellent idea that it might be good to allow matching against several reference databases in one go. The idea would be to allow calls like:

res <- matchSpectra(sps_exp, references, param = ...)

where references would be a set of reference databases against which the function would sequentially match the experimental spectra sps_exp. An easy solution would be to simply use e.g. a list of Spectra objects as references, but we thought it might be even better to introduce a further abstraction that also simplifies things for the user: instead of having to download a database or open a connection to it, the user would pass an annotation source object. Such an object contains the information on how to connect to the database and performs all the necessary steps (i.e. connecting to the database, downloading the file, ...).

A further advantage is that this would allow matching against reference databases that are not fully open, because the user never gets a Spectra object with all the reference data. Example: WeizMass. Instead of needing a Spectra object with the full WeizMass data, a WeizMass annotation source object is passed to matchSpectra, and this object takes care of connecting to the database. As a result, only the matching data from WeizMass are returned, not the full WeizMass library.

Thus, by introducing annotation source objects that don't contain or provide any data themselves, we could enable matching against databases for which full data access is not possible, and it could also simplify things for the user (see example below).
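To make the idea of a data-free source object more concrete, here is a minimal sketch in plain R. It is not the MetaboAnnotation implementation: the class names (SketchSource, SketchFileSource), the generic queryMatches and the CSV layout are all hypothetical, and plain m/z values stand in for Spectra objects. The source only stores where the reference lives and hands back the matching rows.

```r
library(methods)

# Hypothetical virtual base class: a source stores *how* to reach a
# resource, never the reference data itself.
setClass("SketchSource", representation("VIRTUAL"))

# Hypothetical concrete source; `path` stands in for connection details.
setClass("SketchFileSource", contains = "SketchSource",
         slots = c(path = "character"))

# Hypothetical generic: return only the reference rows matching `mz`.
setGeneric("queryMatches", function(source, mz, ppm = 20)
    standardGeneric("queryMatches"))

setMethod("queryMatches", "SketchFileSource", function(source, mz, ppm = 20) {
    ref <- read.csv(source@path)          # resource is opened on demand
    tol <- mz * ppm / 1e6                 # ppm tolerance per query m/z
    hits <- lapply(seq_along(mz), function(i) {
        m <- ref[abs(ref$mz - mz[i]) <= tol[i], , drop = FALSE]
        if (nrow(m)) cbind(query_idx = i, m) else NULL
    })
    do.call(rbind, hits)                  # NULL if nothing matched
})

# Toy reference file standing in for a remote or restricted database.
ref_file <- tempfile(fileext = ".csv")
write.csv(data.frame(name = c("A", "B", "C"), mz = c(100, 200, 300)),
          ref_file, row.names = FALSE)

src <- new("SketchFileSource", path = ref_file)
res <- queryMatches(src, mz = c(100.001, 250))
```

In this sketch the caller only ever sees res, i.e. the rows that matched, which mirrors the point about databases that are not fully open.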

Implementation notes

What I would propose is the following:

A call to match against WeizMass could then e.g. look like:

res <- matchSpectra(sps_exp, WeizMassSource(version = 2), param = CompareSpectraParam(ppm = 20))

Or against WeizMass and MassBank:

res <- matchSpectra(
    sps_exp, 
    list(WeizMassSource(version = 2),
         MassBankSource("2022-03")),
    param = CompareSpectraParam(ppm = 20))
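How a list of sources could be processed sequentially can be sketched with S4 dispatch on "list". Again, this is only an illustration under stated assumptions: LazySketchSource, its loader slot and the generic matchSketch are hypothetical names, and numeric m/z vectors replace the Spectra objects and the CompareSpectraParam used in the calls above.

```r
library(methods)

# Hypothetical source class: `name` labels the resource, `loader` is a
# function that fetches reference m/z values only when it is called.
setClass("LazySketchSource",
         slots = c(name = "character", loader = "function"))

setGeneric("matchSketch", function(query, target, ppm = 20)
    standardGeneric("matchSketch"))

# Single source: load the reference lazily and keep pairs within `ppm`.
setMethod("matchSketch", c("numeric", "LazySketchSource"),
          function(query, target, ppm = 20) {
    ref <- target@loader()
    idx <- expand.grid(query_idx = seq_along(query),
                       ref_idx = seq_along(ref))
    keep <- abs(query[idx$query_idx] - ref[idx$ref_idx]) <=
        query[idx$query_idx] * ppm / 1e6
    cbind(source = rep(target@name, sum(keep)), idx[keep, , drop = FALSE])
})

# List of sources: match against each in turn and stack the results.
setMethod("matchSketch", c("numeric", "list"),
          function(query, target, ppm = 20) {
    do.call(rbind,
            lapply(target, function(s) matchSketch(query, s, ppm = ppm)))
})

# Toy sources; the reference values are made up for illustration.
src_a <- new("LazySketchSource", name = "toyA",
             loader = function() c(100, 200))
src_b <- new("LazySketchSource", name = "toyB",
             loader = function() c(200.001, 300))

res <- matchSketch(c(100.001, 200), list(src_a, src_b))
```

The source column records which resource each hit came from, which a combined result would probably need in some form once several databases are matched in one call.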

Input, comments etc highly welcome!

jorainer commented 1 year ago

A first version and initial classes are pushed to https://github.com/rformassspectrometry/MetaboAnnotation/commit/b13ddc5213ce458d8fe8cfd0f5c07dab5f96ea51 (annotation_source branch).

jorainer commented 1 year ago

Note: I've now opened a first PR (#89) that adds the basic concept of annotation sources and examples for integrating MassBank. Development to integrate WeizMass continues in the weizmass_source branch.