rformassspectrometry / RforMassSpectrometry.org

The R for Mass Spectrometry Initiative home page
http://www.rformassspectrometry.org/
Current status and future of MALDI-TOF related functions in the RforMassSpectrometry initiative #18

Closed cpauvert closed 9 months ago

cpauvert commented 9 months ago

Dear developers, I recently discovered your development initiative for Mass Spectrometry and I can only congratulate you on the amount of effort put into these improvements!

I'm not an expert in MS data but I'm humbly developing a package that relies a lot on {MALDIquant} (thanks a lot @sgibb!) to process data generated by the MALDI Biotyper.

So far, I was under the impression that most of the functions in RforMassSpectrometry were oriented towards mass spectrometry assays, like proteomics and metabolomics experiments, and I did not find much regarding MALDI-TOF related functions.

Best regards,

lgatto commented 9 months ago

Dear @cpauvert - thank you very much for your interest!

The reason MALDI is not prominent is simply that none of the main contributors (@sgibb, @jorainer, myself, and others) have such data at the moment. We try to make sure that the infrastructure we develop remains general enough.

It would be interesting to highlight which parts of what we have are applicable to MALDI data, and to identify what is missing and required for MALDI.

cpauvert commented 9 months ago

Thanks @lgatto for your quick reply. I'll try to find some time to look at what my package could do using the packages from the initiative and/or where it would break.

lgatto commented 9 months ago

And feel free to re-open this issue and make suggestions or discuss some contributions.

sgibb commented 9 months ago

@cpauvert thanks for the kind words! As @lgatto pointed out, we don't have or work with this kind of data.

The RforMassSpectrometry infrastructure offers many advantages over MALDIquant (e.g. on-disk data processing). Many MALDIquant functions are already part of MsCoreUtils or Spectra (e.g. smoothing, peak picking, binning, combining spectra). IMHO it would not be too hard to replace MALDIquant with Spectra. We just have to implement (copy and adapt) the baseline estimation/removal and the alignment functions. Unfortunately I won't have enough time to do it myself but I am happy to review pull requests!

(Currently there is no backend for Bruker's fid files. The readBrukerFlexData package is hard to maintain because Bruker doesn't offer any documentation and there are a lot of edge cases, so I would prefer not to implement a Bruker backend and to use the mzML backend instead. You would have to use the Compass tools to convert the fid files into mzML.)
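For readers unfamiliar with the baseline estimation step mentioned above, here is a minimal sketch of one common approach, a simplified SNIP-style iterative clipping (SNIP is one of the methods MALDIquant offers for baseline estimation). It is written in Python purely for illustration and is not code from MALDIquant, MsCoreUtils, or Spectra; the function names `snip_baseline` and `remove_baseline`, their parameters, and the decreasing window order are assumptions made for this example.

```python
def snip_baseline(intensity, iterations=20):
    """Estimate a baseline by iterative clipping (simplified SNIP sketch).

    For each window half-width p, every point is replaced by the smaller of
    its current value and the mean of its two neighbours at distance p.
    Narrow peaks are clipped away while the slowly varying background stays.
    """
    baseline = [float(v) for v in intensity]
    n = len(baseline)
    # Assumption for this sketch: go from the widest window to the narrowest.
    for p in range(iterations, 0, -1):
        previous = baseline[:]  # read from a frozen copy of the last pass
        for i in range(p, n - p):
            neighbour_mean = (previous[i - p] + previous[i + p]) / 2.0
            baseline[i] = min(previous[i], neighbour_mean)
    return baseline


def remove_baseline(intensity, iterations=20):
    """Subtract the estimated baseline from the raw intensities."""
    baseline = snip_baseline(intensity, iterations)
    return [v - b for v, b in zip(intensity, baseline)]


# Toy spectrum: constant background of 10 with a single peak of height 50.
raw = [10] * 5 + [50] + [10] * 5
corrected = remove_baseline(raw, iterations=3)
```

In this toy case the estimated baseline is the flat background, so only the peak survives the subtraction. A real implementation would also need to choose `iterations` relative to the expected peak width and handle the spectrum edges more carefully.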

lgatto commented 9 months ago

> IMHO it would not be too hard to replace MALDIquant with Spectra. We just have to implement (copy and adapt) the baseline estimation/removal and the alignment functions.

A first step here would be to move these (possibly with small adaptations) to MsCoreUtils. @sgibb - if you have time, you could simply list the functions that could be moved in an MsCoreUtils issue, which we could tag with help needed. This could be a first simple step in the direction discussed above.

cpauvert commented 9 months ago

> The RforMassSpectrometry infrastructure offers many advantages over MALDIquant (e.g. on-disk data processing). Many MALDIquant functions are already part of MsCoreUtils or Spectra (e.g. smoothing, peak picking, binning, combining spectra). IMHO it would not be too hard to replace MALDIquant with Spectra. We just have to implement (copy and adapt) the baseline estimation/removal and the alignment functions. Unfortunately I won't have enough time to do it myself but I am happy to review pull requests!

OK, thanks a lot, Sebastian, for this insider view; it means a lot. I can't guarantee a good implementation, but I'll see what I can do with the time I have.

> (Currently there is no backend for Bruker's fid files. The readBrukerFlexData package is hard to maintain because Bruker doesn't offer any documentation and there are a lot of edge cases, so I would prefer not to implement a Bruker backend and to use the mzML backend instead. You would have to use the Compass tools to convert the fid files into mzML.)

Thanks for the pointers. I wanted to save raw spectra as mzML in 'MALDIquant' but was losing a lot of metadata, so I decided against it. Doing the conversion in Compass sounds like a better option, thanks, I'll look into it!

At the end of the day, it's always about maintaining resources, but if there's also a risk that readBrukerFlexData will be deprecated at some point (even if 11 years is already so great!), it might be worth investing time to ensure that the scientific software we produce can still be used later. Therefore, I'll probably keep a branch of my package to see how a rewrite with the open source RforMassSpectrometry initiative works :)