Code in this repository can help with processing untargeted metabolomics data, e.g. from plant secondary metabolites or *E. coli* metabolites. The workflow relies on existing open source tools including XCMS Online (for LCMS data), an [in-house macro]() from Overy et al. 2005 (for MALDI or DI-MS data), and MassUp or MALDIquant (for MALDI data).
The code can prepare a peak intensity table which is suitable for undirected (PCA) and directed (OPLS-DA) analysis using MetaboAnalyst, R or SIMCA-P+ (proprietary).
For now, please cite this GitHub repository if you use the code (https://github.com/LizzyParkerPannell/Untargeted_metabolomics_workflow).
A manuscript (currently under review) providing an overview of the LCMS workflow is available at:
Parker, E.J.; Billane, K.C.; Austen, N.; Cotton, A.; George, R.M.; Hopkins, D.; Lake, J.A.; Pitman, J.K.; Prout, J.N.; Walker, H.J.; Williams, A.; Cameron, D.D. Untangling the Complexities of Processing and Analysis for Untargeted LC-MS Data Using Open-source Tools. Preprints 2023, 2023020056 (doi: 10.20944/preprints202302.0056.v1).
Elizabeth Parker1❋, Kathryn Billane2❋, James Pitman3, James Prout3, David Hopkins5, Alex Williams5, Heather Walker4, Rachel George4, Duncan Cameron6.
❋ EP and KB are grateful to the University of Sheffield for “Unleash your data and software funding” that facilitated documentation of this workflow.
With thanks to Harry Wright, Rachel George, Sophia van Mourik, Anne Cotton and Erika Hansson for their feedback.
Many of the difficulties in analysis and/or workflows come from the complexity of the experimental structure, and many terms are used interchangeably in different contexts. Most tools for untargeted metabolomics are set up for one-factor analysis with two or three levels e.g.
However, we quite often have more complex experimental designs when coming from other fields e.g.
Before you start, think about the following questions and make a note of which groups of metabolite fingerprints you expect to be similar and which you expect to differ from each other. This is not about forming a hypothesis; it is about thinking logically about what you are asking in your analysis and how your data will be grouped.
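One way to make that note concrete is to write the design out as a small table before touching any data. The sketch below is illustrative only and is not part of the repository's code: the factor names (`genotype`, `treatment`), their levels and the replicate count are hypothetical. It lists every sample in a two-factor design and adds a combined `group` label, which is one possible way to present such a design to a tool that expects a single factor.

```python
from collections import Counter
from itertools import product

# Hypothetical two-factor design; the names and levels are assumptions
# chosen for illustration, not taken from the workflow itself.
factors = {
    "genotype": ["wild_type", "mutant"],
    "treatment": ["control", "drought"],
}
replicates = 3  # assumed number of biological replicates per combination

# One record per sample, with a combined label for one-factor tools.
samples = []
for genotype, treatment in product(factors["genotype"], factors["treatment"]):
    for rep in range(1, replicates + 1):
        samples.append({
            "sample_id": f"{genotype}_{treatment}_{rep}",
            "genotype": genotype,
            "treatment": treatment,
            "group": f"{genotype}_{treatment}",
        })

# Count the samples in each combined group to check the design is balanced.
group_counts = Counter(sample["group"] for sample in samples)
for group, n in sorted(group_counts.items()):
    print(f"{group}: {n} samples")
```

Collapsing the factors into one label loses the distinction between main effects and interactions, so it is worth keeping the separate factor columns alongside the combined one.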