zhengfj1994 / MetEx

MetEx is a tool to extract and annotate metabolites from liquid chromatography–mass spectrometry data.

MetEx

MetEx is a tool to extract and annotate metabolites from liquid chromatography–mass spectrometry data.

Introduction

Liquid chromatography–high resolution mass spectrometry (LC-HRMS) is the most popular platform for untargeted metabolomics, but annotating LC-HRMS data has been a long-standing bottleneck in metabolomics research. A wide variety of methods have been established to address the annotation problem. To date, however, efficient, systematic, and easy-to-use tools tailored to the metabolomics community remain scarce. We therefore developed MetEx, a user-friendly and powerful software tool and web server that supports both classical peak detection-based annotation and a new annotation method based on targeted extraction algorithms. The new method can annotate more than twice as many metabolites as the classical peak detection-based method because it reduces the loss of metabolite signals during data preprocessing. MetEx is freely available at http://www.metaboex.cn/MetEx (web server) and https://sourceforge.net/projects/metex/files/ (offline standalone program); the source code is available at https://github.com/zhengfj1994/MetEx.

Figure 1. The workflow of MetEx

R package, offline standalone software and web server.

MetEx is available in three forms:

  1. R package https://github.com/zhengfj1994/MetEx
  2. Offline standalone software https://sourceforge.net/projects/metex/files/
  3. Web server http://www.metaboex.cn/MetEx

We recommend the offline standalone program as the first choice.

How to use the offline standalone software?

1. Download MetExApp.zip

Please download MetExApp.zip from SourceForge: https://sourceforge.net/projects/metex/files/

Figure 2. Download MetExApp from SourceForge

2. Unzip MetExApp.zip.

Figure 3. Unzipped MetExApp

3. Check browser

Please make sure a browser is available on your computer, because MetEx opens in your default browser. We tested Google Chrome, Microsoft Edge, and 360 Speed Browser; all of them can run MetEx.

4. Double-click MetEx.vbs

There is a file named MetEx.vbs in the unzipped folder. Double-click MetEx.vbs. Please wait patiently. The loading of the operating environment may take about 1 minute.

5. MetEx in browser

MetEx then opens in your default browser.

Figure 4. MetExApp opened in web browser

6. Overview of MetEx

On the left side of the MetEx page there are several options: Introduction, MetEx (Single file), MetEx (Mutilple), Classic annotation, Other software tools, Database download, Chromatographic systems, Help document, and Update. To annotate a single mass spectrometry data file with MetEx, select MetEx (Single file). To annotate multiple files and merge the results into one table, select MetEx (Mutilple). To annotate compounds based on peak detection results, select Classic annotation.

7. MetEx

Below we introduce MetEx: using MetEx to annotate an LC-MS file.

7.1 Parameters

On the page of MetEx, you can see that there are some parameters that need to be set. Their meanings and recommended values are as follows:

7.1.1 Database input
7.1.1.1 Database

​ A database file that meets MetEx's formatting requirements.

7.1.1.2 Ion mode

​ The ion mode used for annotation, 'positive' or 'negative'.

7.1.1.3 CE

The collision energy used for MS/MS acquisition: 'all', 'low', 'medium' or 'high'. The 'low', 'medium' and 'high' options can be used only when the low, medium and high collision energies in the database are 15, 30 and 45 eV; otherwise, please use the 'all' option.

7.1.2 Retention time calibration
7.1.2.1 Whether to perform tR calibration

'Yes' means that tR calibration will be performed, and you must provide an xlsx file for it. 'No' means that tR calibration will not be performed.

7.1.2.2 tR of internal standards

An xlsx file containing the retention times of the internal standards in the database and in the experiment. It should look like the figure below.

Figure 5. An xlsx file containing the retention times of internal standards used for retention time calibration

7.1.3 LC-MS data import
7.1.3.1 mzXML file

The mzXML file converted from the raw LC-MS data (using MSConvert in ProteoWizard).

7.1.3.2 mgf file

The mgf file converted from the raw LC-MS data (using MSConvert in ProteoWizard).

7.1.4 Parameters of MetEx (MS1)
7.1.4.1 Delta m/z of MS1

​ The tolerance of MS1 between database and experiment (0.01 Da is recommended for Q-TOF and 0.005 Da is recommended for QE-HF).
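As a rough illustration of what an absolute m/z tolerance means, the sketch below checks whether an experimental m/z falls within a given Da window around a database m/z. The function name `within_mz_tol` and the example values are hypothetical, and treating the bounds as inclusive is an assumption; MetEx's internal matching may differ.

```r
# Hypothetical helper (not part of MetEx): TRUE when the experimental m/z lies
# within +/- tol_da of the database m/z. Inclusive bounds are an assumption.
within_mz_tol <- function(mz_exp, mz_db, tol_da = 0.01) {
  abs(mz_exp - mz_db) <= tol_da
}

mz_db  <- 180.0634  # illustrative database value
mz_exp <- 180.0700  # illustrative measured value (about 0.0066 Da away)

within_mz_tol(mz_exp, mz_db, tol_da = 0.01)   # tolerance recommended for Q-TOF
within_mz_tol(mz_exp, mz_db, tol_da = 0.005)  # tolerance recommended for QE-HF
```

With these made-up values the measurement passes the wider Q-TOF tolerance but not the tighter QE-HF one, which is why the tolerance should reflect the mass accuracy of the instrument.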

7.1.4.2 Delta tR of MS1

The tolerance of retention time between database and experiment, in seconds.

7.1.4.3 Entropy threshold

​ The information entropy threshold, 1.75 - 2 is recommended.

7.1.4.4 Intensity threshold

​ The peak height threshold. 600-270 is recommended for Q-TOF.

7.1.5 Parameters of MetEx (MS2)
7.1.5.1 Delta m/z of MS1 and MS2

​ The tolerance between MS1 and MS2 in experiment.

7.1.5.2 Delta m/z of MS2

​ The tolerance of MS2 between database and experiment.

7.1.5.3 Delta tR of MS1 and MS2

​ The tolerance of tR between MS1 and MS2

7.1.5.4 MS2 score threshold

​ The MS2 score threshold (0-1)

7.1.6 Other parameters
7.1.6.1 Number of cores for parallel computing

The number of CPU cores used for parallel computing. It depends on your computer's CPU and RAM. Users can follow two rules: (1) the number of cores for parallel computing < the number of CPU cores of your computer; (2) the number of cores for parallel computing × 4 GB < the RAM of your computer.
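The two rules can be turned into a small calculation. The sketch below is only an illustration: `suggest_cores` is a hypothetical helper, not a MetEx function, and the RAM size has to be filled in by hand because base R has no portable way to query it.

```r
# Hypothetical helper (not part of MetEx): largest core count n satisfying
#   n < cpu_cores   and   n * gb_per_core < ram_gb
# Falls back to 1 when no value satisfies both rules.
suggest_cores <- function(cpu_cores, ram_gb, gb_per_core = 4) {
  by_cpu <- cpu_cores - 1
  by_ram <- ceiling(ram_gb / gb_per_core) - 1
  max(1, min(by_cpu, by_ram))
}

cpu_cores <- parallel::detectCores()   # may return NA on some platforms
if (is.na(cpu_cores)) cpu_cores <- 2   # conservative fallback (assumption)
ram_gb <- 16                           # set this to your machine's RAM in GB

suggest_cores(cpu_cores, ram_gb)
```

For example, on a machine with 8 CPU cores and 16 GB of RAM the RAM rule is the limiting one, so the sketch suggests 3 cores.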

7.1.6.2 show/hide Advance parameters
7.1.7 Advance parameters
7.1.7.1 Do you want to clean MS2?
7.1.7.2 MS2 noise removal threshold
7.1.7.3 Do you want to only keep the matched result with the highest score?
7.1.7.4 If you don't want to keep only one result for each feature, set the above parameter to No and set the minimum score you want to keep.
7.1.7.5 Do you want to only keep matched result?
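The difference between keeping only the best-scoring match per feature and keeping every match above a minimum score can be sketched on a toy table. The column names (`feature`, `compound`, `score`) and the helper functions are hypothetical and chosen for illustration only; they do not describe the layout of MetEx's result files.

```r
# Toy candidate table with made-up values; column names are assumptions.
candidates <- data.frame(
  feature  = c("F1", "F1", "F2", "F2"),
  compound = c("A", "B", "C", "D"),
  score    = c(0.9, 0.6, 0.4, 0.7)
)

# Option "Yes": keep only the highest-scoring candidate for each feature.
keep_best <- function(df) {
  df <- df[order(df$feature, -df$score), ]
  df[!duplicated(df$feature), ]
}

# Option "No" plus a minimum score: keep every candidate at or above it.
keep_above <- function(df, min_score) {
  df[df$score >= min_score, ]
}

keep_best(candidates)
keep_above(candidates, min_score = 0.5)
```

In this toy case the first option keeps one row per feature, while the second keeps both candidates for F1 and only the stronger one for F2.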

7.2 Run MetEx

Click the run button to start running.

Figure 6. Run MetEx

7.3 Download result

Figure 7. Download result

8. Classic annotation

Below we introduce Classic annotation: annotation from peak detection results.

8.1 Parameters

8.1.1 Database input

Same as 7.1.1.

8.1.2 Retention time calibration

Same as 7.1.2.

8.1.3 LC-MS data import
8.1.3.1 MS1 peak table (.csv file)

The peak detection result, saved as a csv file. We provide an example in inst/extdata/peakTable. Two columns named 'mz' and 'rt' are required; other columns are optional.
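A minimal peak table with the two required columns can be written from R as sketched below. All values are made up, the extra `intensity` column is just an example of an optional column, and the retention time unit (seconds) is an assumption; compare with the example file shipped in inst/extdata/peakTable before relying on this layout.

```r
# Illustrative peak table; values are made up and the retention time
# unit (seconds) is an assumption.
peak_table <- data.frame(
  mz        = c(132.1019, 180.0634, 205.0972),
  rt        = c(65.2, 310.8, 542.1),
  intensity = c(15000, 8200, 43000)  # optional extra column
)

# Write to a temporary location so the sketch leaves no files behind.
csv_path <- file.path(tempdir(), "peakTable_example.csv")
write.csv(peak_table, csv_path, row.names = FALSE)

# Confirm that the required columns survive a round trip through the file.
required <- c("mz", "rt")
check <- read.csv(csv_path)
stopifnot(all(required %in% names(check)))
```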

8.1.3.2 mgf file path

Same as 7.1.3.2.

8.1.4 Parameters of MetEx (MS1)
8.1.4.1 Delta m/z of MS1

Same as 7.1.4.1.

8.1.4.2 Delta tR of MS1

This differs slightly from 7.1.4.2. In 7.1.4.2 the value is the full retention time range, whereas in 8.1.4.2 it is the retention time deviation, so 120 s in 7.1.4.2 is equivalent to 60 s in 8.1.4.2.
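The relation between the two settings is simple arithmetic: a full range of width w corresponds to a deviation of ±w/2. The sketch below makes this explicit. The helper names are hypothetical, and centering the window on the database retention time is an assumption made for illustration.

```r
# Hypothetical helpers (not part of MetEx).
# Convert a full retention time range (seconds) into a +/- deviation.
range_to_deviation <- function(range_s) {
  range_s / 2
}

# Window around a database retention time for a given deviation.
# Centering on the database value is an assumption.
rt_window <- function(rt_db, deviation_s) {
  c(lower = rt_db - deviation_s, upper = rt_db + deviation_s)
}

dev <- range_to_deviation(120)       # range setting of 120 s
win <- rt_window(rt_db = 300, dev)   # illustrative database tR of 300 s
win
unname(diff(win))                    # width of the window in seconds
```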

8.1.5 Parameters of MetEx (MS2)

Same as 7.1.5.

8.1.6 Other parameters

Same as 7.1.6.

8.1.7 Advance parameters

Same as 7.1.7.

8.2 Run MetEx

Click the run button to start running.

8.3 Download result

How to use the web server?

1. Open MetEx web server

web server http://www.metaboex.cn/MetEx

Figure 8. Web server of MetEx

2. Overview of MetEx

The MetEx web server looks the same as the offline standalone software and is used in almost the same way, so the steps are not repeated here. Please note that the web server's resources are limited, so use the offline standalone program whenever possible. If the server's capacity is exceeded, an error will be reported.

How to use the R package?

1. Installation

1.1 Install R

If you don't have R installed, install it first.

>> R download here
Note: We developed MetEx in R 4.10.0. If you find problems when you use other versions, please contact us.
>> The old version of R

1.2 Install RStudio

We recommend installing RStudio, an integrated development environment (IDE) for R.
>> Rstudio download here

1.3 Install MetEx

Install the R package "devtools" and the other required packages, then install MetEx using the code below. >>The devtools package

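# Install the helper packages only if they are missing
if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
}
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
# xcms is distributed through Bioconductor
BiocManager::install("xcms")
# Install MetEx from GitHub
devtools::install_github("zhengfj1994/MetEx")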

It will take a few minutes to download the packages.

1.4 Offline install

If the installation in step 1.3 fails, users can download the project and install it offline as shown below:

Figure 9. Download the MetEx package from github.

Figure 10. Download the MetEx package from github(2).

Then, in RStudio, choose Packages > Install:

Figure 11. Package installation in RStudio

Finally, choose Install from Package Archive File (.zip; .tar.gz), select the MetEx package, and click Install.

Figure 12. Choose the MetEx-master.zip and install.

1.5 Call MetEx

Call MetEx to see if the installation was successful.

library(MetEx)

2. Run shinyApp in Rstudio

Enter the following line of code:

shiny::runApp(system.file("extdata/shinyApp", package = "MetEx"))
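If the app does not start, a common cause is that the package (or its bundled app folder) cannot be found. The sketch below wraps the same call in a check. `launch_metex_app` is a hypothetical wrapper, not a MetEx function; it relies only on the documented behaviour of `system.file()`, which returns an empty string when the path does not exist.

```r
# Hypothetical wrapper (not part of MetEx): launch the bundled Shiny app only
# if its folder can be located, otherwise print a hint and return FALSE.
launch_metex_app <- function() {
  app_dir <- system.file("extdata/shinyApp", package = "MetEx")
  if (!nzchar(app_dir)) {
    message("MetEx or its shinyApp folder was not found; check the installation.")
    return(invisible(FALSE))
  }
  shiny::runApp(app_dir)
}

# Usage (starts the app when MetEx is installed):
# launch_metex_app()
```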

Then you can run shinyApp in Rstudio.

Figure 13. Screen shot of Shiny App.

3. Functions and their parameters

The main functions and their parameters in MetEx

3.1 MetExAnnotation

This function integrates the individual processing steps, so a single line of code can complete the targeted extraction and annotation of metabolites.

3.2 ClassicalAnnotation

Annotation from peak table.

4. Examples

MetEx provides two approaches to annotate metabolites. The first is a peak-detection-independent method and the second is a peak-detection-dependent method. The first approach is newly developed and avoids the peak loss that occurs in conventional peak detection methods.

Dependencies

MetEx depends on the following packages. If the installation fails and you are told that one of them is missing, please install the missing packages manually: stats, openxlsx, tcltk, doSNOW, stringr, xcms, do, XML, progress, shinydashboard, shinycssloaders, shinyjs, ggrepel, DT, dplyr, foreach, jsonlite, snow, tidyr, purrr, rlang, BiocManager, knitr, shiny, ggplot2, RColorBrewer
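To see which of these packages are missing before installing MetEx, something like the sketch below can be used. `find_missing` is a hypothetical helper written for this example; it only checks whether a package directory exists and does not load anything. The install commands are left as comments so that running the sketch changes nothing on your system.

```r
# Dependency list copied from the paragraph above.
deps <- c("stats", "openxlsx", "tcltk", "doSNOW", "stringr", "xcms", "do",
          "XML", "progress", "shinydashboard", "shinycssloaders", "shinyjs",
          "ggrepel", "DT", "dplyr", "foreach", "jsonlite", "snow", "tidyr",
          "purrr", "rlang", "BiocManager", "knitr", "shiny", "ggplot2",
          "RColorBrewer")

# Hypothetical helper (not part of MetEx): names of packages that are not
# installed. system.file() returns "" when a package cannot be found.
find_missing <- function(pkgs) {
  installed <- vapply(pkgs, function(p) nzchar(system.file(package = p)),
                      logical(1))
  unname(pkgs[!installed])
}

missing_pkgs <- find_missing(deps)
print(missing_pkgs)

# stats and tcltk ship with R; xcms comes from Bioconductor, the rest
# are assumed to be available from CRAN:
# install.packages(setdiff(missing_pkgs, "xcms"))
# if ("xcms" %in% missing_pkgs) BiocManager::install("xcms")
```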

The uniform database format

Supported database

Retention time prediction

GNN-RT (https://github.com/Qiong-Yang/GNNRT) was used for retention time prediction in multiple chromatographic systems.

Maintainers

Fujian Zheng zhengfj@dicp.ac.cn or 2472700387@qq.com

Change Log

v1.0

The first version

v1.1

Fixed bugs

v1.2

Run faster

v1.3

Using Spectral entropy to calculate MS2 similarity

v1.4

Relaunch MetEx data processing options of multiple files.

Development Plan