sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis
Other
177 stars 81 forks source link

Is add=TRUE parameter still available for findChromPeaks? #618

Closed eterlova closed 2 years ago

eterlova commented 2 years ago

Hello all,

I want to perform peak picking on both MS1 and MS2, and keep both in the same XCMSnExp object.

I think that at some point it was possible to run findChromPeaks(... msLevel=2L, add=TRUE), is that not an option anymore?

I get an error: Error in .local(object, param, ...) : unused argument (add = TRUE)

If you deprecated this option, is there another one that could perform the same function?

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /home/FCAM/eterlova/miniconda3/envs/Rmetab/lib/libopenblasp-r0.3.17.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] magrittr_2.0.2              data.table_1.14.2          
 [3] xcms_3.16.1                 MSnbase_2.20.4             
 [5] ProtGenerics_1.26.0         mzR_2.28.0                 
 [7] Rcpp_1.0.8                  BiocParallel_1.28.3        
 [9] SummarizedExperiment_1.24.0 Biobase_2.54.0             
[11] GenomicRanges_1.46.1        GenomeInfoDb_1.30.1        
[13] IRanges_2.28.0              S4Vectors_0.32.3           
[15] BiocGenerics_0.40.0         MatrixGenerics_1.6.0       
[17] matrixStats_0.61.0         

loaded via a namespace (and not attached):
 [1] vsn_3.62.0             foreach_1.5.2          assertthat_0.2.1      
 [4] BiocManager_1.30.16    affy_1.72.0            GenomeInfoDbData_1.2.7
 [7] robustbase_0.93-9      impute_1.68.0          pillar_1.7.0          
[10] lattice_0.20-45        glue_1.6.2             limma_3.50.1          
[13] digest_0.6.29          RColorBrewer_1.1-2     XVector_0.34.0        
[16] colorspace_2.0-3       preprocessCore_1.56.0  Matrix_1.4-0          
[19] plyr_1.8.6             MALDIquant_1.21        XML_3.99-0.9          
[22] pkgconfig_2.0.3        zlibbioc_1.40.0        purrr_0.3.4           
[25] scales_1.1.1           RANN_2.6.1             affyio_1.64.0         
[28] tibble_3.1.6           generics_0.1.2         ggplot2_3.3.5         
[31] ellipsis_0.3.2         cli_3.3.0              MassSpecWavelet_1.60.0
[34] crayon_1.5.0           ncdf4_1.19             fansi_1.0.2           
[37] doParallel_1.0.17      MASS_7.3-55            MsFeatures_1.2.0      
[40] tools_4.1.0            lifecycle_1.0.1        munsell_0.5.0         
[43] cluster_2.1.2          DelayedArray_0.20.0    pcaMethods_1.86.0     
[46] compiler_4.1.0         mzID_1.32.0            rlang_1.0.2           
[49] grid_4.1.0             RCurl_1.98-1.6         iterators_1.0.14      
[52] MsCoreUtils_1.7.4      bitops_1.0-7           gtable_0.3.0          
[55] codetools_0.2-18       DBI_1.1.2              R6_2.5.1              
[58] dplyr_1.0.8            utf8_1.2.2             clue_0.3-60           
[61] parallel_4.1.0         vctrs_0.3.8            DEoptimR_1.0-10       
[64] tidyselect_1.1.2      
jorainer commented 2 years ago

This puzzles me - the parameter add was added to findChromPeaks,XCMSnExp in version 3.9.2 and has not been removed. On what object are you calling findChromPeaks with the parameter add = TRUE? Is it an OnDiskMSnExp (the object returned by readMSData) or an XCMSnExp object? Note that for the former no such parameter is defined. So, the first peak detection (presumably on MS1) should be performed without add = TRUE, but a subsequent findChromPeaks on the result object from the first peak detection supports add = TRUE.

eterlova commented 2 years ago

@jorainer, thanks for the clarification -- I couldn't quite figure out which of these objects to use from the function description, so I tried both. The reason why I thought it was not working is that after these steps:

xdata <- findChromPeaks(rawdata_cent, param = cw_param) 
xdata <- findChromPeaks(xdata, param = CentWaveParam(), msLevel = 2L, add=TRUE)
xdata_pp <- refineChromPeaks(xdata, mpp)
xdata_grouped1 <- groupChromPeaks(xdata_pp, param = PeakDensityParam()) 
xdata_rtaligned <- adjustRtime(xdata_grouped1, param = ObiwarpParam())
xdata_grouped2 <- groupChromPeaks(xdata_rtaligned, param = PeakDensityParam())
xdata_grouped2 <- groupChromPeaks(xdata_rtaligned, add = TRUE, msLevel = 2L)

I get an error:

Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘groupChromPeaks’ for signature ‘"XCMSnExp", "missing"’
Calls: groupChromPeaks -> <Anonymous>

Which I think it means that groupChromPeaks does not find the peaks to group? Any idea why my peaks are getting lost?

Best, Lisa

jorainer commented 2 years ago

The reason for the error message is that you called groupChromPeaks without specifying the parameter object. It should work if you use

xdata_grouped2 <- groupChromPeaks(xdata_rtaligned, param = PeakDensityParam(), add = TRUE, msLevel = 2L)

instead of the last line you used above

eterlova commented 2 years ago

@jorainer Thanks for spotting this, it was not intentional, but unfortunately even with that I end up with an error

xdata_rtaligned <- adjustRtime(xdata, param = ObiwarpParam())
xdata_grouped <- groupChromPeaks(xdata_rtaligned, param = PeakDensityParam())
xdata_grouped <- groupChromPeaks(xdata_rtaligned, param = PeakDensityParam(), add = TRUE, msLevel = 2L)

From the .out file:

Apply retention time correction performed on MS1 to spectra from all MS levels
Applying retention time adjustment to the identified chromatographic peaks ... OK
Processing 253328 mz slices ... OK
Processing 259646 mz slices ... OK
Error in .local(object, param, ...) : 
  No feature definitions for MS level 1 present. Please run 'groupChromPeaks' first.
Calls: fillChromPeaks -> fillChromPeaks -> .local
Execution halted
jorainer commented 2 years ago

Do you know in which of the two commands the error is thrown?

Can you please run the commands separately to see which call throws the error? Also, please check after each if features are presents, i.e. call hasFeatures(xdata_rtaligned)? Maybe even seprately for each MS level, i.e. hasFeatures(xdata_rtaligned, msLevel = 1) and hasFeatures(xdata_rtaligned, msLevel = 2)?

eterlova commented 2 years ago

@jorainer I was not able to reproduce the same error when I run the script line by line, but here is what I get:

> xdata <- findChromPeaks(rawdata_cent, param =CentWaveParam())
Error in x$.self$finalize() : attempt to apply non-function
Detecting mass traces at 32 ppm ... OK
Detecting chromatographic peaks in 45697 regions of interest ... OK: 4039 found.

Detecting mass traces at 32 ppm ... OK
Detecting chromatographic peaks in 83042 regions of interest ... OK: 7048 found.

Detecting mass traces at 32 ppm ... OK
Detecting chromatographic peaks in 85141 regions of interest ... OK: 7593 found.

Detecting mass traces at 32 ppm ... OK
Detecting chromatographic peaks in 65843 regions of interest ... OK: 9027 found.

> hasFeatures(xdata, msLevel = 1)
[1] FALSE

> xdata <- findChromPeaks(xdata, param = CentWaveParam(), msLevel = 2L, add=TRUE)
Detecting mass traces at 32 ppm ... OK
Detecting chromatographic peaks in 20523 regions of interest ... OK: 2778 found.

Detecting mass traces at 32 ppm ... OK
Detecting chromatographic peaks in 38996 regions of interest ... OK: 4541 found.

Detecting mass traces at 32 ppm ... OK
Detecting chromatographic peaks in 33806 regions of interest ... OK: 4535 found.

Detecting mass traces at 32 ppm ... OK
Detecting chromatographic peaks in 35617 regions of interest ... OK: 5206 found.

> hasFeatures(xdata, msLevel = 1)
[1] FALSE
> hasFeatures(xdata, msLevel = 2)
[1] FALSE

> xdata_pp <- refineChromPeaks(xdata, param=MergeNeighboringPeaksParam())
Evaluating 581 peaks in file DA_metabolomics2021_sample_127.mzML for merging ... OK

Evaluating 1147 peaks in file DA_metabolomics2021_sample_143.mzML for merging ... OK

Evaluating 1088 peaks in file DA_metabolomics2021_sample_111.mzML for merging ... OK

Evaluating 1497 peaks in file DA_metabolomics2021_sample_050.mzML for merging ... OK

Merging reduced 44767 chromPeaks to 44452.

> hasFeatures(xdata_pp, msLevel = 1)
[1] FALSE
> hasFeatures(xdata_pp, msLevel = 2)
[1] FALSE

> xdata_grouped1 <- groupChromPeaks(xdata_pp, param = PeakDensityParam()) 
Processing 253240 mz slices ... 
OK
> hasFeatures(xdata_grouped1, msLevel = 1)
[1] TRUE
> hasFeatures(xdata_grouped1, msLevel = 2)
[1] FALSE

> xdata_grouped1 <- groupChromPeaks(xdata_pp, param= PeakDensityParam(), add=TRUE, msLevel = 2L) 
Processing 259377 mz slices ... OK
> hasFeatures(xdata_grouped1, msLevel = 1)
[1] FALSE
> hasFeatures(xdata_grouped1, msLevel = 2)
[1] TRUE

I figured here is where I am losing my MS2 peaks, so I tried to back up a little:

> xdata_rtaligned <- adjustRtime(xdata_pp, param = ObiwarpParam())
Sample number 2 used as center sample.
Aligning DA_metabolomics2021_sample_050.mzML against DA_metabolomics2021_sample_111.mzML ... OK

Aligning DA_metabolomics2021_sample_127.mzML against DA_metabolomics2021_sample_111.mzML ... OK

Aligning DA_metabolomics2021_sample_143.mzML against DA_metabolomics2021_sample_111.mzML ... OK

Apply retention time correction performed on MS1 to spectra from all MS levels
Applying retention time adjustment to the identified chromatographic peaks ... OK
> hasFeatures(xdata_rtaligned, msLevel = 1)
[1] FALSE
> hasFeatures(xdata_rtaligned, msLevel = 2)
[1] FALSE

> xdata_grouped2 <- groupChromPeaks(xdata_rtaligned, param = PeakDensityParam())
Processing 253240 mz slices ... OK
> hasFeatures(xdata_grouped2, msLevel = 1)
[1] TRUE
> hasFeatures(xdata_grouped2, msLevel = 2)
[1] FALSE
> xdata_grouped2 <- groupChromPeaks(xdata_rtaligned, param = PeakDensityParam(), add = TRUE, msLevel = 2L)
Processing 259377 mz slices ... OK
> hasFeatures(xdata_grouped2, msLevel = 2)
[1] TRUE
> hasFeatures(xdata_grouped2, msLevel = 1)
[1] FALSE

So I think it is impossible to have groupings MS1 and MS2 peaks in the same object, is that right? I think @sneumann said something to that effect?

Best, Lisa

jorainer commented 2 years ago

Thanks for the info! OK, so it seems MS1 features get dropped when MS2 features are added. I'll have a look into the code. Just out of curiosity - what MS2 data do you have? Usually I'm grouping peaks in MS1 to define the features, but I'm not grouping MS2 chrom peaks but use them "only" for annotation of the MS1 signals (which I use for quantification).

eterlova commented 2 years ago

@jorainer Thank you for looking into this! I am trying to analyze my MS/MSe data using RAMClustR. Corey told me that I don't have to split my spectra into two separate files (one for MS and one for MSe) if I use these functions. I thought it was a much more elegant solution, but it seems not to be working

jorainer commented 2 years ago

Yes, can well be that there's a bug - I don't have such MSe data available to run unit tests on it. That's maybe why it never showed up. I'll dig into the code and would then ask you to test.

jorainer commented 2 years ago

Checked the code and all should be good with the add parameter. But then I checked your example code again, and the reason that you don't have MS1 and MS2 features in your object is that you're doing the peak grouping twice on the same input object (i.e. xdata_rtaligned). You would run the second peak grouping on MS2 on the xdata_grouped2 to add the features to the already defined features.

So, using

xdata_rtaligned <- adjustRtime(xdata_pp, param = ObiwarpParam())
xdata_grouped2 <- groupChromPeaks(xdata_rtaligned, param = PeakDensityParam())
xdata_grouped2 <- groupChromPeaks(xdata_grouped2, param = PeakDensityParam(), add = TRUE, msLevel = 2L)

should work: you need to first run the peak grouping on the aligned XCMSnExp object and then the second round of peak grouping on MS2 on the same XCMSnExp object (to add them to the already existing one).

eterlova commented 2 years ago

Thank you so much! Of course this worked!

> hasFeatures(xdata_grouped2, msLevel = 1)
[1] TRUE
> hasFeatures(xdata_grouped2, msLevel = 2)
[1] TRUE

Gonna close the issue now