sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis

Extract MS2 from a few samples and maintain alignment and feature assignment #515

Closed ghost closed 3 years ago

ghost commented 3 years ago

Hi!

Is there a way to extract MS2 spectra from a few samples in an experiment while keeping their feature assignment? I am working on a dataset of 16 samples (without MS2) plus three pooled samples for which MS2 was acquired. I am having trouble extracting the MS2 spectra with the featureSpectra function. When I used all the samples in my data object, I got an error because not all samples have MS2. When I filtered the object with filterFile to retain only the pooled samples, it worked, but the alignment and feature correspondence were lost. Any suggestions?

Thanks a lot!

jorainer commented 3 years ago

You can use filterFile with parameter keepFeatures = TRUE to prevent the correspondence and alignment results from being dropped (this might need a more recent version of xcms, such as >= 3.10).
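A minimal sketch of that approach, assuming xcms is attached with library(xcms) and that the object is an XCMSnExp on which peak detection, alignment and correspondence have already been run. The helper name extract_pooled_ms2 and its arguments are made up for illustration and are not part of xcms:

```r
# Sketch only. Assumes library(xcms) has been called and that the installed
# xcms version supports the keepFeatures argument of filterFile().
# `extract_pooled_ms2`, `xdata` and `pooled_idx` are illustrative names.
extract_pooled_ms2 <- function(xdata, pooled_idx) {
    # Fail early with a clear message if xcms is not attached
    stopifnot("package:xcms" %in% search())
    # Subset to the pooled files while keeping the feature definitions
    pooled <- filterFile(xdata, file = pooled_idx, keepFeatures = TRUE)
    # Collect the MS2 spectra associated with each feature
    featureSpectra(pooled, msLevel = 2L)
}

# Hypothetical usage, if the pooled samples were files 17 to 19:
# ms2 <- extract_pooled_ms2(data_cent, 17:19)
```

The file indices in the usage line are an assumption; they should be replaced by the positions (or the existing pooled_samples vector) of the files that actually contain MS2 scans.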

ghost commented 3 years ago

It didn't work. I have xcms_3.10.2, and I got this error:

> data_cent.MS2 <- data_cent %>% 
+   filterFile(pooled_samples, keepFeatures = TRUE)
Error in .local(object, ...) : unused argument (keepFeatures = TRUE)

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods  
[9] base     

other attached packages:
 [1] RColorBrewer_1.1-2  xcms_3.10.2         MSnbase_2.14.2      ProtGenerics_1.20.0
 [5] S4Vectors_0.26.1    mzR_2.22.0          Rcpp_1.0.5          BiocParallel_1.22.0
 [9] Biobase_2.48.0      BiocGenerics_0.34.0 forcats_0.5.0       stringr_1.4.0      
[13] dplyr_1.0.2         purrr_0.3.4         readr_1.4.0         tidyr_1.1.2        
[17] tibble_3.0.3        ggplot2_3.3.2       tidyverse_1.3.0    

loaded via a namespace (and not attached):
 [1] matrixStats_0.57.0          bitops_1.0-6                fs_1.5.0                   
 [4] lubridate_1.7.9             doParallel_1.0.15           httr_1.4.2                 
 [7] GenomeInfoDb_1.24.2         tools_4.0.2                 backports_1.1.10           
[10] R6_2.4.1                    affyio_1.58.0               DBI_1.1.0                  
[13] colorspace_1.4-1            withr_2.3.0                 tidyselect_1.1.0           
[16] MassSpecWavelet_1.54.0      compiler_4.0.2              preprocessCore_1.50.0      
[19] cli_2.0.2                   rvest_0.3.6                 xml2_1.3.2                 
[22] DelayedArray_0.14.1         scales_1.1.1                DEoptimR_1.0-8             
[25] robustbase_0.93-6           affy_1.66.0                 digest_0.6.25              
[28] XVector_0.28.0              pkgconfig_2.0.3             dbplyr_1.4.4               
[31] limma_3.44.3                rlang_0.4.7                 readxl_1.3.1               
[34] rstudioapi_0.11             impute_1.62.0               generics_0.0.2             
[37] jsonlite_1.7.1              mzID_1.26.0                 RCurl_1.98-1.2             
[40] magrittr_1.5                GenomeInfoDbData_1.2.3      Matrix_1.2-18              
[43] MALDIquant_1.19.3           munsell_0.5.0               fansi_0.4.1                
[46] lifecycle_0.2.0             vsn_3.56.0                  stringi_1.5.3              
[49] MASS_7.3-53                 SummarizedExperiment_1.18.2 zlibbioc_1.34.0            
[52] plyr_1.8.6                  grid_4.0.2                  blob_1.2.1                 
[55] crayon_1.3.4                lattice_0.20-41             haven_2.3.1                
[58] hms_0.5.3                   knitr_1.30                  pillar_1.4.6               
[61] GenomicRanges_1.40.0        codetools_0.2-16            reprex_0.3.0               
[64] XML_3.99-0.5                glue_1.4.2                  pcaMethods_1.80.0          
[67] BiocManager_1.30.10         modelr_0.1.8                vctrs_0.3.4                
[70] foreach_1.5.0               cellranger_1.1.0            RANN_2.6.1                 
[73] gtable_0.3.0                assertthat_0.2.1            xfun_0.18                  
[76] broom_0.7.1                 ncdf4_1.17                  iterators_1.0.12           
[79] IRanges_2.22.2              ellipsis_0.3.1    

I also tried installing the most recent release (3.11), and got the same error.

jorainer commented 3 years ago

OK, I've just seen that a version >= 3.11.1 is required. To use it you will have to install both MSnbase and xcms from GitHub:

BiocManager::install("lgatto/MSnbase")
BiocManager::install("sneumann/xcms")

These are the current development versions, but they can be considered fairly stable: they will become the release versions in 2 weeks, when Bioconductor version 3.12 is released.
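After installing, it may help to confirm that the version R actually picks up meets the 3.11.1 requirement mentioned above. A small sketch in base R; has_min_version is an illustrative helper, not a function from xcms or BiocManager:

```r
# TRUE if `pkg` is installed at version `min_version` or newer.
# `has_min_version` is an illustrative helper, not part of any package.
has_min_version <- function(pkg, min_version) {
    requireNamespace(pkg, quietly = TRUE) &&
        utils::packageVersion(pkg) >= min_version
}

# The 3.11.1 threshold is the one stated in this thread
if (!has_min_version("xcms", "3.11.1")) {
    message("xcms >= 3.11.1 not found; filterFile() will not accept keepFeatures")
}
```

Restarting the R session before running the check avoids testing an older copy of the package that is still loaded.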

ghost commented 3 years ago

It worked! Thank you so much for your help!

jorainer commented 3 years ago

Closing this issue now - feel free to re-open if needed @nathaliagg