rformassspectrometry / Spectra

Low level infrastructure to handle MS spectra
https://rformassspectrometry.github.io/Spectra/

unusual behaviour when concatenating spectra #197

Open ococrook opened 3 years ago

ococrook commented 3 years ago

Concatenating spectra is straightforward using c. However, I noticed some odd behaviour when concatenating repeatedly.

Example: out is a list of 3 Spectra. Processing has been applied.

head(out)
 [[1]]
MSn data (Spectra) with 1 spectra in a MsBackendDataFrame backend:
    msLevel     rtime scanIndex
  <integer> <numeric> <integer>
1         1   274.117       144
 ... 110 more variables/columns.
Processing:
 Switch backend from MsBackendMzR to MsBackendDataFrame [Mon Apr 12 16:26:30 2021]
 Filter: select retention time [273.804252..286.451748] on MS level(s) 1 2 [Tue Apr 13 10:20:51 2021]
 Filter: removed empty spectra. [Tue Apr 13 10:20:51 2021]
 ...8 more processings. Use 'processingLog' to list all. 

[[2]]
MSn data (Spectra) with 1 spectra in a MsBackendDataFrame backend:
    msLevel     rtime scanIndex
  <integer> <numeric> <integer>
1         1   200.403         1
 ... 117 more variables/columns.
Processing:
 Switch backend from MsBackendMzR to MsBackendDataFrame [Mon Apr 12 16:26:30 2021]
 Filter: select retention time [197.1202026..209.1037974] on MS level(s) 1 2 [Tue Apr 13 10:20:53 2021]
 Filter: removed empty spectra. [Tue Apr 13 10:20:53 2021]
 ...10 more processings. Use 'processingLog' to list all. 

[[3]]
MSn data (Spectra) with 1 spectra in a MsBackendDataFrame backend:
    msLevel     rtime scanIndex
  <integer> <numeric> <integer>
1         1   231.774        62
 ... 120 more variables/columns.
Processing:
 Switch backend from MsBackendMzR to MsBackendDataFrame [Mon Apr 12 16:26:30 2021]
 Filter: select retention time [231.51235702061..242.71660942061] on MS level(s) 1 2 [Tue Apr 13 10:21:02 2021]
 Filter: removed empty spectra. [Tue Apr 13 10:21:02 2021]
 ...10 more processings. Use 'processingLog' to list all. 

Now the following doesn't work, and the error message is quite opaque.

Reduce(c, out[c(1,2,3)])

Error in length(xi) <- max_len : cannot set length of non-(vector or list)

However, the following all work (!)

Reduce(c, out[c(1,3,2)])
Reduce(c, out[c(3,1,2)])
Reduce(c, out[c(3,2,1)])
Reduce(c, out[c(2,1,3)])

e.g. Reduce(c, out[c(2,1,3)]) gives:

MSn data (Spectra) with 3 spectra in a MsBackendDataFrame backend:
    msLevel     rtime scanIndex
  <integer> <numeric> <integer>
1         1   200.403         1
2         1   274.117       144
3         1   231.774        62
 ... 120 more variables/columns.
Processing:
 Switch backend from MsBackendMzR to MsBackendDataFrame [Mon Apr 12 16:26:30 2021]
 Filter: select retention time [197.1202026..209.1037974] on MS level(s) 1 2 [Tue Apr 13 10:20:53 2021]
 Filter: removed empty spectra. [Tue Apr 13 10:20:53 2021]
 ...30 more processings. Use 'processingLog' to list all.

Thoughts?

sessionInfo()
R version 4.0.3 Patched (2020-10-23 r79367)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] hdxmspro_1.0.0      IonMobility_1.0.0   Spectra_1.1.13      ProtGenerics_1.23.7 BiocParallel_1.23.2 S4Vectors_0.27.14   BiocGenerics_0.35.4

loaded via a namespace (and not attached):
 [1] pkgload_1.1.0        jsonlite_1.7.1       splines_4.0.3        foreach_1.5.0        StanHeaders_2.21.0-6 RcppParallel_5.0.2  
 [7] assertthat_0.2.1     BiocManager_1.30.10  yaml_2.2.1           remotes_2.2.0        sessioninfo_1.1.1    globals_0.13.1      
[13] pillar_1.4.6         backports_1.1.10     lattice_0.20-41      glue_1.4.1           digest_0.6.25        colorspace_1.4-1    
[19] htmltools_0.5.1.1    Matrix_1.2-18        pkgconfig_2.0.3      devtools_2.3.2       rstan_2.21.2         listenv_0.8.0       
[25] purrr_0.3.4          scales_1.1.1         processx_3.4.4       tibble_3.0.1         mgcv_1.8-33          generics_0.0.2      
[31] IRanges_2.23.10      ggplot2_3.3.2        usethis_1.6.3        ellipsis_0.3.1       withr_2.3.0          cli_2.1.0           
[37] survival_3.2-7       magrittr_1.5         crayon_1.3.4         memoise_1.1.0        evaluate_0.14        ps_1.4.0            
[43] fs_1.5.0             fansi_0.4.1          future_1.19.1        nlme_3.1-150         MASS_7.3-53          pkgbuild_1.1.0      
[49] loo_2.3.1            tools_4.0.3          prettyunits_1.1.1    BiocStyle_2.17.1     matrixStats_0.57.0   lifecycle_0.2.0     
[55] V8_3.2.0             munsell_0.5.0        glmnet_4.0-2         callr_3.5.1          compiler_4.0.3       rlang_0.4.6         
[61] grid_4.0.3           iterators_1.0.12     rstudioapi_0.11      MsCoreUtils_1.1.7    rmarkdown_2.4        testthat_2.3.2      
[67] gtable_0.3.0         codetools_0.2-16     inline_0.3.16        curl_4.3             R6_2.4.1             gridExtra_2.3       
[73] knitr_1.30           dplyr_1.0.0          future.apply_1.6.0   rprojroot_1.3-2      shape_1.4.5          desc_1.2.0          
[79] Rcpp_1.0.6           vctrs_0.3.0          tidyselect_1.1.0     xfun_0.19

jorainer commented 3 years ago

Could you try using concatenateSpectra instead of c? concatenateSpectra(out) should work, or alternatively do.call("c", out). It might not be related, but I noticed that c fails if the list elements are named.

ococrook commented 3 years ago

Interestingly, do.call works. Is concatenateSpectra in the devel version?

jorainer commented 3 years ago

Actually, I've never heard of Reduce, so do.call is what I usually use. concatenateSpectra is in the devel version; it does the same as c, but it does not break when the provided list of Spectra has names.
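
For readers comparing the two approaches mentioned in this thread, here is a minimal base R sketch of how Reduce and do.call invoke the combining function differently. It uses plain integer vectors as stand-ins (hypothetical data, not Spectra objects), and counting_c is an illustrative helper, not part of any package. It does not reproduce or explain the error reported above; it only shows that Reduce combines pairwise and so calls c on intermediate results, while do.call makes a single call with all elements, and that list names become argument names under do.call.

```r
# Stand-in data: three small integer vectors (not Spectra objects).
parts <- list(1:2, 3:4, 5:6)

# Reduce() folds pairwise: c(c(parts[[1]], parts[[2]]), parts[[3]]).
pairwise <- Reduce(c, parts)

# do.call() makes one call: c(parts[[1]], parts[[2]], parts[[3]]).
single <- do.call(c, parts)

# For plain vectors both give the same result.
same_result <- identical(pairwise, single)

# Hypothetical helper that counts how often the combining function runs.
calls <- 0L
counting_c <- function(...) {
    calls <<- calls + 1L
    c(...)
}

Reduce(counting_c, parts)
reduce_calls <- calls    # one call per pairwise step

calls <- 0L
do.call(counting_c, parts)
docall_calls <- calls    # a single call with all elements

# With do.call(), names on the list are passed as argument names.
named <- do.call(c, list(a = 1, b = 2))
named_names <- names(named)
```

With a list of Spectra, the equivalent calls would be Reduce(c, out) and do.call(c, out) as discussed above; whether the intermediate objects created by the pairwise approach are what triggers the error is not established in this thread.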