rformassspectrometry / Spectra

Low level infrastructure to handle MS spectra
https://rformassspectrometry.github.io/Spectra/
34 stars 24 forks source link

Exporting to mzML #145

Closed ococrook closed 2 years ago

ococrook commented 3 years ago

I'm trying to export a spectra object with backend MsBackendMzR() to a mzML file as given by the example in the vignette. However, I get an error which is a bit cryptic. Sorry Johannes I'm no trying to break everying on purpose ....

exampleIonMob1
MSn data (Spectra) with 25000 spectra in a MsBackendMzR backend:
        msLevel     rtime scanIndex
      <integer> <numeric> <integer>
1             1   574.303    307662
2             2   295.926    130990
3             2   659.956    362070
4             1   644.558    352381
5             2   234.457     91995
...         ...       ...       ...
24996         2   214.332     79055
24997         1   612.014    331754
24998         2   199.876     69909
24999         2   429.040    215512
25000         2   159.767     44531
 ... 33 more variables/columns.

file(s):
191115_PhosB_HDMSe_0sec_01_vial01.mzML
fl <- tempfile()
export(exampleIonMob1, MsBackendMzR(), file = fl)
Error in z[1L, 1L] : subscript out of bounds
21.
stop(e)
20.
value[[3L]](cond)
19.
tryCatchOne(expr, names, parentenv, handlers[[1L]])
18.
tryCatchList(expr, classes, parentenv, handlers)
17.
tryCatch({ FUN(...) }, error = handle_error)
16.
withCallingHandlers({ tryCatch({ FUN(...) }, error = handle_error) ...
15.
FUN(...)
14.
FUN(X[[i]], ...)
13.
lapply(X, FUN_, ...)
12.
bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = param)
11.
bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = param)
10.
bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd = ddd, .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
9.
bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd = ddd, .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
8.
bpmapply(.write_ms_data_mzR, split(x, f), levels(f), MoreArgs = list(format = format, copy = copy), BPPARAM = BPPARAM)
7.
bpmapply(.write_ms_data_mzR, split(x, f), levels(f), MoreArgs = list(format = format, copy = copy), BPPARAM = BPPARAM)
6.
.local(object, ...)
5.
export(backend, object, ...)
4.
export(backend, object, ...)
3.
.local(object, ...)
2.
export(exampleIonMob1, MsBackendMzR(), file = fl)
1.
export(exampleIonMob1, MsBackendMzR(), file = fl)
sessionInfo()
 version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] forcats_0.5.0        stringr_1.4.0        dplyr_1.0.0          purrr_0.3.4          readr_1.4.0         
 [6] tidyr_1.1.2          tibble_3.0.1         ggplot2_3.3.2        tidyverse_1.3.0      Spectra_0.99.8      
[11] MASS_7.3-51.6        pheatmap_1.0.12      pRoloc_1.29.0        BiocParallel_1.23.2  MLInterfaces_1.69.0 
[16] cluster_2.1.0        annotate_1.67.1      XML_3.99-0.5         AnnotationDbi_1.51.3 IRanges_2.23.10     
[21] MSnbase_2.15.7       ProtGenerics_1.21.0  S4Vectors_0.27.14    mzR_2.23.1           Rcpp_1.0.4.6        
[26] Biobase_2.49.1       BiocGenerics_0.35.4 

loaded via a namespace (and not attached):
  [1] backports_1.1.10      readxl_1.3.1          BiocFileCache_1.13.1  plyr_1.8.6            splines_4.0.2        
  [6] inline_0.3.16         digest_0.6.25         htmltools_0.5.0       foreach_1.5.0         viridis_0.5.1        
 [11] fansi_0.4.1           magrittr_1.5          memoise_1.1.0         doParallel_1.0.15     mixtools_1.2.0       
 [16] openxlsx_4.1.5        limma_3.45.14         recipes_0.1.13        modelr_0.1.8          gower_0.2.2          
 [21] matrixStats_0.57.0    RcppParallel_5.0.2    askpass_1.1           lpSolve_5.6.15        prettyunits_1.1.1    
 [26] colorspace_1.4-1      rvest_0.3.6           blob_1.2.1            rappdirs_0.3.1        haven_2.2.0          
 [31] xfun_0.18             callr_3.5.1           jsonlite_1.7.1        crayon_1.3.4          RCurl_1.98-1.2       
 [36] hexbin_1.28.1         lme4_1.1-23           impute_1.63.0         survival_3.1-12       iterators_1.0.12     
 [41] glue_1.4.1            gtable_0.3.0          ipred_0.9-9           zlibbioc_1.35.0       V8_3.2.0             
 [46] pkgbuild_1.1.0        car_3.0-10            kernlab_0.9-29        rstan_2.21.2          abind_1.4-5          
 [51] scales_1.1.1          vsn_3.57.0            mvtnorm_1.1-1         DBI_1.1.0             viridisLite_0.3.0    
 [56] xtable_1.8-4          progress_1.2.2        foreign_0.8-80        bit_4.0.4             proxy_0.4-24         
 [61] mclust_5.4.6          preprocessCore_1.51.0 StanHeaders_2.21.0-6  MsCoreUtils_1.1.7     lava_1.6.8           
 [66] prodlim_2019.11.13    sampling_2.8          httr_1.4.2            FNN_1.1.3             RColorBrewer_1.1-2   
 [71] ellipsis_0.3.1        loo_2.3.1             farver_2.0.3          pkgconfig_2.0.3       nnet_7.3-14          
 [76] dbplyr_1.4.4          caret_6.0-86          labeling_0.3          tidyselect_1.1.0      rlang_0.4.6          
 [81] reshape2_1.4.4        munsell_0.5.0         cellranger_1.1.0      tools_4.0.2           LaplacesDemon_16.1.4 
 [86] cli_2.1.0             generics_0.0.2        RSQLite_2.2.1         broom_0.7.1           evaluate_0.14        
 [91] mzID_1.27.0           yaml_2.2.1            processx_3.4.4        fs_1.5.0              ModelMetrics_1.2.2.2 
 [96] knitr_1.30            bit64_4.0.5           zip_2.0.4             randomForest_4.6-14   dendextend_1.14.0    
[101] ncdf4_1.17            nlme_3.1-148          xml2_1.3.2            biomaRt_2.45.5        compiler_4.0.2       
[106] rstudioapi_0.11       curl_4.3              e1071_1.7-3           affyio_1.59.0         reprex_0.3.0         
[111] statmod_1.4.34        stringi_1.4.6         ps_1.4.0              lattice_0.20-41       Matrix_1.2-18        
[116] nloptr_1.2.2.1        vctrs_0.3.0           pillar_1.4.6          lifecycle_0.2.0       BiocManager_1.30.10  
[121] MALDIquant_1.19.3     data.table_1.12.8     bitops_1.0-6          R6_2.4.1              pcaMethods_1.81.0    
[126] affy_1.67.1           gridExtra_2.3         rio_0.5.16            codetools_0.2-16      boot_1.3-25          
[131] gtools_3.8.2          assertthat_0.2.1      openssl_1.4.3         withr_2.3.0           hms_0.5.3            
[136] grid_4.0.2            rpart_4.1-15          timeDate_3043.102     coda_0.19-4           class_7.3-17         
[141] minqa_1.2.4           rmarkdown_2.4         segmented_1.2-0       carData_3.0-4         pROC_1.16.2          
[146] lubridate_1.7.9  
jorainer commented 3 years ago

Indeed - pretty cryptic. Could you please disable parallel processing first by setting register(SerialParam()) and then try again. That way we can hopefully see the real error message (I guess something fails in mzR - we never tested writing ion mobility data).

ococrook commented 3 years ago

Unfortunatly no less crytic...

register(SerialParam())
registered()
$SerialParam
class: SerialParam
  bpisup: FALSE; bpnworkers: 1; bptasks: 0; bpjobname: BPJOB
  bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
  bpRNGseed: ; bptimeout: 2592000; bpprogressbar: FALSE
  bpexportglobals: TRUE
  bplogdir: NA
  bpresultdir: NA

$SnowParam
class: SnowParam
  bpisup: FALSE; bpnworkers: 6; bptasks: 0; bpjobname: BPJOB
  bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
  bpRNGseed: ; bptimeout: 2592000; bpprogressbar: FALSE
  bpexportglobals: TRUE
  bplogdir: NA
  bpresultdir: NA
  cluster type: SOCK
fl <- tempfile()
export(exampleIonMob1, MsBackendMzR(), file = fl, BPPARAM = SerialParam())
19.
stop(e)
18.
value[[3L]](cond)
17.
tryCatchOne(expr, names, parentenv, handlers[[1L]])
16.
tryCatchList(expr, classes, parentenv, handlers)
15.
tryCatch({ FUN(...) }, error = handle_error)
14.
withCallingHandlers({ tryCatch({ FUN(...) }, error = handle_error) ...
13.
FUN(...)
12.
FUN(X[[i]], ...)
11.
lapply(X, FUN_, ...)
10.
bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd = ddd, .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
9.
bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd = ddd, .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
8.
bpmapply(.write_ms_data_mzR, split(x, f), levels(f), MoreArgs = list(format = format, copy = copy), BPPARAM = BPPARAM)
7.
bpmapply(.write_ms_data_mzR, split(x, f), levels(f), MoreArgs = list(format = format, copy = copy), BPPARAM = BPPARAM)
6.
.local(object, ...)
5.
export(backend, object, ...)
4.
export(backend, object, ...)
3.
.local(object, ...)
2.
export(exampleIonMob1, MsBackendMzR(), file = fl, BPPARAM = SerialParam())
1.
export(exampleIonMob1, MsBackendMzR(), file = fl, BPPARAM = SerialParam())
jorainer commented 3 years ago

OK, indeed not very helpful - could you eventually provide an example file so that I can reproduce locally?

ococrook commented 3 years ago

I sent an email to johannes.rainer@eurac.edu with a google drive link with 2 ion mobility datasets. Let me know if you got it?

jorainer commented 3 years ago

Did not get any email yet...

ococrook commented 3 years ago

Nothing with the subject line "ion mobility package"?

jorainer commented 3 years ago

Sorry, yes, sure - I found the mail now. And export fails with any of the two files you shared, right?

ococrook commented 3 years ago

No worries, yes neither of them work.

jorainer commented 3 years ago

I finally had time to look into the files. I checked the file "2017121001.mzML". The problem with that is that I've got spectra in there without any peaks. Could you please check this also for your files from above?

sum(lengths(exampleIonMob1) == 0)

If that's the case I have to think how we're going to handle that.

jorainer commented 3 years ago

Note: I've fixed the empty spectra bug in both the BioC 3.12 release branch and the current development (master/main) branch. Could you please install any of the two to see if it fixes your problem @ococrook (e.g. with devtools::install_github("RforMassSpectrometry/Spectra", ref = "RELEASE_3_12"))?

ococrook commented 3 years ago

Great, thank you so much. I'll have a check later

ococrook commented 3 years ago

Hi @jorainer, managed to get around to check and looks good to me! My local versions exported correctly. Indeed, there are empty spectra - about 6025/25000 were empty.

Just to confirm, if we have Hdf5 backend, we first convert to mzR backend to export.

Happy to close the issue.

jorainer commented 3 years ago

If you have a Hdf5 backend you can call export(sps, backend = MsBackendMzR()), so there is no need to change the backend first. With backend you can specify which backend should be used for the export - which will also define the output format.