lgatto / MSnbase

Base Classes and Functions for Mass Spectrometry and Proteomics
http://lgatto.github.io/MSnbase/

Trouble with MSnbase's writeMSData function #582

Closed calstevemart closed 6 months ago

calstevemart commented 1 year ago

Hello Laurent,

At the request of a curator on my team, I am attempting to implement a Galaxy tool wrapper around the writeMSData function from your MSnbase library. However, I keep running into issues. I am calling the function like so: writeMSData(rdata, file = paste0("writeMSData", ".mzML"), copy = FALSE)

Error in mzR::openMSfile(x, backend = NULL) :
 File PipelineTesting_RPOS_ToF10_B1SRD80.mzML not found
Error in .writeMSData(object = object, file = file, outformat = outformat,  :
 length of 'file' has to match the number of samples

The above is what I get when I clear the contents of the rdata@processingData@files slot, in an attempt to stop writeMSData from trying to locate the original .mzML file. I set copy = FALSE in the function call because I thought that meant it would not try to copy anything from the original file. Despite this, it fails because it cannot find the file referenced in rdata@processingData@files.

Is it possible to create a new .mzML file from an MSnExp / OnDiskMSnExp object without an original .mzML file to refer to? I have scoured the documentation, and the guide you authored in April ( https://www.bioconductor.org/packages/devel/bioc/vignettes/MSnbase/inst/doc/v01-MSnbase-demo.html ) seemed (to me) to imply that this is possible.

Here is a GitHub link to what I have so far

Any help, even if to tell me that what I am trying is not possible, would be greatly appreciated.

Thanks, Callum.

jorainer commented 1 year ago

Dear Callum,

If rdata is an OnDiskMSnExp object, it will still need the original mzML file even if you set copy = FALSE. The OnDiskMSnExp object uses the on-disk mode of MSnbase, which means only general spectrum metadata is loaded into memory, while the full MS data (m/z and intensity values) is read on the fly from the original data file. Thus, I guess you get the error above because you don't have access to the original data file.
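To illustrate, here is a minimal sketch and not a tested recipe for your wrapper. It assumes the msdata package is installed (its MM14.mzML file is only a stand-in for your data) and that the as(, "MSnExp") coercion is available in your MSnbase version. It checks that the files recorded in the object are still reachable, writes from the on-disk object, and then loads the data into memory while the original is accessible. An in-memory MSnExp written with copy = FALSE should not need to read the peaks from the original file again.

```r
library(MSnbase)

# Stand-in example file from the msdata package (an assumption; use your own mzML)
f <- system.file("microtofq/MM14.mzML", package = "msdata")

# On-disk mode: only spectrum metadata is kept in memory
od <- readMSData(f, mode = "onDisk")

# The object records where the raw data lives; these paths must still resolve,
# because m/z and intensity values are read from them on demand
orig_files <- fileNames(od)
stopifnot(all(file.exists(orig_files)))

# writeMSData() expects one output file name per sample (original file)
out <- tempfile(fileext = ".mzML")
stopifnot(length(out) == length(orig_files))
writeMSData(od, file = out, copy = FALSE)

# Load the full data into memory while the original file is reachable;
# the resulting MSnExp carries the peak data itself
in_mem <- as(od, "MSnExp")
out_mem <- tempfile(fileext = ".mzML")
writeMSData(in_mem, file = out_mem, copy = FALSE)
```

Clearing the file names in the processing data removes the samples the object knows about, which would explain the "length of 'file' has to match the number of samples" error.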

Note that, depending on what your wrapper and tool is supposed to do, you might want to use the newer, more flexible Spectra package for MS data handling, import and export.
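A rough sketch of the same import and export with Spectra, again assuming the msdata and mzR packages are installed and using the MM14.mzML file purely as a placeholder:

```r
library(Spectra)

# Placeholder input file from the msdata package (an assumption)
f <- system.file("microtofq/MM14.mzML", package = "msdata")

# Import with the mzR-based backend; peak data is read from the file on demand
sps <- Spectra(f, source = MsBackendMzR())

# Export all spectra to a new mzML file
out <- tempfile(fileext = ".mzML")
export(sps, backend = MsBackendMzR(), file = out)
```

Whether this suits the wrapper depends on what the rest of the tool expects as input, since a Spectra object is not a drop-in replacement for an MSnExp.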