open ABI WIFF files with mzR #259

tnaake commented 2 years ago


I am currently trying to read mzML files converted from ABI wiff files using mzR/Spectra. My OS is Windows 10 and the ProteoWizard version used for the wiff conversion is 3.0.22015 (64-bit). Loading the mzML file under Ubuntu is not successful either (see below).

  1. Windows

Under Windows (mzR v2.28.0) I am using the following command to load the mzML file:

Error: Can not open file foo.mzML! Original error was: Error in pwizModule$open(filename): [IO::HandlerBinaryDataArray] Unknown binary data type.

The issue also appeared in different flavors here and here. Unfortunately, I cannot update my mzR version via BiocManager::install("sneumann/mzR", ref = "feature/updatePwiz_3_0_21263") (compilation fails for me under Windows), so I cannot test whether this branch fixes the issue.

  2. Ubuntu

Under Ubuntu I was able to install mzR from the branch feature/updatePwiz_3_0_21263. I then continued to test if I can load the mzML file on Ubuntu (20.04):

> mzR::openMSfile("foo.mzML")
Mass Spectrometry file handle.
Filename:  foo.mzML
Number of scans:  0
> Spectra::Spectra("foo.mzML", backend = MsBackendMzR())
Error: BiocParallel errors
  1 remote errors, element index: 1
  0 unevaluated and other errors
  first remote error: different row counts implied by arguments

mzR::openMSfile itself raises no error, but Number of scans is 0.

I have attached the mzML file for reference. For index 0 and 1 it is a binaryDataArrayList of length 3 (time array, intensity array, non-standard array). Removing the non-standard array entry and setting the length to 2 does not solve the problem.

> mzR::openMSfile("foo_cut.mzML")
Mass Spectrometry file handle.
Filename:  foo_cut.mzML
Number of scans:  0

I get the same output when I run the command under Windows with mzR v2.28.0.

I was wondering if you could help me identify the source of the error.

Many thanks!

jorainer commented 2 years ago

I had a look at the files and they actually don't contain spectra but chromatograms. You therefore cannot read them with Spectra (or with the mzR::header and mzR::peaks functions). You should be able to read the data with readSRMData from MSnbase, which returns an MChromatograms object. Here too, you'll need the newer mzR/proteowizard version to read the foo.mzML file. The foo_cut.mzML file can be read with the normal mzR.
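For reference, a minimal sketch of the readSRMData route. The wrapper name read_chromatograms is made up for illustration, and the sketch assumes MSnbase is installed and that the installed mzR can parse the file:

```r
# Hypothetical helper (not part of MSnbase): read chromatogram-only mzML
# files with MSnbase::readSRMData, failing early with a clear message.
read_chromatograms <- function(path) {
  # Stop before calling into mzR if the file is not there.
  if (!file.exists(path))
    stop("File not found: ", path)
  # MSnbase is assumed to be installed from Bioconductor.
  if (!requireNamespace("MSnbase", quietly = TRUE))
    stop("Package 'MSnbase' is required to read chromatogram data.")
  # readSRMData returns a chromatogram container (MChromatograms in
  # recent MSnbase versions) rather than spectra.
  MSnbase::readSRMData(path)
}
```

Calling read_chromatograms("foo_cut.mzML") should then give the chromatogram object instead of an empty file handle.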

I haven't had the chance to work on the Chromatograms package for a long time now, but (once finished) that package should be the counterpart of Spectra, just for chromatographic data.

Side note: I suggest using the proteowizard docker image for conversion to get reliable/reproducible results. I'm using it on our cluster to convert our Sciex wiff files. You can find some information here.

tnaake commented 2 years ago

Hi @jorainer

many thanks for the prompt reply and fix. Works now!

I was wondering if it would help in the future if the man page of openMSfile stated that the function takes its information from the spectrum/spectrumList entries. Currently (at least for me) it is unclear what kind of mzML files openMSfile is able to read.
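In the meantime, a rough way to check up front what an mzML file contains is to count the spectrum and chromatogram elements in the XML text. This is only a sketch in base R: count_mzml_entries is a made-up helper, not an mzR function, and it assumes an uncompressed mzML file that fits in memory.

```r
# Hypothetical helper: count <spectrum> and <chromatogram> start tags in an
# mzML file by plain text matching (no XML parsing, no mzR involved).
count_mzml_entries <- function(path) {
  lines <- readLines(path, warn = FALSE)
  count_tag <- function(tag) {
    # "[ >]" after the tag name keeps <spectrumList> and <chromatogramList>
    # from being counted as entries.
    hits <- gregexpr(paste0("<", tag, "[ >]"), lines)
    # gregexpr returns -1 for lines without a match, so count positions > 0.
    sum(vapply(hits, function(h) sum(h > 0), integer(1)))
  }
  c(spectra = count_tag("spectrum"),
    chromatograms = count_tag("chromatogram"))
}
```

A file that reports zero spectra but a non-zero number of chromatograms would explain a "Number of scans: 0" handle like the one above.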

I will close the issue then - many thanks again for the help :)