sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis

xcmsRaw - error #683

Open crestre opened 11 months ago

crestre commented 11 months ago

I am learning about HP-LC data using xcms and got an error message when trying to import a .cdf file with xcmsRaw. The file was generated on a Waters HP-LC. I am wondering whether I am exporting the data incorrectly or whether there is something else that I should be aware of. I am attaching the file just in case.

Many thanks for any help that you can provide.

Carla

Export_2023_07_241045.cdf.zip

jorainer commented 11 months ago

I have no experience with Waters software, but the file you exported indeed cannot be read/imported with mzR (the package we use to import the data). It seems that a variable "instrument_name" is missing. Would it be possible to configure the Waters software to also export that variable?
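
One way to check which variables a netCDF export actually contains is to list them from R. The following is a minimal sketch, assuming the ncdf4 package is installed; `missing_netcdf_vars` is a hypothetical helper, and its default `required` list only names the variable mentioned above, not the full set that mzR expects.

```r
# Hypothetical helper: report which expected variables are absent from a netCDF file.
# Assumes the ncdf4 package is installed. The default `required` vector is
# illustrative only; it is not mzR's actual list of mandatory variables.
missing_netcdf_vars <- function(path, required = c("instrument_name")) {
  nc <- ncdf4::nc_open(path)
  on.exit(ncdf4::nc_close(nc))       # always release the file handle
  setdiff(required, names(nc$var))   # names(nc$var) lists the variables in the file
}

# Usage, assuming the unzipped file is in the working directory:
# missing_netcdf_vars("Export_2023_07_241045.cdf")
```

An empty result would mean that all listed variables are present in the file.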

sneumann commented 11 months ago

Or leave netCDF behind and go straight to export/conversion to mzML? Can ProteoWizard (https://proteowizard.sourceforge.io/) convert your data?

Also, try to avoid the ancient xcms classes xcmsRaw and xcmsSet. They are no longer maintained, in favour of the new xcms3 interface described in the vignette: http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html

Yours, Steffen

crestre commented 10 months ago

Johannes and Steffen,

A little bit late, but thanks for your responses. I tried ProteoWizard, but it is not "seeing" the files. I am running R version 4.3.1 and cannot get mzR to work on the data from msdata. Any suggestions? Finally, as a new user of metabolomics data and potentially xcms, I found myself going through multiple vignettes/versions; in the end I figured out what was going on ... Thanks

sneumann commented 10 months ago

Could you give more information about that instrument and data? In the netCDF I see "QDa Positive(+) Scan (100.00-600.00)Da, Centroid, CV=15" and "No Chromatography"; "Direct Inlet Probe"; "Electron Impact"; "Positive Polarity", which does not exactly match "Waters HP-LC". Yours, Steffen

sneumann commented 10 months ago

But I was partly successful. I don't think it is the missing instrument name that breaks xcms:

library(xcms)
library(mzR)  # openMSfile(), header() and peaks() come from mzR
ms <- openMSfile("Export_2023_07_241045.cdf")
summary(header(ms))
...

retentionTime       lowMZ        highMZ
Min.   :180.1   Min.   :-1   Min.   :-1
1st Qu.:285.0   1st Qu.:-1   1st Qu.:-1
Median :390.0   Median :-1   Median :-1
Mean   :390.0   Mean   :-1   Mean   :-1
3rd Qu.:495.0   3rd Qu.:-1   3rd Qu.:-1
Max.   :600.0   Max.   :-1   Max.   :-1

which looks weird: lowMZ and highMZ are -1 for every scan. There are some scans, though:

> length(peaks(ms))
[1] 2801
> peaks(ms)[c(1, 666, 2801)]
[[1]]
            mz intensity
 [1,] 241.4680  5891.645
 [2,] 263.2248  2627.645
...
 [9,] 399.2110  9669.469
[10,] 411.7138  7557.469

[[2]]
            mz intensity
 [1,] 114.1244  2307.645
 [2,] 120.7768  5507.645
...
[24,] 469.4253  8069.469
[25,] 556.5388  9477.469

[[3]]
           mz intensity
[1,]       NA        NA
[2,] 582.9421  7813.469

and the last one looks broken: the NA is likely to throw off calculations like min/max. Rather than "fixing" xcms to silently throw that away, I prefer that the error (even though its message is not helpful ...) indicates that the data is broken. I don't know how to edit/fix netCDF files, though.

Yours, Steffen
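
To illustrate why a single NA can break min/max style calculations, and how such scans could be located, here is a minimal base R sketch. The toy list only mimics the shape of the peaks(ms) output above, with a few values copied from the thread; `find_broken_scans` is a hypothetical helper, not part of mzR or xcms.

```r
# Hypothetical helper: indices of scans whose peak matrix contains any NA.
find_broken_scans <- function(peak_list) {
  which(vapply(peak_list, anyNA, logical(1)))
}

# Toy data shaped like the peaks(ms) output (a few values copied from the thread).
pks <- list(
  cbind(mz = c(241.4680, 263.2248), intensity = c(5891.645, 2627.645)),
  cbind(mz = c(114.1244, 120.7768), intensity = c(2307.645, 5507.645)),
  cbind(mz = c(NA, 582.9421),       intensity = c(NA, 7813.469))
)

broken <- find_broken_scans(pks)

# range() propagates the NA unless it is explicitly told to drop it.
mz_range_raw   <- range(pks[[3]][, "mz"])
mz_range_clean <- range(pks[[3]][, "mz"], na.rm = TRUE)
```

With the real file, the same helper could be applied as find_broken_scans(peaks(ms)), assuming ms was opened with openMSfile() as shown earlier.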

sneumann commented 10 months ago

One more thing: if you ignore my comment about not using xcmsRaw, the following works:

xr <- xcmsRaw("Export_2023_07_241045.cdf", scanrange=c(1:2800))

The latter shows this data: [image]

but you'd need to play with parameters to find peaks:

xs <- xcmsSet("Export_2023_07_241045.cdf", scanrange=c(1:2800))
Warning message:
In xcmsSet("Export_2023_07_241045.cdf", scanrange = c(1:2800)) :
  No peaks found in sample Export_2023_07_241045.

Yours, Steffen
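
"Playing with parameters" could be organised as a small grid of settings to try. The following is a rough sketch, assuming the xcms package is installed and the default matchedFilter method is used; the fwhm and snthresh values are illustrative guesses rather than settings tuned for this file, and `count_peaks` is a hypothetical helper.

```r
# Candidate settings to try; the values are illustrative, not tuned for this file.
param_grid <- expand.grid(fwhm = c(5, 10, 30), snthresh = c(3, 5, 10))

# Hypothetical helper: run matchedFilter peak detection once and return the
# number of detected peaks. Requires xcms; it is only defined here, not called.
count_peaks <- function(path, fwhm, snthresh, scanrange = c(1, 2800)) {
  xs <- suppressWarnings(
    xcms::xcmsSet(path, method = "matchedFilter",
                  fwhm = fwhm, snthresh = snthresh, scanrange = scanrange)
  )
  nrow(xs@peaks)  # the peaks slot holds one row per detected peak
}

# Usage, assuming the file is in the working directory:
# param_grid$n_peaks <- mapply(count_peaks,
#                              fwhm = param_grid$fwhm,
#                              snthresh = param_grid$snthresh,
#                              MoreArgs = list(path = "Export_2023_07_241045.cdf"))
```

Comparing the resulting peak counts across the grid gives a first impression of which settings detect anything at all in this data.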