fgcz / rawrr

Access Orbitrap data in R lang using C# mono assembly - bioconductor package
https://bioconductor.org/packages/rawrr/

Missing Data readSpectrum(...) #21

Closed rokaempf closed 3 years ago

rokaempf commented 3 years ago

Hi cpanse and tobiasko,

I've noticed that some parameters are missing from the rawRspectrum object after importing a raw file. Not all of those parameters may have been set in our experiment, but I also get a wrong reading for Base Peak Intensity and Base Peak Mass (both are reported as -1).

Thanks for helping!

RawRspectrum:

> Total Ion Current:     4870947
> Scan Low Mass:     50
> Scan High Mass:    250
> Scan Start Time (Min):     0
> Scan Number:   1
> Base Peak Intensity:   -1
> Base Peak Mass:    -1
> Scan Mode:     FTMS + p NSI Full ms [50.00-250.00]
> ======= Instrument data =====   :     
> 
> Multiple Injection:   
> 
> Multi Inject Info:    
> 
> AGC:  On
> Micro Scan Count: 1
> Scan Segment: 0
> Scan Event:   0
> Master Index: 0
> Charge State: 1
> Monoisotopic M/Z: 78.0468
> Ion Injection Time (ms):  100.000
> Max. Ion Time (ms):   
> 
> FT Resolution:    30000
> MS2 Isolation Width:  0.0
> MS2 Isolation Offset:     
> 
> AGC Target:   
> 
> HCD Energy:   
> 
> Analyzer Temperature:     
> 
> === Mass Calibration:     
> 
> Conversion Parameter B:   47557789.235
> Conversion Parameter C:   -2547049.695
> Temperature Comp. (ppm):  
> 
> RF Comp. (ppm):   
> 
> Space Charge Comp. (ppm):     
> 
> Resolution Comp. (ppm):   
> 
> Number of Lock Masses:    
> 
> Lock Mass #1 (m/z):   
> 
> Lock Mass #2 (m/z):   
> 
> Lock Mass #3 (m/z):   
> 
> LM Search Window (ppm):   
> 
> LM Search Window (mmu):   
> 
> Number of LM Found:   
> 
> Last Locking (sec):   
> 
> LM m/z-Correction (ppm):  
> 
> === Ion Optics Settings:  
> 
> S-Lens RF Level:  
> 
> S-Lens Voltage (V):   
> 
> Skimmer Voltage (V):  
> 
> Inject Flatapole Offset (V):  
> 
> Bent Flatapole DC (V):    
> 
> MP2 and MP3 RF (V):   
> 
> Gate Lens Voltage (V):    
> 
> C-Trap RF (V):    
> 
> ====  Diagnostic Data:    
> 
> Dynamic RT Shift (min):   
> 
> Intens Comp Factor:   
> 
> Res. Dep. Intens:     
> 
> CTCD NumF:    
> 
> CTCD Comp:    
> 
> CTCD ScScr:   
> 
> RawOvFtT:     
> 
> LC FWHM parameter:    
> 
> Rod:  
> 
> PS Inj. Time (ms):    
> 
> AGC PS Mode:  
> 
> AGC PS Diag:  
> 
> HCD Energy eV:    
> 
> AGC Fill:     
> 
> Injection t0:     
> 
> t0 FLP:   
> 
> Access Id:    
> 
> Analog Input 1 (V):   
> 
> Analog Input 2 (V):   

tobiasko commented 3 years ago

Hi @rokaempf,

I would not call it a "wrong reading": it is most likely a crude way of reporting a missing or null value, and our code simply didn't expect this to happen. So maybe we should "translate" this -1 to NA, which indicates a missing value in R. Or we keep the -1 (the conservative way, since that is what the API returned) and add a check in the plotting function that throws an error. Not sure which way to go yet. I would only consider the first option if Thermo confirms that our assumption is correct. It could well be that they (Thermo) are using this mechanism for all/many numeric parameters.
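To make the two options concrete, here is a minimal sketch on a plain list. It is not rawR's actual code: the helper names `sentinel_to_na` and `check_base_peak`, the field names, and the assumption that -1 marks a missing value are all hypothetical.

```r
# Hypothetical stand-in for the fields of one spectrum; not a real rawR object.
spec <- list(scan = 1L, TIC = 4870947,
             basePeakIntensity = -1, basePeakMass = -1)

# Option 1: translate the assumed sentinel (-1) into NA.
sentinel_to_na <- function(x, sentinel = -1) {
  if (is.numeric(x)) {
    x[!is.na(x) & x == sentinel] <- NA
  }
  x
}

# Option 2: keep the value as returned and fail loudly before plotting.
check_base_peak <- function(spec, sentinel = -1) {
  bad <- c(spec$basePeakIntensity, spec$basePeakMass) == sentinel
  if (any(bad, na.rm = TRUE)) {
    stop("base peak not available (API returned ", sentinel, ")")
  }
  invisible(spec)
}

# Apply option 1 to every field; non-sentinel values pass through unchanged.
cleaned <- lapply(spec, sentinel_to_na)
is.na(cleaned$basePeakIntensity)
```

Option 1 changes what the user sees, so it is only safe if -1 can never be a legitimate value for the field. Option 2 leaves the data untouched and moves the decision to the point of use.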

The other "missing parameters" seem to be truly missing. The problem with the raw format is that it has evolved organically, so maybe these key:value pairs are really not part of what the instrument logs to disk. We should check that.
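One quick way to see which keys carry no value is to filter for empty strings. The sketch below uses a hypothetical character vector with a few entries copied from the printout above, and assumes an absent value shows up as an empty string.

```r
# Hypothetical excerpt of the key:value pairs, mirroring the printout above.
# An empty string stands for a key without a logged value (an assumption).
trailer <- c("AGC:"                = "On",
             "Max. Ion Time (ms):" = "",
             "FT Resolution:"      = "30000",
             "AGC Target:"         = "")

# Keys that are present in the record but have no value.
empty_keys <- names(trailer)[!nzchar(trimws(trailer))]
empty_keys
```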

Anyway, thanks for reporting this problem!

tobiasko commented 3 years ago

Hi @rokaempf,

We found the issue that creates the -1 value and hope to fix it soon. I will keep you posted.

Greetings, Tobi

cpanse commented 3 years ago

@rokaempf

install.packages('http://fgcz-ms.uzh.ch/~cpanse/rawR_0.1.3.tar.gz', repos = NULL)

[Image: Rplot]

cp@lilith:~/Downloads/data > cat rawR_example.R | R --no-save 

R version 4.0.1 (2020-06-06) -- "See Things Now"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin17.0 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> #R
> 
> # MD5 (2020_11_06_DBDI_Benzene_01.raw) = a44cd435c8775835f22fc1f76b553d70
> rawfile <- "2020_11_06_DBDI_Benzene_01.raw"
> spec <- rawR::readSpectrum(rawfile, scan = 1:100)
> 
> spec[[1]]
Total Ion Current:   4870946
Scan Low Mass:   50
Scan High Mass:  250
Scan Start Time (Min):   0
Scan Number:     1
Base Peak Intensity:     1437918
Base Peak Mass:  78.04678
Scan Mode:   FTMS + p NSI Full ms [50.00-250.00]
AGC:    On
Micro Scan Count:   1
Scan Segment:   0
Scan Event: 0
Master Index:   0
Charge State:   1
Monoisotopic M/Z:   78.0468
Ion Injection Time (ms):    100.000
FT Resolution:  30000
MS2 Isolation Width:    0.0
Conversion Parameter B: 47557789.235
Conversion Parameter C: -2547049.695
> names(spec[[1]])
 [1] "scan"                     "basePeak"                
 [3] "TIC"                      "massRange"               
 [5] "scanType"                 "rtinseconds"             
 [7] "pepmass"                  "centroidStream"          
 [9] "HasCentroidStream"        "centroid.mZ"             
[11] "centroid.intensity"       "title"                   
[13] "charge"                   "monoisotopicMz"          
[15] "mZ"                       "intensity"               
[17] "AGC:"                     "Micro Scan Count:"       
[19] "Ion Injection Time (ms):" "Scan Segment:"           
[21] "Scan Event:"              "Master Index:"           
[23] "Elapsed Scan Time (sec):" "API Source CID Energy:"  
[25] "Average Scan by Inst:"    "Charge State:"           
[27] "Monoisotopic M/Z:"        "MS2 Isolation Width:"    
[29] "MS3 Isolation Width:"     "MS4 Isolation Width:"    
[31] "MS5 Isolation Width:"     "MS6 Isolation Width:"    
[33] "MS7 Isolation Width:"     "MS8 Isolation Width:"    
[35] "MS9 Isolation Width:"     "MS10 Isolation Width:"   
[37] "FT Analyzer Settings:"    "FT Analyzer Message:"    
[39] "FT Resolution:"           "Conversion Parameter I:" 
[41] "Conversion Parameter A:"  "Conversion Parameter B:" 
[43] "Conversion Parameter C:"  "Conversion Parameter D:" 
[45] "Conversion Parameter E:" 
> 
> jpeg("Rplot.jpg")
> plot(spec[[1]]) #intensities are displayed negative
> dev.off()
null device 
          1 
> 
> spec[[1]]$"Ion Injection Time (ms):" #can't access list element by name
[1] "100.000"
> 
> spec[[1]]$"Monoisotopic M/Z:"
[1] "78.0468"
> 
> sessionInfo()
R version 4.0.1 (2020-06-06)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS  10.16

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] C/UTF-8/C/C/C/C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.0.1 rawR_0.1.3    
> 
cp@lilith:~/Downloads/data > 
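
In the session above, the named entries come back as character strings ("100.000", "78.0468"), so they need converting before any arithmetic. A minimal sketch follows, using a hypothetical plain list `s` in place of `spec[[1]]` with values copied from the output:

```r
# Hypothetical stand-in for spec[[1]]; values copied from the session above.
s <- list("Ion Injection Time (ms):" = "100.000",
          "Monoisotopic M/Z:"        = "78.0468",
          "AGC:"                     = "On")

# [[ ]] with the exact name (including the trailing colon) works for
# names that contain spaces and punctuation.
inj_time <- as.numeric(s[["Ion Injection Time (ms):"]])
mono_mz  <- as.numeric(s[["Monoisotopic M/Z:"]])

# Arithmetic works once the strings are numeric.
inj_time / 1000  # injection time in seconds
```

Non-numeric entries such as "AGC:" would become NA (with a warning) under `as.numeric()`, so the conversion is best applied only to fields known to hold numbers.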

tobiasko commented 3 years ago

So long story short: It was a bug in our code, not in the API or how it reports missing values. Shame on us! 😉 But it's fixed now! THX again @rokaempf for making us aware of the problem.