fgcz / rawrr

Access Orbitrap data in R lang using C# mono assembly - bioconductor package
https://bioconductor.org/packages/rawrr/

noise values for a raw file collected in reduced profile mode #37

Closed by cpanse 1 year ago

cpanse commented 3 years ago

Is it possible to use rawrr to access the noise values for a raw file collected in reduced profile mode?

From reading the code in rawrr.cs, it seems that the noise is only read for centroided data, but I wanted to be sure.

by @davidsbutcher

tobiasko commented 3 years ago

@davidsbutcher What is a "reduced profile mode"???

davidsbutcher commented 3 years ago

@tobiasko Reduced profile mode is a setting available when collecting FT spectra in Thermo software. A Thermo algorithm is used to remove portions of the spectrum which are "empty", reducing the file size per scan by more than 10X. Since we (the Bio subgroup of ICR at NHMFL) often perform LC-MS runs which collect hundreds or thousands of scans, we have to use reduced profile mode to keep the file sizes manageable.

I have no idea whether collecting the data in reduced profile mode has any impact on reading the scan data using the Thermo .NET assembly, but when I open the raw files in Qual Browser or Freestyle the resolution and noise for every peak is available, so it must be in the file somewhere.

tobiasko commented 3 years ago

ok. I am not aware of this mode on any of our systems, but maybe it is specific for a certain family of Thermo instruments. What type of instrument are you using? Could you provide us with a raw file?

davidsbutcher commented 3 years ago

The instrument is a 21 T FT-ICR MS with a heavily modified Thermo Velos Pro front end. An example raw file for an intact protein standard mixture collected on the instrument can be found here: https://drive.google.com/open?id=1-17o0AyULf8Drp0VZFzpKyQIg-EkGIhD&authuser=dbutc001%40fiu.edu&usp=drive_fs

tobiasko commented 3 years ago

Okidoki! We will give it a try.

tobiasko commented 3 years ago

Hmmmm...ok, so the data is LC-MS and you really used a nano ESI source?

The FTMS + p scans are the ones for which you would like to know whether noise values are available? At MS and MS2 level? We can ignore all the ITMS scans? Which detector was used to record these?

davidsbutcher commented 3 years ago

Yes, we only care about noise/resolution values for FTMS scans at MS1 and MS2 levels. The detector for the FTMS scans is a custom-built ICR cell with electronics controlled by custom software. To my knowledge this shouldn't affect the data in the raw file.

tobiasko commented 3 years ago

Really Interesting! The peak list of a random FTMS + p NSI Full ms [300.00 - 2000.00] scan displayed in freestyle really reminds me of a centroided scan coming from an Orbitrap detector, since Resolution, Charge, Noise, Width, Baseline values are listed. We will need to check what we can extract using the NewRawFileReader .NET assembly. I will keep you posted!

moritzmadern commented 3 years ago

@tobiasko Hi, first off thank you for providing this useful package!

In my project I also require direct access to the noise values. What I have observed is that for spectra recorded in profile mode, I cannot retrieve the noise values using rawrr, even though they are visible when inspecting the raw data in a raw-file viewer (with centroided spectra I don't have this issue; there I can retrieve the noise values without problems). In particular, this is true even within the same Thermo raw file where MS2 scans were collected in centroided mode and MS1 scans in profile mode; i.e. I can access noise (and baseline) values only for the centroided MS2 scans, and not for the MS1 scans. I can also provide a raw file where this is the case: https://drive.google.com/file/d/1v-zdi4Bf0CzsWHPSjcoi5mrsCjVQKRPH/view?usp=sharing

As a side note: My colleagues and I don't know anything about "reduced profile mode"; I am pretty sure we are collecting in normal profile mode on an HF-X mass spectrometer.

Best regards, Moritz

tobiasko commented 3 years ago

Hi @moritzmadern, I was always assuming that noise is a specific attribute of centroided scan data. Our R code therefore checks if centroid information is available in the raw file and if not, you won't get these values displayed. Maybe my assumption was wrong and noise values also exist for profile mode data. What is your raw-file viewer? Freestyle? QualBrowser?

luechtian commented 3 years ago

Hi @tobiasko, I want to thank you too! This package is really helpful.

I am working in the lipidomics field and we are recording mainly direct infusion data in profile mode with polarity switching. For peak identification I need peak resolution values to calculate the slope for each scan mode. Currently I am using the export-function in Freestyle and calculate this resolution gradient manually in Excel. Therefore it would be really nice to read out resolution data in profile mode by your package to automate these steps.

To check which parameters are only accessible in centroid mode, I measured a control in profile and centroid mode. It would have been easier to ask you directly or read the source code, but I was also curious whether there are any differences in the analysis.

Parameters not accessible for spectra in profile mode: "noises", "resolutions" and "charges"

Just in case you need additional raw-files: https://drive.google.com/drive/folders/1WgmCOegk98qkssn2a94ps14umNs-eFJG

Best wishes, Christian
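
The comparison described above can be sketched in a few lines of R. This is only an illustration, not rawrr code: `missing_peak_fields()` is a hypothetical helper, and the two spectra are mock lists with made-up numbers that merely mimic the shape of one element returned by `rawrr::readSpectrum()`. The field names "noises", "resolutions" and "charges" are taken from the comment above.

```r
# Hypothetical helper: report which per-peak fields a spectrum list lacks.
# `spectrum` is assumed to be a plain named list shaped like one element of
# the object returned by rawrr::readSpectrum(); no raw file is read here.
missing_peak_fields <- function(spectrum,
                                fields = c("noises", "resolutions", "charges")) {
  setdiff(fields, names(spectrum))
}

# Mock spectra (made-up numbers) standing in for a centroid and a profile scan.
centroid_scan <- list(
  mZ = c(101.1, 105.3), intensity = c(2326, 1618),
  noises = c(310, 305), resolutions = c(61000, 60200), charges = c(1, 0)
)
profile_scan <- list(mZ = c(99.0, 99.1, 99.2), intensity = c(0, 12, 0))

missing_peak_fields(centroid_scan)  # character(0)
missing_peak_fields(profile_scan)   # "noises" "resolutions" "charges"
```

With real data, the same helper could be applied to each element of a `readSpectrum()` result to see which scans lack the per-peak fields.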

moritzmadern commented 3 years ago

@tobiasko I am using Xcalibur QualBrowser = )

As far as noise is concerned: Noise is definitely not specific to centroided data; rather, it is specific to Orbitrap data in general. Although I currently struggle to get all the information I want from Thermo regarding this somewhat cryptic variable, in short, noise is what an Orbitrap always measures as electrical background signal, and this background signal is no longer visible in the raw files (except for those noise values, which are a reduced form of this information); i.e. only signals above the noise level remain, irrespective of whether the data was recorded in profile mode or centroid mode.

(Note: As for what @luechtian was saying, after checking my data I also found "charges" and "resolutions" inaccessible using rawrr, specifically for spectra recorded in profile mode. Just in case you wanted to check with the raw file I provided. I had not noticed/checked this before.)

tobiasko commented 3 years ago

Hi @moritzmadern,

we just had a session using your HF-X file. What we discovered so far is: The file contains a data stream type for the profile scans that we have ignored so far. It is called the preferred stream (no clue why). We also found corresponding C# getter methods (defined in the RawFileReader .NET assembly) that allow us to request:

PreferredIntensities PreferredMasses PreferredNoises PreferredResolutions

The returned values match the values displayed in the Freestyle Spectrum List View. It would be possible to make these values available in R as well. Do you have any background information on the Resolution? Is it also written by all FTMS instruments (or only by Orbitraps)?

Best, Tobi

moritzmadern commented 3 years ago

@tobiasko Also no clue why it is called this way, but I am very happy to hear that. Making them available in R would be great = ) (Probably not only for me but others too!)

Unfortunately I don't know the answer to that question (and sadly neither do my colleagues because we only have Orbitrap FT-MS instruments). My boss who is head of the MS facility says he would assume so as ICR-MS instruments boast better resolution and the idea of including that information in the raw file seems natural. Maybe @davidsbutcher can answer that since they said they are using a FT-ICR MS?

Best regards, Moritz

Linda24bc commented 3 years ago

Hi, I just want to check if we can read the noise value for the reduced profile raw data right now. Thank you

davidsbutcher commented 2 years ago

As a (very belated) reply to the original question, I cannot say for sure whether all Thermo FTMS instruments write the data to the raw file the same way. I just know that our custom-built instrument writes similar data to the raw file as an Orbitrap.

I'd also like to restate my interest in having this data made available using rawrr. It seems like a popular request!

chscho commented 2 years ago

Dear all

First I'd like to thank @tobiasko and @cpanse for this great tool! It greatly simplifies the extraction of certain information from raw files in Rscripts.

Currently, I'm facing a problem similar to the one already discussed in this thread. My initial goal was the extraction of the S/N levels of a specific scan ID out of PRM data together with the charges, noises, intensities, and resolution information on MS2 level (basically, all columns displayed in Freestyle in "Spectrum List"). I've tried PRM files generated by Lumos, Exploris, Qex-HF and Qex-Plus with RAWrr, but all of them only display the intensity information, not the charge, noise and resolution information. In addition, I've observed a strange entry right after "intensity" for the Lumos dataset (see attached screenshot from a PRM file of a Lumos):

[Screenshot 2021-11-05 at 13 51 55]

This looks to me, as if the wanted data is there, but just not displayed correctly...

But of course, based on your earlier discussion with @moritzmadern, some of this information could also be buried in a different data stream. Can you please explain to me how I can access this data in my PRM files using RAWrr?

Just in case it helps, I've uploaded the mentioned PRM files from the Lumos and QexHF here: https://filesender.switch.ch/filesender2/?s=download&token=561cb652-1343-4a63-b4ab-4adb99fa8ef8

Thank you already in advance for looking into this issue!

Best, Christian

PS: I also had a (rather random) look at some other RAW files acquired in DDA and DIA mode from different devices (Eclipse, Exploris, Fusion, Qex-HF) to better understand if there are other differences between RAW files among different instruments:

This behavior is rather puzzling to me because, as @moritzmadern already mentioned, this information is characteristic of a measurement by the Orbitrap. Hence I would actually expect it to be similarly accessible in all different kinds of RAW files at MS1 and MS2 level (as it actually seems to be if the data is opened in Freestyle).

PPS: As a side question: Can you explain the logic behind reporting 'centroidStream = FALSE' and 'HasCentroidStream = TRUE, Length = xxx', as shown in the screenshot? Isn't this contradictory, or does this indicate some wrong instrument settings?
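
For the S/N goal mentioned in this comment, a minimal sketch follows. It assumes S/N is simply intensity divided by noise per peak, with an optional baseline subtraction; whether Freestyle computes its S/N column exactly this way is not confirmed in this thread. `signal_to_noise()` is a hypothetical helper, and the peak table holds made-up numbers rather than values read from a raw file.

```r
# Hypothetical helper: per-peak signal-to-noise as (intensity - baseline) / noise.
# ASSUMPTION: this simple ratio is used for illustration only; vendor software
# may define S/N differently.
signal_to_noise <- function(intensity, noise, baseline = 0) {
  stopifnot(length(intensity) == length(noise))
  (intensity - baseline) / noise
}

# Made-up peak table standing in for per-peak values from one scan.
peaks <- data.frame(
  mz        = c(152.1, 505.5, 1024.6),
  intensity = c(2000, 1500, 9000),
  noise     = c(400, 300, 450)
)
peaks$sn <- signal_to_noise(peaks$intensity, peaks$noise)
peaks$sn  # 5 5 20
```

The function only needs two equally long numeric vectors, so it works with whichever stream ends up providing the intensity and noise values.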

tobiasko commented 2 years ago

Hi @chscho,

could you please upload the code that you used to extract the spectral data. In addition, please upload your R session information and the language settings of your OS.

THX, Tobi

tobiasko commented 2 years ago

I guess what you do is:

> S <- rawrr::readSpectrum("/Users/tobiasko/Downloads/Lumos_PRM.raw", scan = 2)
> str(S)
List of 1
 $ :List of 54
  ..$ scan                            : num 2
  ..$ basePeak                        : num [1:2] 178 1024
  ..$ TIC                             : num 17185
  ..$ massRange                       : num [1:2] 150 1500
  ..$ scanType                        : chr "FTMS + p NSI Full ms2 505.5384@hcd35.00 [150.0000-1500.0000]"
  ..$ rtinseconds                     : num 0.846
  ..$ pepmass                         : num 506
  ..$ centroidStream                  : logi FALSE
  ..$ HasCentroidStream               : chr "True, Length=126"
  ..$ centroid.mZ                     : num [1:126] 152 152 153 153 154 ...
  ..$ centroid.intensity              : num [1:126] 79.5 86.8 81.5 88.3 73.1 ...
  ..$ title                           : chr "File: Lumos_PRM.raw; SpectrumID: ; scans: 2"
  ..$ charge                          : num 3
  ..$ monoisotopicMz                  : num 0
  ..$ mZ                              : num [1:1639] 149 149 149 149 152 ...
  ..$ intensity                       : num [1:1639] 0 0 0 0 0 ...
  ..$                             : chr "\t\t\t\t\t\t\t\t\t \t\t\t\t\t\t\t  \t\t\t\t\t\t\t\t\t\t\t\t\t\t"
  ..$ Scan Description:               : chr "                "
  ..$ AGC:                            : chr "Predicted   "
  ..$ Micro Scan Count:               : chr "1"
  ..$ Ion Injection Time (ms):        : chr "502.000"
  ..$ Elapsed Scan Time (sec):        : chr "0.522"
  ..$ Average Scan by Inst:           : chr "No"
  ..$ Orbitrap Resolution:            : chr "240000"
  ..$ API Process Delay:              : chr "-1.000"
  ..$ Dependency Type:                : chr "0"
  ..$ Multi Inject Info:              : chr "                                "
  ..$ Master Scan Number:             : chr "0"
  ..$ Monoisotopic M/Z:               : chr "0.0000"
  ..$ Charge State:                   : chr "3"
  ..$ Error in isotopic envelope fit: : chr "0.00"
  ..$ HCD Energy:                     : chr "35.00                    "
  ..$ HCD Energy eV:                  : chr "30.74                    "
  ..$ MS2 Isolation Width:            : chr "0.40"
  ..$ SPS Masses:                     : chr "                                                                                                               "| __truncated__
  ..$ SPS Masses Continued:           : chr "                                                                                                               "| __truncated__
  ..$ Access ID:                      : chr "0"
  ..$ Conversion Parameter I:         : chr "0.000000"
  ..$ Conversion Parameter A:         : chr "0.000000"
  ..$ Conversion Parameter B:         : chr "211775679.822298"
  ..$ Conversion Parameter C:         : chr "-78210567.548349"
  ..$ Conversion Parameter D:         : chr "0.000000"
  ..$ Conversion Parameter E:         : chr "0.000000"
  ..$ Temperature Comp. (ppm):        : chr "-4.38"
  ..$ RF Comp. (ppm):                 : chr "0.00"
  ..$ Space Charge Comp. (ppm):       : chr "-2.69"
  ..$ Resolution Comp. (ppm):         : chr "-0.09"
  ..$ Number of LM Found:             : chr "0"
  ..$ LM Correction (ppm):            : chr "0.00"
  ..$ RawOvFtT:                       : chr "55538.5"
  ..$ Injection t0:                   : chr "0.000"
  ..$ Reagent Ion Injection Time (ms):: chr "0.000"
  ..$ FAIMS Voltage On:               : chr "No"
  ..$ FAIMS CV:                       : chr "0.000"
  ..- attr(*, "class")= chr "rawrrSpectrum"
 - attr(*, "class")= chr "rawrrSpectrumSet"
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] protViz_0.7.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.7                    png_0.1-7                     Biostrings_2.62.0            
 [4] assertthat_0.2.1              digest_0.6.28                 utf8_1.2.2                   
 [7] mime_0.12                     BiocFileCache_2.2.0           R6_2.5.1                     
[10] GenomeInfoDb_1.30.0           stats4_4.1.2                  RSQLite_2.2.8                
[13] evaluate_0.14                 httr_1.4.2                    ggplot2_3.3.5                
[16] pillar_1.6.4                  zlibbioc_1.40.0               rlang_0.4.12                 
[19] curl_4.3.2                    blob_1.2.2                    S4Vectors_0.32.0             
[22] rmarkdown_2.11                AnnotationHub_3.2.0           munsell_0.5.0                
[25] RCurl_1.98-1.5                bit_4.0.4                     shiny_1.7.1                  
[28] compiler_4.1.2                httpuv_1.6.3                  xfun_0.28                    
[31] pkgconfig_2.0.3               BiocGenerics_0.40.0           htmltools_0.5.2              
[34] tidyselect_1.1.1              KEGGREST_1.34.0               tibble_3.1.5                 
[37] GenomeInfoDbData_1.2.7        interactiveDisplayBase_1.32.0 rawrr_1.2.0                  
[40] IRanges_2.28.0                codetools_0.2-18              fansi_0.5.0                  
[43] crayon_1.4.2                  dplyr_1.0.7                   dbplyr_2.1.1                 
[46] later_1.3.0                   bitops_1.0-7                  rappdirs_0.3.3               
[49] grid_4.1.2                    gtable_0.3.0                  xtable_1.8-4                 
[52] lifecycle_1.0.1               DBI_1.1.1                     magrittr_2.0.1               
[55] scales_1.1.1                  cachem_1.0.6                  XVector_0.34.0               
[58] promises_1.2.0.1              ellipsis_0.3.2                filelock_1.0.2               
[61] generics_0.1.1                vctrs_0.3.8                   tools_4.1.2                  
[64] bit64_4.0.5                   Biobase_2.54.0                glue_1.4.2                   
[67] purrr_0.3.4                   BiocVersion_3.14.0            fastmap_1.1.0                
[70] yaml_2.2.1                    colorspace_2.0-2              AnnotationDbi_1.56.1         
[73] BiocManager_1.30.16           ExperimentHub_2.2.0           memoise_2.0.0                
[76] knitr_1.36                   
> 

And yes, the returned object structure looks strange. I have the feeling that there are one or more key-value pairs that are not correctly parsed (see ..$  : chr "\t\t\t\t\t\t\t\t\t \t\t\t\t\t\t\t  \t\t\t\t\t\t\t\t\t\t\t\t\t\t").

> S[[1]]$``
[1] "\t\t\t\t\t\t\t\t\t \t\t\t\t\t\t\t  \t\t\t\t\t\t\t\t\t\t\t\t\t\t"

We (@cpanse ) will need to look into our C# code to find out why this happens.
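
Until the cause is found, a user-side workaround could be to drop entries whose name is empty or whitespace-only. The sketch below is an assumption-laden illustration, not part of rawrr: `spectrum` is a mock list that only imitates the structure printed above, including one unnamed entry holding tabs and spaces.

```r
# Mock spectrum with one unnamed entry, imitating the str() output above.
# list() assigns "" as the name of any element passed without a name.
spectrum <- list(
  scan = 2,
  "\t\t\t \t\t  \t\t",
  `AGC:` = "Predicted   "
)

# Flag names that are empty or consist only of whitespace, then drop them.
is_blank_name <- !nzchar(trimws(names(spectrum)))
cleaned <- spectrum[!is_blank_name]

names(cleaned)  # "scan" "AGC:"
```

Using `trimws()` before `nzchar()` covers both a truly empty name and one made up only of tabs or spaces.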

tobiasko commented 2 years ago

Your QExHF-X scan data looks pretty different (because it's a different instrument):

> S <- rawrr::readSpectrum("/Users/tobiasko/Downloads/QexHF_PRM.raw", scan = 2)
> str(S)
List of 1
 $ :List of 79
  ..$ scan                         : num 2
  ..$ basePeak                     : num [1:2] 110 11520
  ..$ TIC                          : num 210092
  ..$ massRange                    : num [1:2] 100 1040
  ..$ scanType                     : chr "FTMS + p NSI Full ms2 333.5258@hcd25.00 [100.0000-1040.0000]"
  ..$ rtinseconds                  : num 0.629
  ..$ pepmass                      : num 334
  ..$ centroidStream               : logi FALSE
  ..$ HasCentroidStream            : chr "True, Length=63"
  ..$ centroid.mZ                  : num [1:63] 101 105 108 108 108 ...
  ..$ centroid.intensity           : num [1:63] 2326 1618 1836 7613 3019 ...
  ..$ title                        : chr "File: QexHF_PRM.raw; SpectrumID: ; scans: 2"
  ..$ charge                       : num 3
  ..$ monoisotopicMz               : num 0
  ..$ mZ                           : num [1:861] 99 99 99 99 101 ...
  ..$ intensity                    : num [1:861] 0 0 0 0 0 ...
  ..$ Multiple Injection:          : chr "h "
  ..$ Multi Inject Info:           : chr "IT=150 "
  ..$ AGC:                         : chr "On"
  ..$ Micro Scan Count:            : chr "1"
  ..$ Scan Segment:                : chr "1"
  ..$ Scan Event:                  : chr "2"
  ..$ Master Index:                : chr "0"
  ..$ Charge State:                : chr "3"
  ..$ Monoisotopic M/Z:            : chr "0.0000"
  ..$ Ion Injection Time (ms):     : chr "150.000"
  ..$ Max. Ion Time (ms):          : chr "150.00"
  ..$ FT Resolution:               : chr "60000"
  ..$ MS2 Isolation Width:         : chr "0.40"
  ..$ MS2 Isolation Offset:        : chr "0.00"
  ..$ AGC Target:                  : chr "3000000"
  ..$ HCD Energy:                  : chr "25.00"
  ..$ Analyzer Temperature:        : chr "31.59"
  ..$ === Mass Calibration: ===:   : chr ""
  ..$ Conversion Parameter B:      : chr "211811959.5566"
  ..$ Conversion Parameter C:      : chr "68246182.2212"
  ..$ Temperature Comp. (ppm):     : chr "-7.51"
  ..$ RF Comp. (ppm):              : chr "0.01"
  ..$ Space Charge Comp. (ppm):    : chr "-0.01"
  ..$ Resolution Comp. (ppm):      : chr "0.39"
  ..$ Number of Lock Masses:       : chr "0"
  ..$ Lock Mass #1 (m/z):          : chr "0.0000"
  ..$ Lock Mass #2 (m/z):          : chr "0.0000"
  ..$ Lock Mass #3 (m/z):          : chr "0.0000"
  ..$ LM Search Window (ppm):      : chr "0.0"
  ..$ LM Search Window (mmu):      : chr "0.0"
  ..$ Number of LM Found:          : chr "0"
  ..$ Last Locking (sec):          : chr "0.0"
  ..$ LM m/z-Correction (ppm):     : chr "0.00"
  ..$ === Ion Optics Settings: ===:: chr ""
  ..$ S-Lens RF Level:             : chr "60.00"
  ..$ S-Lens Voltage (V):          : chr "21.00"
  ..$ Skimmer Voltage (V):         : chr "15.00"
  ..$ Inject Flatapole Offset (V): : chr "5.00"
  ..$ Bent Flatapole DC (V):       : chr "2.00"
  ..$ MP2 and MP3 RF (V):          : chr "598.00"
  ..$ Gate Lens Voltage (V):       : chr "1.88"
  ..$ C-Trap RF (V):               : chr "1010.0"
  ..$ ====  Diagnostic Data:  ====:: chr ""
  ..$ APD:                         : chr "Off"
  ..$ Dynamic RT Shift (min):      : chr "0.00"
  ..$ Intens Comp Factor:          : chr "0.5601"
  ..$ Res. Dep. Intens:            : chr "1.000"
  ..$ CTCD NumF:                   : chr "0"
  ..$ CTCD Comp:                   : chr "0.863"
  ..$ CTCD ScScr:                  : chr "0.000"
  ..$ RawOvFtT:                    : chr "20505.9"
  ..$ LC FWHM parameter:           : chr "15.0"
  ..$ Rod:                         : chr "0"
  ..$ PS Inj. Time (ms):           : chr "0.320"
  ..$ AGC PS Mode:                 : chr "1"
  ..$ AGC PS Diag:                 : chr "2951067"
  ..$ HCD Energy eV:               : chr "14.175"
  ..$ AGC Fill:                    : chr "0.01"
  ..$ Injection t0:                : chr "0.000"
  ..$ t0 FLP:                      : chr "0.00"
  ..$ Access Id:                   : chr "0"
  ..$ Analog Input 1 (V):          : chr "0.000"
  ..$ Analog Input 2 (V):          : chr "0.000"
  ..- attr(*, "class")= chr "rawrrSpectrum"
 - attr(*, "class")= chr "rawrrSpectrumSet"
>

Both are profile data, but the instrument still wrote some kind of centroided stream. There is no noise, resolution, ... info, but maybe that is again in the preferred stream.

chscho commented 2 years ago

Dear Tobi

Thank you for your answers. Yes, I was indeed only running the following code snippet:

S <- rawrr::readSpectrum("/Users/tobiasko/Downloads/Lumos_PRM.raw", scan = 2)
str(S)

Here is my SessionInfo():

R version 4.1.1 (2021-08-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.6.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_1.0.7 rawrr_1.2.0

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.1    purrr_0.3.4         haven_2.4.3         lattice_0.20-44     colorspace_2.0-2    vctrs_0.3.8        
 [7] generics_0.1.1      yaml_2.2.1          utf8_1.2.2          blob_1.2.2          rlang_0.4.12        hexbin_1.28.2      
[13] pillar_1.6.4        withr_2.4.2         glue_1.5.0          DBI_1.1.1           tidyverse_1.3.1     bit64_4.0.5        
[19] dbplyr_2.1.1        protViz_0.7.0       modelr_0.1.8        readxl_1.3.1        lifecycle_1.0.1     stringr_1.4.0      
[25] munsell_0.5.0       gtable_0.3.0        cellranger_1.1.0    rvest_1.0.2         codetools_0.2-18    memoise_2.0.0      
[31] forcats_0.5.1       tzdb_0.2.0          fastmap_1.1.0       parallel_4.1.1      fansi_0.5.0         broom_0.7.10       
[37] Rcpp_1.0.7          readr_2.1.0         scales_1.1.1        backports_1.3.0     BiocManager_1.30.16 cachem_1.0.6       
[43] jsonlite_1.7.2      fs_1.5.0            bit_4.0.4           ggplot2_3.3.5       hms_1.1.1           stringi_1.7.5      
[49] grid_4.1.1          tools_4.1.1         magrittr_2.0.1      tibble_3.1.6        RSQLite_2.2.8       crayon_1.4.2       
[55] tidyr_1.1.4         pkgconfig_2.0.3     ellipsis_0.3.2      xml2_1.3.2          reprex_2.0.1        rawDiag_0.0.40     
[61] lubridate_1.8.0     assertthat_0.2.1    httr_1.4.2          R6_2.5.1            compiler_4.1.1    

I've also realized, that we actually always recorded our PRM data in profile mode only... Do you plan to adapt your tool in a way, that will extract the noise information independent of centroid/profile mode? - or is there a way I can make the current version report this information?

Also thank you for looking into the strange output from the Lumos.

Best, Christian

tobiasko commented 2 years ago

A short update regarding the strange key:value pair above. It looks like that's not a parsing error, but exactly what we get from the managed code/RawFileReader .NET assembly. What are your language/OS settings @luechtian ?

chscho commented 2 years ago

The Lumos is hooked to a Windows 10 Enterprise LTSC computer (Vers. 1809, OS build: 17763.2237). The language settings were:

tobiasko commented 2 years ago

hmmm. doesn't look too bad. You could try changing all regional data formats/settings (incl. decimal separators etc.) to English (US) and see if it makes any difference. We had issues with instrument PCs in the past that used non English (US) compliant settings.

chscho commented 2 years ago

Hi @tobiasko

Thank you for your feedback. We will definitely try to run some acquisitions with all English (US) settings on that PC and check if this is the cause for this strange entry in the raw file. I'll keep you posted!

Further, we would still be very much interested in extracting the noise information also from raw files acquired in profile mode. Do you plan to implement this in a future release of RAWrr?

Best, Christian

tobiasko commented 2 years ago

I am pretty sure we will implement this in the near future.

tobiasko commented 2 years ago

@davidsbutcher @moritzmadern Is this a sound description of what the reduced profile mode means/does

https://www.sciencedirect.com/science/article/pii/B9780128140130000053

??? Could it be that this mode was only used for a short time/few instrument generation(s) to save disk space? My feeling is: More recent generations only offer full profile mode or centroided mode... but maybe this again is only true for Orbitrap FTMS.

Greetings, Tobi

moritzmadern commented 2 years ago

Hi,

sorry for my prolonged absence (ultimately I got access to the noise values by using a C# script a colleague of mine wrote for me - and recent months have been really busy)

@tobiasko Thankfully I can now answer your question completely, since I have been in communication with Thermo about these things. The answer is this: What people typically mean by measuring in "profile-mode" is actually "reduced profile-mode". This is because there actually exists a (rather hidden) setting to measure in so-called "full-profile" mode on some Orbitrap mass spectrometers. In this full-profile mode, no noise-filtering is performed on the data. The consequence is that you can zoom in as much as you want in the m/z region of spectra and there will still be varying low-intensity background signal at every point of the m/z axis. We looked into this by measuring some samples with this setting: they had a file size of about 25 GB, whereas the "normal" (aka "reduced") profile-mode produced a file size of about 2 GB (note that this does not mean "centroided"; it is profile mode, but not full profile-mode).

To summarize: What people mean by "profile-mode" is "reduced profile-mode", which results in spectra that are filtered for low intensity noise. The original, unfiltered spectrum can not be reconstructed, however, the information of noise is kept as the "noise-values" we see in the raw data.

Best, Moritz

tobiasko commented 2 years ago

@moritzmadern All right! That is pretty valuable information. Would you be willing to share your C# code? This could help us to develop matching managed code for rawrr.

moritzmadern commented 2 years ago

Hi,

The C# source code is fully documented on GitHub, see: https://github.com/fstanek/rawStallion

Note that all credits (as well as my big thanks!) go to Florian Stanek from the Research Institute of Molecular Pathology (IMP) in Vienna. I have never written a single line of code in C# in my life ; )

tobiasko commented 2 years ago

Cool. THX! We could also invite him to do a pull request; then his contribution is fully documented.

moritzmadern commented 2 years ago

I told him, if he has time he will try to help out (I think)

cpanse commented 2 years ago

@moritzmadern @tobiasko of note, this is a git@git.bioconductor.org:packages/rawrr mirror!

davidsbutcher commented 2 years ago

Hi all. Just checking to see if there has been any movement on this, as I would still love to see it added as a feature.

cpanse commented 2 years ago

Hoi @davidsbutcher, Can you please check if commit/ba5035d13c40563ee9e158f8ae65f8340af3ebea gets you further?

version 1.5.3.

https://bioconductor.org/packages/devel/bioc/html/rawrr.html

f <- "path to your raw file"
S <- f |> rawrr::readSpectrum(scan = 1)
S[[1]]$centroid.PreferredNoises 

C
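
A slightly more defensive variant of the snippet above is sketched here. The field name `centroid.PreferredNoises` comes from the comment; `preferred_noises()` is a hypothetical helper, and the mock lists with made-up numbers stand in for elements of a `readSpectrum()` result so the example runs without a raw file.

```r
# Hypothetical helper: fetch the preferred-stream noise values from one
# spectrum, warning if the field is absent (e.g. with an older rawrr version).
# `[[` is used instead of `$` so that no partial name matching takes place.
preferred_noises <- function(spectrum) {
  noises <- spectrum[["centroid.PreferredNoises"]]
  if (is.null(noises)) {
    warning("no 'centroid.PreferredNoises' field in this spectrum")
  }
  noises
}

# With a real file one would start from (not run here):
# S <- rawrr::readSpectrum("path to your raw file", scan = 1)
# preferred_noises(S[[1]])

# Mock spectra (made-up numbers) with and without the field.
with_field    <- list(scan = 1, centroid.PreferredNoises = c(310.5, 298.2))
without_field <- list(scan = 1)

preferred_noises(with_field)                       # 310.5 298.2
suppressWarnings(preferred_noises(without_field))  # NULL
```

Returning `NULL` with a warning, rather than stopping, keeps the helper usable inside `lapply()` over many scans.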

davidsbutcher commented 1 year ago

Christian, I was able to confirm today that I am able to extract the preferredMasses and preferredNoises using version 1.5.3. Thanks for getting this feature added! It's very useful to my work.

cpanse commented 1 year ago

@davidsbutcher thank you for your feedback. If there is something else, feel free to reopen this issue or open a new one. C