Open rromoli opened 5 years ago
You are on the right track @rromoli . The readSRMData
actually returns a Chromatograms
object (the same type of object you would get by calling chromatogram
on an MSnExp
/OnDiskMSnExp
object containing spectra data).
You should be able to directly call findChromPeaks
on the mrm
object you have. This will return an XChromatograms
object (defined in xcms
) that also contains the identified chromatographic peaks (which you can access with chromPeaks
). Note that I've also implemented a groupChromPeaks
method for XChromatograms
, but no adjustRtime
method.
Regarding alignment, I've only implemented a method that aligns a single Chromatogram
object against another one - but nothing yet for Chromatograms
(note the s).
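The steps described above could be sketched roughly as follows. This is only a minimal sketch, not a tested pipeline: the helper name `mrm_peak_workflow`, the `peakwidth` default and the choice to treat all files as one sample group are assumptions made for illustration. The `peakwidth` has to match the retention time unit of the files, which for SRM data may be minutes rather than seconds.

```r
## Sketch only: wraps the calls mentioned above in one helper function.
## `mrm_peak_workflow` and its defaults are hypothetical, not part of xcms.
mrm_peak_workflow <- function(files, peakwidth = c(1, 4)) {
    ## Read the SRM/MRM chromatograms (rows: transitions, columns: files).
    mrm <- MSnbase::readSRMData(files)
    ## Peak detection in each chromatogram; returns an XChromatograms.
    cwp <- xcms::CentWaveParam(peakwidth = peakwidth)
    xchr <- xcms::findChromPeaks(mrm, param = cwp)
    ## Correspondence: group peaks across samples within each transition.
    ## Assumption: all files belong to a single sample group.
    pdp <- xcms::PeakDensityParam(sampleGroups = rep(1, ncol(xchr)))
    xcms::groupChromPeaks(xchr, param = pdp)
}
```

With `fls` being a character vector of mzML file names, usage could look like `peaks <- mrm_peak_workflow(fls)`, followed by `chromPeaks(peaks)` to inspect the detected peaks.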
findChromPeaks()
works fine using MatchedFilterParam()
, but I noticed that the
fwhm parameter behaves strangely: I need to divide the value by 10. So
the value I used is fwhm/10 (0.4); otherwise it integrates too much baseline...
Furthermore I do not understand how to extract data. I mean:
> featureValues(peaks, value = "into")
1 2 3 4 5 6 7
FT01 NA NA NA 271.50840 225.47668 201.51727 862.14525
FT02 NA NA NA NA NA NA 215.32279
FT03 140.49033 107.46131 87.04114 143.76471 118.99295 113.02374 186.81566
FT04 53.75822 48.75521 42.22618 58.85781 55.24559 46.26336 109.39015
FT05 NA NA NA 30.84873 NA NA 58.27840
FT06 NA NA NA NA NA NA NA
FT07 NA NA NA 46.28275 68.67077 63.43596 143.92853
FT08 NA NA NA NA NA NA NA
FT09 NA NA NA NA NA NA NA
FT10 NA NA NA NA NA NA 68.55487
In this way I extract the integrated signals but I have no idea what FTXX stands for.
If I use the precursorMz()
and productMz()
functions I see that I have 8 SRM
transitions in my dataset. Why do I have 10 features in the results? I tried featureDefinitions():
> featureDefinitions(peaks)
DataFrame with 10 rows and 15 columns
mzmed mzmin mzmax rtmed rtmin
<numeric> <numeric> <numeric> <numeric> <numeric>
FT01 NA NA NA 1.64795005321503 1.62013328075409
FT02 NA NA NA 1.66193330287933 1.64795005321503
FT03 NA NA NA 4.00483322143555 3.76771664619446
FT04 NA NA NA 11.7689828872681 11.7496662139893
FT05 NA NA NA 9.60551643371582 9.58619976043701
FT06 NA NA NA 10.1463832855225 10.1463832855225
FT07 NA NA NA 9.60551643371582 9.58619976043701
FT08 NA NA NA 10.1463832855225 10.1463832855225
FT09 NA NA NA 10.1463832855225 10.1270666122437
FT10 NA NA NA 10.1270666122437 10.1270666122437
but the function returns no m/z values.
How can I interpret the results?
Actually, you're the first user of this functionality! I've never analyzed MRM data (or had any MRM files available for testing). The FTXX is just an arbitrary feature identifier. The whole functionality works much as it would for LC-MS data: it first performs chromatographic peak detection separately for each chromatogram (MRM) and then uses the chromPeaks
matrix to group peaks across samples. I could imagine that you have more features than MRM because maybe in some of the chromatograms more than one peak was identified?
would it be possible for you to share some files with me so that I could look into what's happening?
I could imagine that you have more features than MRM because maybe in some of the chromatograms more than one peak was identified?
Yes, it seems that I have two interfering ions...
would it be possible for you to share some files with me so that I could look into what's happening?
Yes of course, how can we share them? If you give me your email address I will share them via gdrive.
Thanks for the data! To get the information about the transition for the individual features you can do the following (variable peaks
is your Chromatograms
object after peak detection and correspondence analysis):
fdev <- featureDefinitions(peaks)
fdev <- fdev[, colnames(fdev) != "peakidx"]
fdev
DataFrame with 10 rows and 14 columns
mzmed mzmin mzmax rtmed rtmin
<numeric> <numeric> <numeric> <numeric> <numeric>
FT01 NA NA NA 1.64795005321503 1.62013328075409
FT02 NA NA NA 1.66193330287933 1.64795005321503
... ... ... ... ... ...
FT09 NA NA NA 10.1463832855225 10.1270666122437
FT10 NA NA NA 10.1270666122437 10.1270666122437
rtmax npeaks P0 P1 P2 P3
<numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
FT01 1.67591667175293 15 0 3 3 3
FT02 1.68990004062653 12 0 0 3 3
... ... ... ... ... ... ...
FT09 10.1463832855225 7 0 0 1 3
FT10 10.1463832855225 12 0 0 3 3
P4 P5 row
<numeric> <numeric> <integer>
FT01 3 3 1
FT02 3 3 2
... ... ... ...
FT09 0 3 7
FT10 3 3 8
In the featureDefinitions
there is a column "row"
that tells you in which of the rows (transitions) the feature was defined. You can add the actual precursor and product m/z with:
## Add the precursorMz and productMz to the annotation.
fdev$precursorMz <- rowMeans(precursorMz(peaks))[fdev$row]
fdev$productMz <- rowMeans(productMz(peaks))[fdev$row]
And to get the feature intensities:
fvals <- featureValues(peaks, value = "into")
Each row in fdev
now provides the feature annotations for the corresponding row in fvals
.
Hope it is a little clearer now. Let me know if not.
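To see why there can be more features than transitions, it may help to count the features per transition using the "row" column. The following is a toy base R sketch: the `fdev_toy` data frame and all its values are invented for illustration and only mimic the structure of the featureDefinitions output shown above.

```r
## Toy stand-in for featureDefinitions(peaks); all values are invented.
fdev_toy <- data.frame(
    rtmed = c(1.65, 1.66, 4.00, 9.61, 10.15, 10.13),
    row = c(1L, 2L, 3L, 4L, 4L, 5L),
    row.names = sprintf("FT%02d", 1:6)
)
## Number of features defined in each transition (row of the object).
n_per_row <- table(fdev_toy$row)
## Transitions with more than one feature, i.e. more than one peak group.
multi <- as.integer(names(n_per_row)[n_per_row > 1])
## Features that belong to those transitions.
multi_features <- rownames(fdev_toy)[fdev_toy$row %in% multi]
multi_features
```

In this toy example transition 4 holds two features (FT04 and FT05), which is the situation expected when a chromatogram contains a second, interfering peak.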
Hi @rromoli , if Johannes' suggestion works for you, it would be great if you could turn that into an MRM vignette. For that we'll need representative data (but could also be measurements of QC samples, no science required), and the script plus some explanations with it. Would that make sense ? Yours, Steffen
Ok @sneumann I will try to write a vignette about the use of xcms with MRM data!
Hi @rromoli, may I ask whether you have solved the MRM data import issue by now?
Hi @jorainer , I wonder if there are mrm data processing functions inside xcms now?
There's nothing specifically for MRM data, except that you can read the data as an MChromatograms
object and then perform chromatographic peak detection in each chromatogram (using the findChromPeaks
function), you can also perform a correspondence analysis (using groupChromPeaks
). In addition there is functionality to filter, plot and subset the chromatographic data.
Best regards, Junjie
That is actually a problem with mzR
and more recent versions of proteowizard. Maybe try with the suggestions from this issue https://github.com/lgatto/MSnbase/issues/551 . In the longer run we hope to manage updating mzR
to include a newer version of proteowizard, but at present the workaround is to skip some data in the msconvert
conversion to mzML files.
D:\proteowizard>msconvert test.RAW --chromatogramFilter "index [2,]"
format: mzML
    m/z: Compression-None, 64-bit
    intensity: Compression-None, 32-bit
    rt: Compression-None, 64-bit
ByteOrder_LittleEndian
 indexed="true"
outputPath: .
extension: .mzML
contactFilename:
runIndexSet:
spectrum list filters:
chromatogram list filters:
  index [2,]
filenames:
  test.raw
processing file: test.raw
calculating source file checksums
writing output file: .\test.mzML

mrm <- readSRMData(fls2)
Error: Can not open file D:\zhengjie_project\MRM_pipline\test.mzML! Original error was: Error in pwizModule$open(filename): [IO::HandlerBinaryDataArray] Unknown binary data type.
mrm_cmd <- readMSData(fls2)
Error: Can not open file D:\zhengjie_project\MRM_pipline\test.mzML! Original error was: Error in pwizModule$open(filename): [IO::HandlerBinaryDataArray] Unknown binary data type.
and this is my converted .mzML file: test.zip
Regards, Junjie
Seems that the converted file only contains a single chromatogram entry - which is the TIC (with the "non-standard data array" in it) - I guess the original file contains more chromatograms?
We'll try to update the mzR
package to include the new proteowizard code base - that should solve all problems but I can not guarantee when it will be available.
The developmental mzR
version with an updated proteowizard code is available. With this version it should be possible to read the mzML files. It might take some time until this version becomes "stable" because we had to remove the ramp
backend and hence mzData support. To install:
BiocManager::install("sneumann/mzR", ref = "feature/updatePwiz_3_0_21263")
Noted with thanks!
@jorainer hi, I am also curious about how to align peaks for an MChromatograms object. My MRM data was imported with readSRMData, which returns an MChromatograms object, so I was not able to align my data.
There is no alignment method as we have for XCMSnExp
(i.e. spectra data) available for the chromatographic data. What is available is the findChromPeaks
method to identify chromatographic peaks and the groupChromPeaks
method to group chromatographic peaks across samples (have a look at the XChromatograms
help for more details: ?XChromatograms
).
The only alignment method which is available for MChromatograms
is alignRt
which aligns an MChromatograms
(i.e. chromatographic data across multiple samples) against a single Chromatogram
object. But I'm not sure if that's what you're looking for.
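A call to alignRt for this case might look like the sketch below. The helper name `align_to_reference` and its arguments are hypothetical, and the sketch assumes that extracting a single row and column from an MChromatograms yields a single Chromatogram. It also rests on the assumption, to be checked against ?alignRt, that the aligned chromatograms are placed on the retention times of the reference rather than being warped.

```r
## Sketch only: align all chromatograms of an MChromatograms against one
## reference chromatogram. `align_to_reference` is a hypothetical helper.
align_to_reference <- function(mchr, ref_row = 1L, ref_col = 1L, ...) {
    ## Assumption: subsetting to one row and one column returns a single
    ## Chromatogram object that can serve as the reference.
    ref <- mchr[ref_row, ref_col]
    ## Further arguments (e.g. the matching method) are passed on unchanged.
    MSnbase::alignRt(mchr, ref, ...)
}
```

Since the reference is taken from one transition of one sample, this only makes sense for chromatograms that are expected to have comparable retention time ranges.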
Hi Jorainer, I found some issues after I made peak picking on the chromatogram object of MRM data read by readSRMData.
Please kindly find my example data herein. E4-1.zip
Could you please add here also the R code you used to perform this analysis. Without that it's impossible to replicate and find out what your problems might be.
Hi, thanks a lot for your reply! Please kindly find the attached code herein:
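library(xcms)
## Read the SRM data and select the first transition.
std <- "E4-1.mzML"
std1 <- readSRMData(std)
chr1 <- std1[1, ]
## Peak detection with matchedFilter.
mfp <- MatchedFilterParam(binSize = 0.1, snthresh = 0)
xchr1 <- findChromPeaks(chr1, mfp)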
Hi Jorainer, I found some issues after I made peak picking on the chromatogram object of MRM data read by readSRMData.
- The peaks of y, after alignment with the alignRt function, ended up as a copy of the example chromatogram (x).
- findChromPeaks with MatchedFilterParam was not able to detect peaks correctly on my data and failed to pick up two peaks in one chromatogram object.
Please kindly find my example data herein. E4-1.zip
I found a way to pick up small side peaks by adjusting the fwhm value (from 0~5) for my first question. I am still trying to find a good way to solve the second question.
Since the peaks are quite different (the first one broader the second quite narrow) I would suggest to use centWave instead of matchedFilter:
cwp <- CentWaveParam(peakwidth = c(1, 4))
tmp <- findChromPeaks(chr1, param = cwp)
plot(tmp)
this identifies both peaks:
Thanks a lot!!
Hi @jorainer , regarding my first question about the retention time correction. May I know if there is any way to get a modified function for retention time correction for mrm data?
At present we don't have a dedicated function to do a retention time alignment on MRM data (similar to what is available for spectra-based LC-MS data). For chromatograms with a single peak it should in theory also suffice to use a rather large bw
parameter in groupChromPeaks
with PeakDensityParam
which will then also group chromatographic peaks into the same feature even if their retention times are different.
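The suggestion above could be sketched as follows. The helper name `group_wide_bw` is hypothetical and all samples are assumed to belong to one group. Note that `bw` is given in the same unit as the retention times of the data, so a value that is "large" for data in minutes is much smaller than one for data in seconds.

```r
## Sketch only: correspondence with a wide retention time bandwidth.
## `group_wide_bw` is a hypothetical helper, not part of xcms.
group_wide_bw <- function(xchr, bw, sample_groups = rep(1, ncol(xchr))) {
    ## A larger `bw` smooths the peak density more strongly, so peaks with
    ## shifted retention times can end up in the same feature.
    pdp <- xcms::PeakDensityParam(sampleGroups = sample_groups, bw = bw)
    xcms::groupChromPeaks(xchr, param = pdp)
}
```

For chromatograms with several peaks per transition a very large `bw` would merge distinct peaks, so this is only a workaround for the single-peak case.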
We might implement some functionality, but at present we unfortunately don't have the capacity/manpower to do that. What would however help later is to get hands on example MRM data files with peaks that need to be aligned...
My chromatograms come with multiple peaks. I wish to make an alignment across samples before I group any peaks and continue with the downstream analysis. Currently, I try to find a workaround for this issue. Thanks for your help too!
Hi all,
I want to share my experience with SRM data and xcms:
An assay on a QqQ creates SRM data with 30 transitions. Two of them detect two isobaric, closely eluting compounds. The attached ZIP file contains an RDS file of those two transitions as MChromatograms
.
If I plot this I get:
graph.pdf
Then I do
xdata<-findChromPeaks(srm_selected[8,6], param = cwp)
and
chromPeaks(xdata)
            rt     rtmin     rtmax     into      intb       maxo   sn
[1,]  7.302067  6.424583  8.231167  126.534   31.3895   207.1527   32
[2,] 14.941333 13.547683 15.818817 3968.052 3849.5452 14944.2097 5002
shows that the two large peaks have been detected as one wide peak at 14.94 min.
Doing the peak detection with MatchedFilterParam
shows the same behaviour. I've tried various settings but cannot find any, for either method, that detect the two peaks individually.
Now if I use do_findPeaks_MSW
I get both peaks as individuals:
int <- intensity(srm[2, ])
rt <- rtime(srm[2, ])
do_findPeaks_MSW(rt, int, snthresh = 1, scales = 1:10)
           mz    mzmin    mzmax rt rtmin rtmax     into     maxo       sn intf     maxf
[1,] 14.27032 14.11547 14.37355 -1    -1    -1 28765.37 12087.74 35.86637   NA 12216.63
[2,] 14.94133 14.78648 15.04457 -1    -1    -1 40612.61 14944.21 49.98661   NA 17026.19
Peak apex and boundaries are well enough defined.
I am wondering now: MSW and centwave both use the MassSpecWavelet
functionalities. Why are they delivering such different results? Using findChromPeaks
with centwave would be so much more comfortable on MChromatograms
but I think I can do with do_findPeaks_MSW
.
Cheers Andreas
Forget what I wrote above. Reading through some other issues I realized that my SRM data is loaded with rtime in minutes. Thus using peakwidth(cwp)<-c(1,10)
is way too large. With peakwidth(cwp)<-c(0.017,0.17)
I do get individual peak detection.
My bad :-)
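The unit mismatch described above can be handled by converting the peak width before building the parameter object. A minimal base R sketch, where the helper name `peakwidth_sec_to_min` is made up for illustration:

```r
## Convert a centWave peak width given in seconds to minutes.
## `peakwidth_sec_to_min` is a hypothetical helper name.
peakwidth_sec_to_min <- function(peakwidth_sec) {
    peakwidth_sec / 60
}

## A peak width of 1 to 10 seconds expressed in minutes.
pw_min <- peakwidth_sec_to_min(c(1, 10))
round(pw_min, 3)
## 0.017 0.167
```

The result corresponds to the `c(0.017, 0.17)` setting mentioned above and could be passed on with `CentWaveParam(peakwidth = pw_min)`.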
Of course, it would help if readSRMData
and readMSData
behaved identically in scaling the run time axis. Reading the same mzML files, readSRMData
returns minutes and readMSData
seconds. For whom is this an issue: @jorainer or @lgatto ?
That's interesting @breidan , I was not aware that you get different units from readSRMData
or readMSData
- would it be possible to provide one example file? In the end this should go to mzR
because we're using mzR
(which uses proteowizard) to read mzML files.
@jorainer,
attached is a zip of a mzML file of a SRM acquisition on an Agilent QqQ. This is one of the files that were read in for the MChromatograms
object in the zip file above. The rtime scale is in seconds with readMSData
and minutes with readSRMData
.
Thanks for sharing the file. So, you're right: the time is provided in minutes within the file, and for the retention time of the spectra the mzR
/proteowizard C++ code is converting any provided time into seconds. For the chromatographic data I could not find a way to easily identify in which unit the retention time is provided and how that can be automatically converted to seconds (if not already provided as seconds). Any help on this (in the mzR
package) would be highly welcome...
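Until this is handled in mzR, one possible workaround is to look up the unit directly in the mzML file and rescale the retention times by hand. The sketch below is a heuristic in base R, not how mzR works: the helper names `guess_time_unit` and `rt_to_seconds` are hypothetical, and it assumes that the file annotates its time arrays with the PSI-MS term MS:1000595 ("time array") and a `unitName` attribute on the same line, within the first `n` lines of the file.

```r
## Heuristic: read the unit of the "time array" (MS:1000595) cvParam
## entries from the first `n` lines of an mzML file.
## `guess_time_unit` is a hypothetical helper, not part of mzR or xcms.
guess_time_unit <- function(mzml_file, n = 5000L) {
    lines <- readLines(mzml_file, n = n, warn = FALSE)
    hits <- grep('accession="MS:1000595"', lines, fixed = TRUE, value = TRUE)
    ## Keep only the unitName attribute of the matching lines.
    units <- regmatches(hits, regexpr('unitName="[^"]*"', hits))
    unique(sub('unitName="([^"]*)"', "\\1", units))
}

## Rescale retention times to seconds given the unit reported in the file.
rt_to_seconds <- function(rt, unit) {
    switch(unit,
           minute = rt * 60,
           second = rt,
           stop("unsupported time unit: ", unit))
}
```

A call such as `rt_to_seconds(rtime(chr), guess_time_unit(fl))` would then give retention times in seconds for a chromatogram `chr` read from file `fl`, provided the file reports exactly one unit.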
I'm working with MRM data acquired on a Waters instrument and exported into a .mzML format. I try to import them using xcms using the
readMSData()
function with no luck:
Seems that
readMSData()
is not able to correctly import MRM data. I read in the MSnbase manual about the readSRMData()
function to import MRM/SRM data. It seems to work and correctly import my data:
I would like to manipulate (integrate, align) data using xcms but it seems
readSRMData()
class is not compatible with xcms functions:
Is there a way to work with
readSRMData()
with xcms()
?