Closed zinuo-H closed 1 year ago
Thanks for the detailed explanation and description. I'll try to answer as much as possible, as there are several things here:
It is generally suggested to use `register(SerialParam())` when tracking down errors. With parallel processing, error messages can be misleading (also depending on the operating system), because the actual error might be hidden among other messages coming from the parallel workers. This is probably why you seem to get different errors.
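As a minimal sketch of the suggestion above (assuming the BiocParallel package is installed), serial processing can be registered once at the top of the script, before any xcms call, so that an error surfaces directly instead of being wrapped in messages from parallel workers:

```r
library(BiocParallel)

# Make serial (non-parallel) processing the default backend for this session.
register(SerialParam())

# Check which backend is now the default; this should report a SerialParam.
class(bpparam())
```

Once the failing step is identified, the parallel backend can be registered again for the full run.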
The error "Found gaps in scan time" means that, for that particular mzML file, the difference in retention times between consecutive spectra is larger than expected/tolerated. The `std::bad_alloc` message comes directly from the C++ code of obiwarp. So something in the C++ code of obiwarp fails, and unfortunately that is not something I can fix.
It seems you have blanks and study samples included. Maybe a subset-based alignment might be better in your case? See this section for more information. Generally, if you have QC samples measured at regular intervals in your experiment, this type of alignment might be better suited, as retention time shifts are estimated based on the QC samples (same sample measured repeatedly, so the same peaks/signal can be expected) and the samples measured in between the QC samples are then adjusted based on these.
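A rough sketch of what such a subset-based alignment could look like. The helper `align_on_qc`, the column name `sample_type`, the value `"QC"` and the `minFraction` value are hypothetical and only for illustration; `subset` and `subsetAdjust` are parameters of `PeakGroupsParam` (and also of `ObiwarpParam`). It assumes that the files are ordered by injection and that `groupChromPeaks` has already been run:

```r
library(xcms)

# Hypothetical helper: align on the QC samples only and adjust the remaining
# samples based on the QC samples measured around them.
align_on_qc <- function(x, qc_idx) {
  pgp <- PeakGroupsParam(
    minFraction = 0.9,          # assumed value, relative to the subset samples
    subset = qc_idx,            # indices of the QC samples
    subsetAdjust = "average"    # interpolate between neighbouring QC samples
  )
  adjustRtime(x, param = pgp)
}

# Usage (column name and value are assumptions about the sample annotation):
# qc_idx <- which(xdata$sample_type == "QC")
# xdata <- align_on_qc(xdata, qc_idx)
```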
Finally, maybe try to use the peak groups alignment (`PeakGroupsParam`) instead of obiwarp for these specific samples/files? I found that method to be more robust than obiwarp, and its configuration is also easier and more intuitive.
Also, what version of xcms did you use for your analysis? Some errors/problems with obiwarp were fixed, so maybe updating the version might also help.
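For reference, the installed version can be checked from within R; this uses only base R utilities and assumes xcms is installed:

```r
# Version of the installed xcms package.
packageVersion("xcms")

# Full session information (R version, OS, all loaded package versions),
# which is usually the most useful thing to paste into an issue.
sessionInfo()
```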
I hope this reply helps you find a solution for your problem.
Thank you so much for the speedy reply! Your answer helped me a lot!
I tried the peak groups alignment (`PeakGroupsParam`) instead of obiwarp for this dataset, and it ran successfully!
But now I have a new question about these two alignment algorithms:
Perhaps you noticed that the files I asked about earlier were collected in negative mode. However, there are also some problems when running the positive mode data of these samples. The positive mode data ran successfully with the `Obiwarp` alignment algorithm, but ended up with only 65 FEATURES in the output table. (This seems very unreasonable! These samples look very good in the TIC plot, so far more peaks should have been detected.) When I used `PeakGroupsParam` instead of `obiwarp` for these positive mode data files, the new output table contained 3669 FEATURES.
I wonder why just modifying the peak alignment algorithm caused such a big difference in the results? (65 FEATURES to 3669 FEATURES). This confuses me and I don't know how to choose the appropriate peak alignment algorithm in the future (obiwarp seems to be a more commonly used algorithm). I'd like to hear your thoughts and advice on this situation!
Again, thank you and your answer made my day!
### Peak picking
paramPeakPick <- CentWaveParam(snthresh = 10, noise = 300, peakwidth = c(4, 50), ppm = 10)
xdata <- findChromPeaks(raw_data, param = paramPeakPick)
pdp <- PeakDensityParam(sampleGroups = xdata$sample_group, bw = 0.25, minFraction = 0.77, minSamples = 1, maxFeatures = 50)
xdata <- groupChromPeaks(ipo_xdata, param = pdp)
pgp <- PeakGroupsParam(minFraction = 1)
xdata <- adjustRtime(xdata, param = pgp)
pdp <- PeakDensityParam(sampleGroups = xdata$sample_group, minFraction = 0.10)
xdata <- groupChromPeaks(xdata, param = pdp)
xdata <- fillChromPeaks(xdata, param = ChromPeakAreaParam())
feature.info <- featureDefinitions(xdata)
feature.intensity <- featureValues(xdata, method = "medret", value = "into", intensity = "into", filled = TRUE, missing = NA)
feature.table <- merge(feature.info, feature.intensity, by = 0, all = TRUE, Row.names = TRUE)
feature.table <- feature.table[, !(colnames(feature.table) %in% c("peakidx"))]
write.table(feature.table, "xcms_all.csv", sep = ",", quote = FALSE, row.names = FALSE)
I would suggest visualizing the alignment results of both algorithms using the `plotAdjustedRtime` function. It could well be that one algorithm applies stronger adjustments than the other. Here you need to decide what makes most sense for your experiment and data: how strong do you think the retention time shifts would be in your data?
The alignment results should be reasonable, otherwise you end up assigning signals from different compounds to the same feature...
Maybe this tutorial could also give you some hints on how to check results and find settings/parameters.
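A sketch of how the two alignments could be compared side by side. The helper name `compare_alignments` is hypothetical, and the parameter settings are simply the ones that appear in this thread, not recommendations. It assumes `x` already has chromatographic peaks and one round of `groupChromPeaks` (required by the peak groups method) but no retention time adjustment yet:

```r
library(xcms)

# Hypothetical helper: run both alignment algorithms on the same object and
# compare how strongly each one shifts the retention times.
compare_alignments <- function(x) {
  x_obi <- adjustRtime(x, param = ObiwarpParam())
  x_pg  <- adjustRtime(x, param = PeakGroupsParam(minFraction = 1))

  # One panel per algorithm: adjusted minus raw retention time per sample.
  op <- par(mfrow = c(2, 1))
  on.exit(par(op))
  plotAdjustedRtime(x_obi)
  title("obiwarp")
  plotAdjustedRtime(x_pg)
  title("peak groups")

  # Largest absolute shift (in seconds) applied by each algorithm.
  max_shift <- function(obj)
    max(abs(rtime(obj, adjusted = TRUE) - rtime(obj, adjusted = FALSE)))
  c(obiwarp = max_shift(x_obi), peak_groups = max_shift(x_pg))
}
```

If one algorithm reports shifts far larger than what is plausible for the chromatography, its result is probably the one to distrust.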
Thanks for your guidance! I used both `plotAdjustedRtime` and `plot(chromatogram(xdata, aggregationFun = "max", include = "none"))` to visualize the alignment results and got the following plots:
- The raw base peak chromatogram:
- Result of the `Obiwarp` algorithm:
- Result of the `PeakGroups` algorithm:
It is obvious that the `PeakGroups` algorithm gives a more reasonable result!
Thank you so much! My problem has been resolved, and your response has been a great help to me!
Thanks for reporting back and closing the issue!
Hi
May I ask how to plot the result of the "PeakGroups" algorithm? Thanks very much.
Best, Muyao
The BPCs were extracted with `chromatogram(data, aggregationFun = "max", chromPeaks = "none")` (assuming the object with the xcms preprocessing result is called `data`); the results from the alignment step were plotted using `plotAdjustedRtime(data)`.
I have successfully processed datasets from LC-MS/MS instruments of various vendors with the xcms R package. However, this time I failed to process LC-MS/MS data obtained from a SCIEX X500B. The error occurs in the `adjustRtime()` function. If I skip this step, everything seems to go well.
I think there may be a problem with the mzML files I import (converted from .wiff format with MSConvert). If I import only several of the mzML files (e.g. only the QC files), the workflow including `adjustRtime()` runs successfully. What's puzzling is that when I add different mzML files, I also get different error reports.
The original code I used:
Then I got the first error:
This error seems to be related to the BiocParallel package. So I added `register(SerialParam())` before `xdata <- adjustRtime(xdata, param = ObiwarpParam())` and tried again; this time, however, the error was different from the previous one:
The `std::bad_alloc` error seems to indicate a memory allocation problem while running the R script. However, I am sure that I allocated enough memory for this job (Memory Efficiency: 0.61% of 2.00 TB). (This task cannot be run on a PC because it causes the R session to close directly.)
As I mentioned before, the full task runs successfully if I import only the QC files. I tried to run the task after deleting the first two files and got a new error:
See, some of the files succeeded, the mzML file named 'chao' failed, and a new error appeared!
I don't know what's happening or where the error is coming from. Is it a problem with my files? Is it an error made when generating the mzML files? The original mzML files are available here:
Link: https://drive.google.com/drive/folders/1_q6M5aQzxX0HrBnjeBM-tJcZse3vIoUR?usp=share_link
I hope I can get an answer, thanks!