sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis
Other
183 stars 80 forks source link

Trouble extracting chromatogram plots after manualChromPeaks() #716

Closed pablovgd closed 7 months ago

pablovgd commented 8 months ago

Hi all, me again :)

I just found the manualChromPeaks function in xcms, that allows to manually define chromatographic peaks, integrate the intensities within the specified peak area and add them to the object's chromPeaks matrix.

On first sight, everything seems to work fine:

> res <- manualChromPeaks(data_batch, pks)
> res
Object of class XcmsExperiment 
 Spectra: MS1 (22993) 
 Experiment data: 20 sample(s)
 Sample data links:
  - spectra: 20 sample(s) to 22993 element(s).
 xcms results:
  - chromatographic peaks: 4444 in MS level(s): 1 

> head(chromPeaks(res))
              mz     mzmin     mzmax        rt rtmin rtmax       into sample        maxo
CP0001 103.07533 103.07484 103.07588 510.20761 502.2 520.2    1263941      1    317976.2
CP0002 133.04950 133.04887 133.05021 253.54175 244.8 262.8    2862221      1    425176.2
CP0003 139.05017 139.04950 139.05090  86.32060  70.2  88.2   28472714      1  19922294.0
CP0004  90.05496  90.05451  90.05541  52.51372  39.0  57.0  463882888      1 144385264.0
CP0005 118.08625 118.08567 118.08685  85.34979  72.0  90.0  998694923      1 317063936.0
CP0006 132.10189 132.10125 132.10257 129.80821 119.4 137.4 5575732931      1 568556352.0

However, when I want to inspect the integration of the peaks in plots of the chromatograms, I run into issues.

When I extract the chromatograms using chromPeakChromatograms, the resulting object displays the following error:

> chroms <- chromPeakChromatograms(res)
> chroms
XChromatograms with 4444 rows and 1 column
Error in FUN(X[[i]], ...) : 
  invalid class “XChromatogram” object: chromPeaks matrix does not have all required columns: rt,rtmin,rtmax,into,maxo,sn

Merging neighboring peaks and performing peak grouping first also leads to the same error:


 >      mpp <- MergeNeighboringPeaksParam(expandRt = 4)
>       res_pp <- refineChromPeaks(res, mpp)

>   pdp <- PeakDensityParam(sampleGroups = sampleData(res)$sample_type,
+                               minFraction = 0.4, bw = 30)
>       res <- groupChromPeaks(res_pp, param = pdp)
[===================================================================================================================================] 100/100 (100%) in  0s
> res
Object of class XcmsExperiment 
 Spectra: MS1 (22993) 
 Experiment data: 20 sample(s)
 Sample data links:
  - spectra: 20 sample(s) to 22993 element(s).
 xcms results:
  - chromatographic peaks: 4444 in MS level(s): 1 
  - correspondence results: 175 features in MS level(s): 1

> feature_chroms <- featureChromatograms(res, features = 1:4)

> plot(feature_chroms)
Error in FUN(X[[i]], ...) : 
  invalid class “XChromatogram” object: chromPeaks matrix does not have all required columns: rt,rtmin,rtmax,into,maxo,sn

Seems to me like the results of manualChromPeaks don't get represented the same way in the "XChromatogram" object as results coming fromfindChromPeaks.

Any obvious solution for this? If not I might delve a bit deeper into this. Thanks a lot again!

Kind regards, Pablo

jorainer commented 8 months ago

ah, nice catch, thanks! My quick guess is that the we forget to add some columns with manualChromPeaks. Did your object already have some chrom peaks before adding additional ones manually?

I'll take a look at this later - alternatively, you're more than welcome to look into the problem, try to find a solution and make a pull request :)

pablovgd commented 8 months ago

ah, nice catch, thanks! My quick guess is that the we forget to add some columns with manualChromPeaks. Did your object already have some chrom peaks before adding additional ones manually?

I'll take a look at this later - alternatively, you're more than welcome to look into the problem, try to find a solution and make a pull request :)

No, it hasn't got any chrom peaks yet. I'll see what I can find to circumvent this error :)

pablovgd commented 7 months ago

Alright, I kinda found out what's going on. So compared to output from findChromPeaks, the sn column is missing when using manualChromPeaks. I've tested manually defining an 'sn' column for the object and then plotting does work:

chromPeaks(res_v2) <- cbind(chromPeaks(res_v2),c(2,2))
colnames(chromPeaks(res_v2))[length(colnames(chromPeaks(res_v2)))] <- "sn"
chroms_manual <- chromPeakChromatograms(res_v2)
plot(chroms_manual)

image

I'll try and find out where the adding of the sn column is supposed to happen and see if I can fix it. Any suggestions on where I might look best for that? I'm still trying to find my way around all the hidden functions etc. in xcms 😅

If I managed to find and implement a fix, I'll make a proper pull request to add the fix, I'm not really experienced in doing proper pull requests and code changes, so you'll have to bear with me 😶

Cheers, Pablo

pablovgd commented 7 months ago

I think I have implemented a fix, but it might be a bit bodge, So manualChromPeaks calls .xmse_integrate_chrom_peaks which calls .chrom_peak_intensity_centWave and that one doens't add the sncolumn to the result. For now, I rewrote the function so that it adds the sncolumn & calculates the snby dividing the max signal by the noise estimated by estimateChromNoise. So that probably needs replacing with a more proper way? I'll try to create a decent PR tomorrow.

The fix:

chrom_peak_intensity_centWave <- function(x, rt, peakArea,
                                           mzCenterFun = "weighted.mean",
                                           sampleIndex = integer(),
                                           cn = character(), ...) {
  cn = c(cn,"sn") #adding sn column
  res <- matrix(NA_real_, ncol = length(cn), nrow = nrow(peakArea))
  rownames(res) <- rownames(peakArea)
  colnames(res) <- cn
  res[, "sample"] <- sampleIndex
  res[, c("rtmin", "rtmax", "mzmin", "mzmax")] <-
    peakArea[, c("rtmin", "rtmax", "mzmin", "mzmax")]
  for (i in seq_len(nrow(res))) {
    rtr <- peakArea[i, c("rtmin", "rtmax")]
    keep <- which(MsCoreUtils::between(rt, rtr))
    if (length(keep)) {
      xsub <- lapply(x[keep], xcms:::.pmat_filter_mz,
                     mzr = peakArea[i, c("mzmin", "mzmax")])
      ## length of xsub is the number of spectra, the number of peaks can
      ## however be 0 if no peak was found. Maybe we should/need to
      ## consider adding 0 or NA intensity for those.
      mat <- do.call(rbind, xsub)
      if (nrow(mat)) {
        ## can have 0, 1 or x values per rt; repeat rt accordingly
        rts <- rep(rt[keep], vapply(xsub, nrow, integer(1L)))
        maxi <- which.max(mat[, 2L])[1L]
        mmz <- do.call(mzCenterFun, list(mat[, 1L], mat[, 2L]))
        if (is.na(mmz)) mmz <- mat[maxi, 1L]
        res[i, c("rt", "mz", "maxo", "into","sn")] <- c( #added sn colname
          rts[maxi], mmz, mat[maxi, 2L],
          sum(mat[, 2L], na.rm = TRUE) *
            ((rtr[2L] - rtr[1L]) / max(1L, (length(keep) - 1L))),
          mat[maxi, 2L]/ xcms:::estimateChromNoise(mat[, 2L]) #added sn calc 
        )
        if ("beta_cor" %in% cn)
          res[i, c("beta_cor", "beta_snr")] <- xcms:::.get_beta_values(
            mat[, 2L], rts)
      }
    }
  }
  res[!is.na(res[, "maxo"]), , drop = FALSE]
}
sneumann commented 7 months ago

Closed in #719