rickhelmus / patRoon

Workflow solutions for mass-spectrometry based non-target analysis.
https://rickhelmus.github.io/patRoon/
GNU General Public License v3.0

Swath acquisition #109

Open Boris-Droz opened 7 months ago

Boris-Droz commented 7 months ago

Hi Rick, I am wondering whether patRoon has a tool for SWATH acquisition deconvolution. If not, do you plan to integrate one? And if not, do you have a strategy or anything I could try in order to implement one myself?

Thank you very much

Boris

rickhelmus commented 7 months ago

Hi Boris,

This, and DIA in general, is on the 'long-term TODO list', so at the moment it is not really supported. We mainly use DDA data here, but I have received quite a few DIA requests lately. OpenMS has OpenSWATH, which might make sense to integrate at some point, but I have not studied it in depth so far. There are more tools out there to align DIA data, but again I don't have experience with these yet.

I am always open to PRs though ;-)

Boris-Droz commented 7 months ago

Hi Rick,

It would be great to have this kind of DIA/SWATH tool in the package, as they already exist for OpenMS (as you say) and XCMS (we are mostly XCMS users here).

We are already looking forward to seeing the next version of the package.

Best

Boris

Boris-Droz commented 7 months ago

Hi Rick, sorry for re-opening this. I just discussed a project I am involved in with some colleagues, and we have SWATH data there. Do you think it would be an option to implement an XCMS reconstruction for MS2, either inside patRoon or outside patRoon with the result then re-injected into the workflow? Or should I look for a completely separate workflow?

Thank you for the answer

rickhelmus commented 6 months ago

Hi Boris,

Sorry for my late reply, I just came back from holidays!

I gave it some more thought, and perhaps we can find a way to make use of the XCMS DIA data handling.

In principle, patRoon should be able to load all the MS/MS data; it is just not (yet) able to correlate the fragments to the right features. So I was thinking of the following approach:

  1. Perform a patRoon workflow and get the features as usual through XCMS.
  2. Convert the patRoon objects to XCMS objects with getXCMSnExp()
  3. Let XCMS do its magic for SWATH
  4. Get the peak lists with patRoon. I think you can just set precursorMzWindow to the SWATH window size. If this for some reason fails (please let me know), then you could also set it to NULL.
  5. Use the delete() function to remove any mass peaks not present in the XCMS spectra data.
  6. Continue with patRoon as usual.

The tricky part is step 5: here somehow we need to find the right spectral data for each feature group from patRoon, e.g. in pseudocode:

mslists <- delete(mslists, j = function(pl, grp, ana, type) {
    # pl is a single peak list (data.table with mz, intensity, ...)
    # grp is the name of the feature group
    # ana is the analysis (NULL if the peak lists is for the whole feature group)
    # type is MS or MSMS

    if (type != "MSMS")
        return(FALSE) # don't delete any peaks in MS data (TRUE marks peaks for removal)

    # some function that needs to be implemented, and will return a table from the XCMS DIA spectrum
    plXCMS <- getXCMSSpec(grp, ana)

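    pl <- data.table::copy(pl)
    # mark each mass peak that has a match in the XCMS spectrum (5 mDa tolerance);
    # vapply() tests the peaks one by one, so inXCMS gets one value per row
    # (assumes plXCMS$mz is a plain numeric vector)
    pl[, inXCMS := vapply(mz, function(m) any(abs(m - plXCMS$mz) <= 0.005), logical(1))]

    return(!pl$inXCMS) # and remove all others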
})

So what we need is to somehow cook up the getXCMSSpec() function from the example above. I have zero experience with XCMS/SWATH (thanks for letting me know it exists ;-), so perhaps you could give this an initial go?

Thanks, Rick
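
The matching step from the pseudocode can be tried on toy data without patRoon or XCMS. The following is a minimal base R sketch, not part of patRoon: the helper name matchPeaks() and all numbers are made up for illustration. It returns one logical value per peak, which is the shape the delete() callback has to produce.

```r
# Hypothetical helper: TRUE for every peak m/z that has a reference m/z
# within 'tol' Da. outer() builds a peaks x references difference matrix.
matchPeaks <- function(mz, refMz, tol = 0.005) {
    rowSums(abs(outer(mz, refMz, "-")) <= tol) > 0
}

# Toy MS/MS peak list (made-up values)
pl <- data.frame(mz = c(100.0500, 150.1000, 200.2000, 250.3000),
                 intensity = c(1000, 500, 250, 125))

# Toy m/z values standing in for a reconstructed SWATH spectrum
xcmsMz <- c(100.0520, 200.1990, 300.0000)

inXCMS <- matchPeaks(pl$mz, xcmsMz)
toRemove <- !inXCMS # what the delete() callback would return

print(cbind(pl, inXCMS, toRemove))
```

Because the result always has one element per row of the peak list, it can be assigned to a column or returned directly, also when the reference spectrum is empty (in which case every peak is marked for removal).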

Boris-Droz commented 2 months ago

Hi Rick, sorry for my slow reply, I was busy with something else. I tried your approach but already get an error at step 2. I looked for an example in the handbook, but what the handbook shows looks similar to what I did.

My dummy standard code is:

## load data info
df <- read.csv(paste(workPath,"/input/",sample.list,sep=""),
               sep=",",header=TRUE)

anaInfo <- data.frame(cbind(path = df$path, # holds all the sample information
                            analysis =df$filename,
                            group = df$group,
                            blank = df$blank) )

# -------------------------
# features
# -------------------------
# Find all features
param.xcms <- xcms::CentWaveParam(ppm = 9.2,
                                  peakwidth = c(13.9, 164),
                                  snthresh = 10,
                                  prefilter = c(3, 100),
                                  noise = 0 ) 

fList <- findFeatures(anaInfo, "xcms3", param = param.xcms) 
fList <- makeSet(fList, adducts = "[M-H]-")  
fGroups <- groupFeatures(fList, "xcms3")

xcms.fgroups <- getXCMSnExp(fGroups) # convert the object

I get the following error:

Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'obj' in selecting a method for function 'getXCMSnExp': argument "set" is missing, with no default

What should I put in set? Thank you very much for your help.

Cheers

Boris

rickhelmus commented 2 months ago

Hi @Boris-Droz ,

In sets workflows, some functions have slightly different usage and parameters. You can recognize that in the reference manual by the bold marking (sets workflow) (see e.g. https://rickhelmus.github.io/patRoon/reference/xcms-conv.html). So, in this case you will need to specify the set name (typically "positive" or "negative"), as XCMS only supports one polarity.

To be honest, I would suggest first trying this in a regular workflow to simplify things a bit.

Thanks, Rick

Boris-Droz commented 2 months ago

Hi Rick,

Thank you for your guidance. I will try this soon and get back to you if new issues come up. When you say "regular workflow", what do you mean exactly? My guess is using the settings you provide in the tutorial and plugging my data into them. Is that correct? Best, Boris

Boris-Droz commented 2 months ago

Hi Rick,

I was able to move forward (steps 1 to 3 ran successfully). Thank you again.

But I am stuck on step 4, getting the peak lists with patRoon. My guess was to use the generateMSPeakLists function, but this needs an fGroups object. From XCMS I get a swath_spectra object by following the protocol from XCMS. Any idea how to re-inject my SWATH-deconvoluted data into the mslist to perform steps 5 and 6?

Best

Boris

rickhelmus commented 2 months ago

Hi @Boris-Droz ,

In the approach I assume that you start with a 'regular' patRoon workflow to get features first. So you can use the fGroups from step 1 to call generateMSPeakLists(). At the same time, you use the same fGroups to export the feature data to an XCMS object so you can get the SWATH data.

In step 5 you can use the delete() function to synchronize the peak lists obtained from patRoon with the SWATH data, i.e. by removing all the peaks that are not present in the XCMS SWATH data. You can use the delete() function in a similar way as we discussed in #95.

In step 6 you use the filtered peak lists (and fGroups from step 1) to continue the regular workflow.

Does that make more sense?
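
The flow described above (features first, then both the XCMS export and the peak lists derived from the same fGroups) could be sketched as below. This is only an outline under assumptions: it uses a regular, non-sets workflow, it reuses the calls and parameter values that appear elsewhere in this thread, and the function name swathWorkflowSketch() and its swathWindow argument are hypothetical. It needs the patRoon and xcms packages and real SWATH data to actually run, and it stops before step 5.

```r
# Sketch only: the steps are wrapped in a function, so nothing is executed
# until it is called with real data. 'swathWindow' is a hypothetical argument
# that is passed on to precursorMzWindow.
swathWorkflowSketch <- function(anaInfo, cwParam, swathWindow) {
    # Step 1: regular patRoon workflow to obtain feature groups
    fList <- patRoon::findFeatures(anaInfo, "xcms3", param = cwParam)
    fGroups <- patRoon::groupFeatures(fList, "xcms3")

    # Step 2: export the same feature groups to an XCMS object
    xdata <- patRoon::getXCMSnExp(fGroups, loadRawData = TRUE)

    # Step 3: SWATH peak detection and MS2 reconstruction in XCMS
    # (settings copied from the snippets in this thread, not tuned)
    xdata <- xcms::findChromPeaksIsolationWindow(xdata, param = cwParam)
    swathSpectra <- xcms::reconstructChromPeakSpectra(xdata, expandRt = 0,
                                                      diffRt = 2, minCor = 0.8)

    # Step 4: peak lists from patRoon, generated from the step 1 fGroups
    mslists <- patRoon::generateMSPeakLists(fGroups, "mzr",
                                            precursorMzWindow = swathWindow)

    # Steps 5 and 6 (delete() and the rest of the workflow) would follow here
    list(fGroups = fGroups, swathSpectra = swathSpectra, mslists = mslists)
}
```

Running the XCMS part once, outside the delete() callback, avoids repeating the reconstruction for every peak list.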

Boris-Droz commented 2 months ago

I think I get it.

Do you suggest that I put the deconvolution step inside your pseudocode and work on this?

If so, I currently have this pseudocode, which gave me some output I was expecting:

param.xcms <- xcms::CentWaveParam(ppm = 9.2,
                                  peakwidth = c(13.9, 164),
                                  snthresh = 10,
                                  prefilter = c(3, 100) )

mslists <- delete(mslists, j = function(pl, grp, ana, type) {

  if (type != "MSMS")
    return(TRUE) # don't delete any peaks in MS data

  #  XCMS DIA spectrum deconvolution step 
  ## see http://bioconductor.riken.jp/packages/3.10/bioc/vignettes/xcms/inst/doc/xcms-lcms-ms.html#3_swath_data_analysis
  plXCMS <- getXCMSnExp(fGroups, ana, set="negative", loadRawData=TRUE)
  plXCMS <- findChromPeaksIsolationWindow(plXCMS, param = param.xcms)
  plXCMS <- reconstructChromPeakSpectra(plXCMS, expandRt = 0,
                                        diffRt =2, minCor = 0.8)

  pl <- data.table::copy(pl)
  # mark all mass peaks that are within the XCMS spectrum as well (5 mDa tolerance)
  pl[, inXCMS := any(abs(mz - plXCMS$mz) <= 0.005)]

  return(pl$inXCMS == FALSE) # and remove all others
})

I need to test it against a known standard sample to see how the deconvolution results compare to what I would expect.

Thank you for the guidance, I feel that I am getting close.

Best

Boris

Boris-Droz commented 1 month ago

Hi Rick, I am stuck again. I get this error with the function posted above:

Error in [.data.table(pl, , :=(inXCMS, any(abs(mz - plXCMS$mz) <=  : 
  Supplied 923 items to be assigned to 25 items of column 'inXCMS'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.

I tried a couple of things but I am not sure where exactly the issue is. The three steps

 #  XCMS DIA spectrum deconvolution step 
  ## see http://bioconductor.riken.jp/packages/3.10/bioc/vignettes/xcms/inst/doc/xcms-lcms-ms.html#3_swath_data_analysis
  plXCMS <- getXCMSnExp(fGroups, ana, set="negative", loadRawData=TRUE)
  plXCMS <- findChromPeaksIsolationWindow(plXCMS, param = param.xcms)
  plXCMS <- reconstructChromPeakSpectra(plXCMS, expandRt = 0,
                                        diffRt =2, minCor = 0.8)

work well outside patRoon, but my guess is that the re-injection into pl$inXCMS is causing the error. I am not sure what to test right now. Can you guide my efforts? Thank you again for everything.
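
The error message itself says that the right-hand side of the assignment had 923 values while the peak list has 25 rows, so the expression did not reduce to one value per peak. One possible cause, which is an assumption because the structure of plXCMS is not shown in this thread, is that plXCMS$mz holds one m/z vector per reconstructed spectrum instead of a single numeric vector. The base R sketch below uses made-up data to show how to flatten such a list and check the lengths before assigning.

```r
# Toy peak list m/z values (made up)
peakMz <- c(100.05, 150.10, 200.20)

# Hypothetical stand-in for plXCMS$mz: one m/z vector per spectrum
specMz <- list(c(100.052, 120.000),
               c(200.199, 310.500, 420.700))

# Flatten to a plain numeric vector before comparing
refMz <- unlist(specMz, use.names = FALSE)

# One TRUE/FALSE per peak: does any reference m/z fall within 5 mDa?
hits <- vapply(peakMz, function(m) any(abs(m - refMz) <= 0.005), logical(1))

# Guard against the length mismatch reported in the error
stopifnot(length(hits) == length(peakMz))
print(hits)
```

Pooling all spectra like this is only for illustration. In practice only the spectrum that belongs to the feature at hand should be compared, which is what the getXCMSSpec(grp, ana) placeholder is meant to select.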

rickhelmus commented 1 month ago

Hi Boris,

Few quick comments:

Boris-Droz commented 4 weeks ago

Hi Rick, sorry to bother you, but I am trying to get something working the way you guided me and I am stuck. I get some reconstructed MS2 SWATH spectra, but I cannot figure out how to write a getXCMSSpec(grp, ana) function that lets me synchronise the MS2 from SWATH with the mslist from patRoon. Is there a way you could help me?

Boris-Droz commented 4 weeks ago

Thank you so much in advance.
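
One way to structure getXCMSSpec() is as a plain lookup, sketched below in base R. Everything here is hypothetical: it assumes that two ordinary R objects have already been built from the XCMS result, namely a table (peakMap) that links each patRoon feature group and analysis to an XCMS peak ID, and a named list (swathSpecs) with one reconstructed spectrum per peak ID. How to derive these two objects from the XCMS output is not covered in this thread, and all names and values are made up.

```r
# Hypothetical map: feature group + analysis -> XCMS chromatographic peak ID
peakMap <- data.frame(
    group = c("M120_R300_1", "M120_R300_1", "M250_R410_2"),
    analysis = c("sample-A", "sample-B", "sample-A"),
    peak_id = c("CP01", "CP07", "CP02"),
    stringsAsFactors = FALSE
)

# Hypothetical reconstructed SWATH spectra, keyed by peak ID
swathSpecs <- list(
    CP01 = data.frame(mz = c(77.039, 105.034), intensity = c(800, 300)),
    CP07 = data.frame(mz = c(77.038, 105.033, 91.054), intensity = c(750, 280, 90)),
    CP02 = data.frame(mz = 133.065, intensity = 400)
)

getXCMSSpec <- function(grp, ana, peakMap, swathSpecs) {
    sel <- peakMap$group == grp
    # ana is NULL for feature group averaged peak lists: pool all analyses
    if (!is.null(ana))
        sel <- sel & peakMap$analysis == ana
    ids <- intersect(peakMap$peak_id[sel], names(swathSpecs))
    if (length(ids) == 0)
        return(data.frame(mz = numeric(), intensity = numeric()))
    do.call(rbind, unname(swathSpecs[ids]))
}

print(getXCMSSpec("M120_R300_1", "sample-A", peakMap, swathSpecs))
```

Inside the delete() callback the two extra arguments can simply be taken from the enclosing environment, so that the call still reads getXCMSSpec(grp, ana). Returning an empty table when nothing is found means that all MS/MS peaks of that feature would be removed; whether that is the desired behaviour is a design choice.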