rietho / IPO

A Tool for automated Optimization of XCMS Parameters
http://bioconductor.org/packages/IPO/

Determining the number of xcms runs and DoEs for an optimization run #59

Open etrh opened 6 years ago

etrh commented 6 years ago

I read in issue #39 that for the optimization of three parameters, xcms is called 17 times in each DoE.

How is the number of xcms runs calculated from the parameters? Where does that 17 come from for 3 parameters? And how do we determine the number of DoEs required for a run?

For example, if I have the following set-up:

peakpickingParameters <- IPO::getDefaultXcmsSetStartingParams('centWave')
peakpickingParameters$min_peakwidth <- c(4,10)
peakpickingParameters$max_peakwidth <- c(15, 35)
peakpickingParameters$ppm <- c(10,30)
peakpickingParameters$snthresh <- c(2,100)
peakpickingParameters$noise <- 6000
peakpickingParameters$prefilter <- c(3,10)
peakpickingParameters$value_of_prefilter <- c(3000,30000)
peakpickingParameters$fitgauss <- TRUE

How many xcms runs will I have for each DoE and how many DoEs will be run in total?

rietho commented 6 years ago

That's a good question. I'll briefly go through the underlying questions one by one and hope that this answers yours.

What kind of DOE does IPO rely on? IPO uses the central composite design (CCD) as its DOE. Here's the link to the documentation with a reference article.

How does that determine the number of runs per DOE? Here's a formula for the runs per DOE: [image: formula for the runs per DOE] (the words are German, sorry for that. Translation: Aufrufe = runs, für = for). So for the three-parameter case you mention, this should actually be 16 (if my formula is right, which means my statement in #39 is slightly off).

Notes about the two parts of the formula:

  1. This link describes the part within the parentheses. The additional +1 is due to one additional run on the optimal point found.
  2. It was a design decision to use 9 runs, equally spaced from the minimum to the maximum, when only a single parameter is optimized.
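
As a rough sketch of the two notes above: the snippet below assumes that the part within the parentheses is the usual CCD run count with a single centre point, 2^k + 2k + 1 for k parameters. The formula image is not reproduced here, so that is an assumption, although it does give 16 for three parameters. The helper names `n_to_optimize` and `runs_per_doe` are hypothetical and not part of IPO, and treating every parameter given as a length-2 range as "to be optimized" is likewise an assumption.

```r
# Hypothetical helpers, not part of IPO.

# Count the parameters given as a c(min, max) range.
# Assumption: exactly these are the ones that get optimized.
n_to_optimize <- function(params) {
  sum(lengths(params) == 2)
}

# Expected number of xcms calls in one DoE for k optimized parameters.
# Assumption: central composite design with one centre point.
runs_per_doe <- function(k) {
  stopifnot(k >= 1)
  if (k == 1) {
    return(9)                  # note 2: 9 equally spaced values from min to max
  }
  ccd_runs <- 2^k + 2 * k + 1  # factorial + axial + centre points
  ccd_runs + 1                 # note 1: one extra run on the optimum found
}

# Illustrative parameter list with three ranges and two fixed values
example_params <- list(
  ppm      = c(10, 30),
  snthresh = c(2, 100),
  mzdiff   = c(-0.001, 0.01),
  noise    = 6000,
  fitgauss = TRUE
)

k <- n_to_optimize(example_params)  # 3
runs_per_doe(k)                     # 16
```

Keep in mind that any ranges left at their defaults by `getDefaultXcmsSetStartingParams` count as well, so it is worth checking the full parameter list rather than only the values you set yourself.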

How many DOEs does IPO run? IPO stops the optimisation when the DOE process no longer finds any improvement, so that's entirely problem-dependent. However, there's a maximum limit of 50 runs implemented.

@etrh I'll leave the calculation of the number of runs for your setup to you ;)