gluc / data.tree

General Purpose Hierarchical Data Structure for R
http://gluc.github.io/data.tree

sensitivity analysis in a decision tree #26

Open vnijs opened 8 years ago

vnijs commented 8 years ago

Hi @gluc

If you want to do some sensitivity analysis on a decision tree, you could manually change some of the probabilities and/or payoffs in the yaml file and then rerun the analysis. However, because the probabilities of a chance node's outcomes are linked (they must sum to 1), this is likely to cause input errors, especially if a subtree is repeated several times.

Taking your jennylind example, which I am using at https://internal.shinyapps.io/vnijs/quant/?SSUID=b04da779d8, it would be useful to allow a user to provide input like the below in a text input:

Small Box Office <- c(.1, .2, .3)
Median Box Office <- 1 - Small Box Office

I'd pass these values to a function that would loop through the input values, update the relevant values in the Node object, and then re-calculate the tree for each value.

Ultimately I'd like to use something like expand.grid so multiple inputs could be easily updated across a range of values (e.g., payoffs for Small Box Office from 200K to 400K in 50K increments together with changing probabilities to generate something like a spider plot).
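
As a rough illustration of the expand.grid idea, the sketch below builds every combination of a probability and a payoff for one chance node and computes the expected value for each row. It uses base R only and does not touch data.tree; the variable names, the fixed p_large, and the medium and large payoffs are all assumptions made for the example.

```r
# Hypothetical inputs: names and values are illustrative only.
grid <- expand.grid(
  p_small      = c(0.1, 0.2, 0.3),
  payoff_small = seq(200000, 400000, by = 50000)
)

p_large       <- 0.1      # assumed to be held fixed
payoff_medium <- 1000000  # assumed payoff
payoff_large  <- 3000000  # assumed payoff

# The medium probability is derived, so each row sums to 1 by construction.
grid$p_medium <- 1 - grid$p_small - p_large

# Expected value of the chance node for every input combination.
grid$ev <- with(grid,
  p_small * payoff_small + p_medium * payoff_medium + p_large * payoff_large
)

head(grid)
```

Each row of grid is one scenario, so the result can be passed directly to a plotting function to draw something like a spider plot.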

I think it would be reasonably straightforward if I could use something like ToDataFrameTaxonomy, update the table, and then convert back to a Node object. However, I doubt that would be very efficient.

Any suggestions on how to best update specific Node values in an iterative fashion for sensitivity analysis?

Thanks

gluc commented 8 years ago

Got it, and makes sense! I have no time right now to try it out, but my thinking goes along these lines:

  1. you can already now decorate a Node with a function (e.g. via Set), and fetch the calculated value with Get. There is an example of this in the latest data.tree vignette in the dev branch. So, you wouldn't have to recalculate actively.
  2. you could also store a vector instead of a single value as a Node attribute.

So, you could do, say:

smallBoxOffice$p <- c(0.1, 0.2, 0.2)
medianBoxOffice$p <- function(self) 1 - self$parent$SmallBoxOffice$p

This is just a sketch of course, and you still need a syntax to let the user define that. But as a building block it's quite cool actually.

For the syntax, what would be really cool is to build that into the yaml definition. So you would have like:

name: Jenny Lind
type: decision
Sign with Movie Company:
    type: chance
    Small Box Office:
        p: [0.1, 0.2, 0.2]
        payoff: 200000
    Medium Box Office:
        p: "function(self) 1 - self$parent$`Small Box Office`$p"
        payoff: 1000000
    Large Box Office:
        p: 0.1
etc.

So then, all that's needed is to make sure the function is evaluated, which sounds achievable.
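
One possible way to evaluate such a function, sketched in base R: a string that looks like a function definition is parsed into a real function, and a small resolver calls it with the node when the field is read. The helpers as_node_function and resolve are hypothetical names, and plain nested lists stand in for data.tree Node objects, so this is not how data.tree itself does it.

```r
# Hypothetical helper: convert a yaml string such as "function(self) ..."
# into an R function; any other value is returned unchanged.
as_node_function <- function(x) {
  if (is.character(x) && length(x) == 1 && grepl("^\\s*function\\s*\\(", x)) {
    f <- eval(parse(text = x))
    stopifnot(is.function(f))
    return(f)
  }
  x
}

# Hypothetical resolver: call function-valued fields with the node itself.
resolve <- function(node, field) {
  v <- node[[field]]
  if (is.function(v)) v(node) else v
}

# Plain lists standing in for nodes (illustrative values).
small  <- list(p = c(0.1, 0.2, 0.3))
parent <- list(`Small Box Office` = small)
medium <- list(
  parent = parent,
  p = as_node_function("function(self) 1 - self$parent$`Small Box Office`$p")
)

p_medium <- resolve(medium, "p")
p_medium
```

Note that eval(parse()) runs arbitrary code, so in a Shiny app that accepts user input the string would need to be validated or sandboxed first.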

The second part of your question, the expand.grid part, I'm sure there is a nice solution to this, let me think about that for some time.

vnijs commented 8 years ago

Thanks for the input @gluc. I used Set to specify a new probability (ps). Since there are multiple instances of 'Small Box Office' I used filterFun as shown in the vignette. However, the same approach doesn't seem to work with a function (see below).

ps <- .4
object$jl$Set(ps, filterFun = function(x) x$name == "Small Box Office")

pf <- function(self) 1 - self$parent$SmallBoxOffice$ps
object$jl$Set(pf, filterFun = function(x) x$name == "Medium Box Office")

Maybe it would be easiest to have the user specify p <- c(.5, .3, .2) and then assign that to all sets of Small/Medium/Large Box Office (see below). I could generate different input vectors and recalculate the tree's EV for each one.

p <- c(.5, .3, .2)
object$jl$Set(ps, filterFun = function(x) x$name %in% c("Small Box Office","Medium Box Office", "Large Box Office"))
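
To make the "recalculate the tree's EV for each input vector" step concrete, here is a minimal base R sketch. It stores the tree as nested lists rather than data.tree Nodes, and rollback, make_tree, and all payoffs are hypothetical. Chance nodes return the probability-weighted sum of their children and decision nodes return the best child.

```r
# Roll back a tree stored as nested lists (a stand-in for a Node object).
rollback <- function(node) {
  if (is.null(node$children)) return(node$payoff)      # leaf: terminal payoff
  values <- vapply(node$children, rollback, numeric(1))
  if (node$type == "chance") {
    probs <- vapply(node$children, function(ch) ch$p, numeric(1))
    stopifnot(isTRUE(all.equal(sum(probs), 1)))         # guard against input errors
    sum(probs * values)
  } else {
    max(values)                                         # decision: pick best branch
  }
}

# Build a tree in which the same probability vector p is reused in every
# Small/Medium/Large subtree (payoffs are assumed values).
make_tree <- function(p) {
  outcomes <- function(payoffs) {
    Map(function(prob, pay) list(p = prob, payoff = pay), p, payoffs)
  }
  list(type = "decision", children = list(
    list(type = "chance", children = outcomes(c(200000, 1000000, 3000000))),
    list(type = "chance", children = outcomes(c(900000, 900000, 900000)))
  ))
}

scenarios <- list(c(0.5, 0.3, 0.2), c(0.3, 0.4, 0.3), c(0.2, 0.3, 0.5))
ev <- vapply(scenarios, function(p) rollback(make_tree(p)), numeric(1))
ev
```

Rebuilding the tree per scenario is simple but not efficient; updating the values in place on an existing tree would avoid the repeated construction.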

gluc commented 8 years ago

Only a quick note on the function assignment: I believe that was only possible in the dev branch. However, that's now pushed to the master, so it should work now, and there's an example in the data.tree vignette. Regarding the assignment of a vector: using the Set, that won't work as of now, because Set tries to recycle the vector and assign a single value per Node. However, it should work using Do, e.g. along the lines of:

p <- c(.5, .3, .2)
object$jl$Do(function(node) node$ps <- p, 
             filterFun = function(x) x$name %in% c("Small Box Office","Medium Box Office", "Large Box Office"))

gluc commented 8 years ago

#29 would also be helpful for getting the various probabilities in the sensitivity analysis, which is not possible at the moment.

gluc commented 8 years ago

Is this still something you'd like to do? Got some ideas how to do sensitivity analysis in a very generic way but might need some feedback at a later stage.

vnijs commented 8 years ago

I'm definitely interested! Just haven't had time to work on it. Happy to help out in any way.

gluc commented 8 years ago

@vnijs : Check out https://github.com/gluc/ahp. I've implemented many of the above ideas for the Analytic Hierarchy Process. Might be time for a dedicated DecisionTree package ;-)

glennmschultz commented 8 years ago

Hi Christoph, Yeah, that sounds awesome! I have a question: can I create siblings in data.tree?

Glenn


On Jan 15, 2016, at 6:51 AM, Christoph Glur notifications@github.com wrote:

@vnijs : Check out https://github.com/gluc/ahp. I've implemented many of the above ideas for the Analytic Hierarchy Process. Might be time for a dedicated DecisionTree package ;-)

gluc commented 8 years ago

@glennmschultz Not directly. I assume you want to add it at a specific position. Good idea! I opened #41

glennmschultz commented 8 years ago

Hi Christoph,

Yes, in some structures like PAC/Companion one can think of the companion bond as a sibling or a child. However, consider a companion whose cash flow is not directed to a tranche per se but split into a floater and an inverse floater, which receive principal pro rata. These could be represented as child nodes as well, but I think they are better represented as siblings. I suppose I can create more than one child at the same position and these would be siblings, correct? Does data.tree allow more than one child at a given position?

Glenn

On Jan 15, 2016, at 07:56 AM, Christoph Glur notifications@github.com wrote:

@glennmschultz Not directly. I assume you want to add it at a specific position. Good idea! I opened #41

gluc commented 8 years ago

@glennmschultz Yes, absolutely, it's possible, check out e.g.

library(data.tree)
data(acme)
acme
acme$AddChild("Marketing")$AddChild("Web")$parent$AddChild("Print")
acme

What I would like to add is the possibility to add a sibling just after a node. Let's continue the discussion in #41

vnijs commented 8 years ago

@gluc I added basic sensitivity analysis for decision analysis to Radiant. Documentation and install instructions below. My main goal was to keep user input as simple as possible. Would be happy to hear any comments/suggestions.

https://radiant-rstats.github.io/docs/model/dtree.html
https://radiant-rstats.github.io/docs/install.html

gluc commented 8 years ago

@vnijs awesome, will check it out!