BelindaHernandez / bartBMA


tree_prior in get_best_split_sum #8

Open EoghanONeill opened 5 years ago

EoghanONeill commented 5 years ago

Lines 1202 and 1218 of BARTBMA_SumTreeLikelihood.cpp add together the prior probabilities of all the trees in the sum of tree model [and then tree_prior is included in the "BIC" in line 1226, presumably for a closer approximation to the posterior probability].

I'm not sure it is correct to add the probabilities of the trees together. Should the probabilities instead be multiplied (equivalently, should their logarithms be summed)?
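
To make the distinction concrete, here is a minimal sketch (the function names and the vector-of-probabilities interface are illustrative, not the package's actual code): summing the raw probabilities does not give a joint prior, whereas summing log-priors is equivalent to taking the product over trees.

```cpp
#include <cmath>
#include <vector>

// tree_priors holds the prior probability of each tree's structure in the
// sum-of-trees model (each a value in (0, 1]); it stands in for whatever is
// computed per tree at lines 1202 and 1218.

// As currently written: a sum of probabilities. This can exceed 1 and is not
// the joint prior of the sum-of-trees model.
double summed_prior(const std::vector<double>& tree_priors) {
  double tree_prior = 0.0;                       // double tree_prior = 0;
  for (double p : tree_priors) tree_prior += p;  // tree_prior += ...
  return tree_prior;
}

// If the trees are a priori independent, the joint prior is the product of the
// per-tree priors; summing log-priors gives the same quantity on the log
// scale, which is also the natural scale for adding it to a "BIC"-type term.
double joint_log_prior(const std::vector<double>& tree_priors) {
  double log_tree_prior = 0.0;
  for (double p : tree_priors) log_tree_prior += std::log(p);
  return log_tree_prior;
}
```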

I also don't know whether the sum-of-tree priors are well defined when different sum-of-tree models use different numbers of trees. [The probability of no splits at all could be used for the trees that are not included, but that actually gives the probability of a tree stump, not the probability of the tree being absent.] For the prior to be well defined while allowing different numbers of trees, one could either place a prior on the probability of each tree occurring at all or place a prior on the number of trees.

Furthermore, the tree prior should perhaps, in principle, also account for the probabilities of the splitting variables and splitting points. In the standard BART model these are uniform priors; Linero (2018) suggests a Dirichlet hyperprior on the prior probabilities of the splitting variables. I don't know whether splitting-point and splitting-variable priors would lead to much improvement in the model weighting, nor what the computational cost of including them would be.

Linero, A. R. (2018). Bayesian regression trees for high-dimensional prediction and variable selection. Journal of the American Statistical Association, 1-11.
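
For reference, a rough sketch of what a fuller single-tree prior could look like, combining the Chipman et al. (2010) depth-dependent split probability alpha * (1 + d)^(-beta) with the splitting-rule probabilities (the Node struct and function below are illustrative only, not the package's data structures):

```cpp
#include <cmath>
#include <vector>

// Illustrative node description; not the package's data structure.
struct Node {
  int depth;         // depth of the node (root = 0)
  bool is_internal;  // true if the node splits
  int split_var;     // index of the splitting variable (internal nodes only)
  int n_cutpoints;   // number of available cutpoints for that variable
};

// Log prior of a single tree: a node at depth d splits with probability
// alpha * (1 + d)^(-beta); given a split, variable j is chosen with
// probability split_var_prob[j] and the cutpoint uniformly from n_cutpoints.
// split_var_prob[j] = 1/p is the standard uniform choice; Linero (2018)
// instead places a Dirichlet prior on these weights.
double log_tree_prior(const std::vector<Node>& nodes,
                      const std::vector<double>& split_var_prob,
                      double alpha, double beta) {
  double lp = 0.0;
  for (const Node& nd : nodes) {
    double p_split = alpha * std::pow(1.0 + nd.depth, -beta);
    if (nd.is_internal) {
      lp += std::log(p_split);                              // node splits
      lp += std::log(split_var_prob[nd.split_var]);         // variable choice
      lp -= std::log(static_cast<double>(nd.n_cutpoints));  // cutpoint choice
    } else {
      lp += std::log(1.0 - p_split);                        // terminal node
    }
  }
  return lp;
}
```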

Edit: I also note that get_best_split_sum contains the comment "//at the moment tree prior is only for current tree need to get it for entire sum of tree list.", but tree_prior is actually the sum of the priors of more trees than just proposal_tree.

EoghanONeill commented 5 years ago

I can think of three alternatives:

  1. Replace += with *= [and replace double tree_prior=0 with double tree_prior=1 in get_best_split_sum] and don't change anything else. This results in much greater prior probabilities being applied to models with one or a few trees. The prior might not be well defined. [I don't recommend this approach].

  2. Just consider models with a certain number of trees (and replace += with *= [and replace double tree_prior=0 with double tree_prior=1 in get_best_split_sum]), e.g. only consider models that are sums of 5 trees. This is the approach used in standard BART (Chipman et al. 2010). It might be necessary to remove some of the C++ code that keeps models obtained in previous rounds (i.e. rounds with fewer than the maximum number of trees). Chipman et al. (2010) also suggest that cross-validation can be used to determine the number of trees.

  3. Place a prior on the number of trees in the model (and replace += with *= [and replace double tree_prior=0; with double tree_prior=1 in get_best_split_sum]). The possibility of this approach was noted by Chipman et al. (2010), but I can't find any examples of it or any suggested priors for the number of trees. A simple prior would put an inclusion probability (e.g. 0.5) on each tree up to some maximum number of trees, but it might be preferable to specify the prior probabilities of different numbers of trees directly; for example, it might be preferable for 1, 2 or 3 trees to be less probable than 4 or 5 trees. A rough sketch of this option is given below.
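
To illustrate option 3, here is a rough sketch (all names are illustrative, not the package's code) of a log-scale sum-of-trees prior that multiplies the individual tree priors and adds a prior on the number of trees:

```cpp
#include <cmath>
#include <vector>

// The model prior is the product of the individual tree priors times a prior
// on the number of trees. Working on the log scale, the existing "+="
// accumulation can be kept but applied to log-priors, which is equivalent to
// replacing "+=" with "*=" on the probability scale.
double log_sum_of_trees_prior(
    const std::vector<double>& tree_priors,      // per-tree prior probabilities
    const std::vector<double>& num_trees_prior)  // pmf over 1..max number of trees
{
  double log_prior = 0.0;
  for (double p : tree_priors) log_prior += std::log(p);  // product of tree priors
  // prior on the number of trees (index m - 1 corresponds to m trees;
  // assumes the model contains at least one tree)
  log_prior += std::log(num_trees_prior[tree_priors.size() - 1]);
  return log_prior;
}
```

Specifying num_trees_prior directly makes it straightforward to, e.g., make models with 1, 2 or 3 trees less probable than models with 4 or 5 trees, rather than relying on a fixed per-tree inclusion probability.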