timmens / causal-forest

Implements the Causal Forest algorithm formulated in Athey and Wager (2018).
MIT License

Add link between parent and child node, and restructure `fitcausaltree` output. #1

Closed (timmens closed this issue 4 years ago)

timmens commented 4 years ago

The output of the functions `_fit` and `fitcausaltree` is poorly structured and records no connection between parent and child nodes. This makes visualization and prediction using the fitted tree impossible.

I aim to implement two things.

  1. Extend the output of `_fit` and `fitcausaltree` such that every tree node has a unique identification number and knows the identification number of its parent node.
  2. Implement a function which scans the fitted tree array and adds to each node the identification numbers of its left and right children.

With these features implemented, it should be easy to add further functions, e.g. prediction.
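The second step can be sketched roughly as follows. This is only an illustration under assumptions: the node layout (a list of dicts with the keys `id`, `parent_id` and `is_left`) and the function name `add_child_links` are hypothetical and are not the actual output format of `fitcausaltree`.

```python
def add_child_links(nodes):
    """Return copies of the nodes with 'left_child' and 'right_child' ids added.

    Assumed (hypothetical) layout: each node is a dict with keys
    'id', 'parent_id' (None for the root) and 'is_left'.
    """
    # Copy every node and start with no known children (leaves keep None).
    linked = {
        node["id"]: {**node, "left_child": None, "right_child": None}
        for node in nodes
    }
    # One scan over the tree: each node registers itself with its parent.
    for node in nodes:
        parent_id = node["parent_id"]
        if parent_id is None:
            continue  # the root has no parent to register with
        key = "left_child" if node["is_left"] else "right_child"
        linked[parent_id][key] = node["id"]
    return list(linked.values())


# Small example tree: node 0 is the root, node 1 is split again.
example_nodes = [
    {"id": 0, "parent_id": None, "is_left": None},
    {"id": 1, "parent_id": 0, "is_left": True},
    {"id": 2, "parent_id": 0, "is_left": False},
    {"id": 3, "parent_id": 1, "is_left": True},
    {"id": 4, "parent_id": 1, "is_left": False},
]
linked_nodes = add_child_links(example_nodes)
```

With both child ids stored on each node, a prediction routine could start at the root and follow `left_child` or `right_child` until it reaches a node where both are `None`.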

timmens commented 4 years ago

I've written a function that assigns each tree node a unique integer identification number, given the identification number of its parent and the level. Since this function is bijective, the parent and children of any node follow directly from its id. During execution, however, I ran into an overflow error. Hence I have to either (i) implement another identification algorithm or (ii) restrict the maximum depth of a tree to some value. Since restricting the maximum depth is common practice, I will open another issue on its implementation.
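To illustrate why such an id scheme can overflow, here is a sketch using the common heap-style numbering (root `0`, children `2 * i + 1` and `2 * i + 2`). This is an assumption for illustration, not necessarily the function used in this repository. Ids roughly double with every level, so a fixed-width integer type runs out of room at a modest depth. All function names below are hypothetical.

```python
INT64_MAX = 2**63 - 1  # largest value a signed 64-bit integer can hold


def left_child(node_id: int) -> int:
    """Id of the left child under heap-style numbering."""
    return 2 * node_id + 1


def right_child(node_id: int) -> int:
    """Id of the right child under heap-style numbering."""
    return 2 * node_id + 2


def parent(node_id: int) -> int:
    """Invert the child mappings; the root (id 0) has no parent."""
    if node_id == 0:
        raise ValueError("the root has no parent")
    return (node_id - 1) // 2


def max_id_at_depth(depth: int) -> int:
    """Largest id on a level, i.e. the node reached by always going right."""
    return 2 ** (depth + 1) - 2


def max_depth_for(limit: int = INT64_MAX) -> int:
    """Deepest level whose ids all still fit below `limit`."""
    depth = 0
    while max_id_at_depth(depth + 1) <= limit:
        depth += 1
    return depth
```

Python's built-in integers do not overflow, so the limit is made explicit here. With ids stored in a signed 64-bit array, this particular scheme would support trees up to the depth returned by `max_depth_for()`, which is one way to motivate a maximum depth parameter.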

timmens commented 4 years ago

Follow-up: I decided to reimplement how nodes are given ids, which will be addressed in another issue (#3).

Maximum depth: Another issue has been opened to address the maximum depth problem (#2).