bethatkinson / rpart

Recursive Partitioning and Regression Trees

Accessing sample indexes for custom splitting (Not covered by documentation) #27

Open hcbh96 opened 3 years ago

hcbh96 commented 3 years ago

Hello there.

I am currently writing some custom split methods that aim to deal with non-iid datasets. I've been following the code in tests/usersplits.R to write the custom methods I have in mind. However, I have run into a problem.

The method I have in mind requires me to access subsets of the initial dataset at each node and each candidate subnode; this information will then be used to change the goodness of the split. One way of thinking about this is that I need to know which rows of the initial dataset are passed into the temp2 function in tests/usersplits.R. With those row indices I can read in the required values from the GlobalEnv.

Another way of putting this is that I need to know the corresponding row for each y passed into the function named temp2 in tests/usersplits.R. Is there a good way of accessing this info?

This is the temp2 function from tests/usersplits.R:


# The split function, where most of the work occurs.
#   Called once per split variable per node.
# If continuous=T
#   The actual x variable is ordered
#   y is supplied in the sort order of x, with no missing,
# ...
temp2 <- function(y, wt, x, parms, continuous) {

    {...}

    }

Please let me know if there is any way of accessing this information from within the temp2 function.
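
For reference, here is a minimal sketch of one workaround I am considering, not something the documentation confirms as the intended approach. It assumes that a user-written method may declare a multi-column response through numy in the init function, so a row index can travel as a second column of y and reach the split function in the same order as y. The names init_fn, eval_fn, split_fn, seen and the toy data frame d are all hypothetical; the anova-style goodness is adapted from tests/usersplits.R, and categorical predictors are left out.

```r
library(rpart)

set.seed(1)
n <- 60
d <- data.frame(x1 = runif(n), x2 = runif(n))
d$y  <- 2 * (d$x1 > 0.5) + rnorm(n, sd = 0.3)
d$id <- seq_len(n)            # row index that travels with y

# Hypothetical log, used only to show that the ids arrive intact
seen <- new.env()
seen$largest <- numeric(0)    # ids of the largest node evaluated (the root)
seen$aligned <- TRUE          # stays TRUE while ids line up with x

# init: declare a two-column response (numy = 2)
init_fn <- function(y, offset, parms, wt) {
    if (!is.null(offset)) y[, 1] <- y[, 1] - offset
    list(y = y, parms = 0, numresp = 1, numy = 2,
         summary = function(yval, dev, wt, ylevel, digits) {
             paste("  mean=", format(signif(yval, digits)), sep = "")
         })
}

# eval: column 1 is the response, column 2 is the row index
eval_fn <- function(y, wt, parms) {
    ids <- y[, 2]
    if (length(ids) > length(seen$largest)) seen$largest <- ids
    resp  <- y[, 1]
    wmean <- sum(resp * wt) / sum(wt)
    list(label = wmean, deviance = sum(wt * (resp - wmean)^2))
}

# split: ids are the rows of d for this node, in the sort order of x
split_fn <- function(y, wt, x, parms, continuous) {
    if (!continuous) stop("categorical predictors are not handled in this sketch")
    ids  <- y[, 2]
    resp <- y[, 1]
    # Check that the recovered rows match the x values we were handed;
    # d[ids, ] or any other per-row data could be consulted here instead.
    ok <- isTRUE(all.equal(x, d$x1[ids])) || isTRUE(all.equal(x, d$x2[ids]))
    if (!ok) seen$aligned <- FALSE

    # Same anova-style goodness as temp2 in tests/usersplits.R
    m <- length(resp)
    resp <- resp - sum(resp * wt) / sum(wt)
    temp <- cumsum(resp * wt)[-m]
    left.wt  <- cumsum(wt)[-m]
    right.wt <- sum(wt) - left.wt
    lmean <- temp / left.wt
    rmean <- -temp / right.wt
    list(goodness  = (left.wt * lmean^2 + right.wt * rmean^2) / sum(wt * resp^2),
         direction = sign(lmean))
}

fit <- rpart(cbind(y, id) ~ x1 + x2, data = d,
             method  = list(eval = eval_fn, split = split_fn, init = init_fn),
             control = rpart.control(minsplit = 10, xval = 0))
```

If this assumption holds, the goodness calculation in split_fn could be replaced by one that looks up d[ids, ] or any other per-row information. The obvious cost is that every callback has to separate the real response from the index column.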