NiklasPfister / adaXT

adaXT: tree-based machine learning in Python
https://niklaspfister.github.io/adaXT/
BSD 3-Clause "New" or "Revised" License

Predict weight matrix #15

Closed svbrodersen closed 10 months ago

svbrodersen commented 12 months ago

Given n data points for training and m new data points, let x_j be a training data point and y_i be a new data point. Then let A be the (m x n) matrix where A_i,j is 1 if x_j and y_i are in the same LeafNode, and 0 otherwise.
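As a rough illustration of the matrix described above, here is a minimal NumPy sketch. It assumes the leaf each point falls into is already known as an integer id; the function name `leaf_weight_matrix` and the id arrays are hypothetical and not part of adaXT.

```python
import numpy as np


def leaf_weight_matrix(new_leaf_ids, train_leaf_ids):
    """Return the (m x n) 0/1 matrix A with A[i, j] = 1 iff y_i and x_j share a leaf.

    Both arguments are hypothetical 1-D arrays of integer leaf ids, one per point.
    """
    new_leaf_ids = np.asarray(new_leaf_ids)
    train_leaf_ids = np.asarray(train_leaf_ids)
    # Broadcasting compares every new point (rows) with every training point (columns).
    return (new_leaf_ids[:, None] == train_leaf_ids[None, :]).astype(np.int64)


train_leaf_ids = [0, 0, 1, 2]  # n = 4 training points
new_leaf_ids = [1, 0]          # m = 2 new points
A = leaf_weight_matrix(new_leaf_ids, train_leaf_ids)
print(A)
```

Row i of A marks the training points that share a leaf with new point y_i.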

NiklasPfister commented 11 months ago

I thought a bit more about this and think it would be nice to have two functions for this:

  1. A function that computes the in-sample weight matrix (this is the (n x n) weight matrix that we already compute and that can be computed quickly).
  2. A function that takes a new matrix of X-points, say Xnew of shape (m x d), finds where in the fitted tree each of the m points lands (similar to the predict function), and then computes an (m x m) weight matrix indicating which new points lie in the same leaf nodes. This should be in Cython, similar to the predict function.
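The two functions above could be prototyped in plain Python before moving to Cython. The sketch below is only an illustration under stated assumptions: `TOY_TREE` is a hypothetical hand-built stand-in for a fitted tree, and `find_leaf`, `leaf_ids` and `same_leaf_matrix` are made-up names, not adaXT functions.

```python
import numpy as np

# Hypothetical stand-in for a fitted tree: an internal node is a tuple
# (feature_index, threshold, left_child, right_child); a leaf is an integer id.
TOY_TREE = (0, 0.5, 0, (1, 0.5, 1, 2))


def find_leaf(tree, x):
    """Walk from the root to a leaf, the way a predict routine would."""
    node = tree
    while not isinstance(node, int):
        feature, threshold, left, right = node
        node = left if x[feature] <= threshold else right
    return node


def leaf_ids(tree, X):
    """Leaf id for each row of X."""
    return np.array([find_leaf(tree, x) for x in np.asarray(X, dtype=float)])


def same_leaf_matrix(ids):
    """Square 0/1 matrix with entry (i, j) = 1 iff points i and j share a leaf."""
    return (ids[:, None] == ids[None, :]).astype(np.int64)


X_train = np.array([[0.1, 0.2], [0.3, 0.9], [0.8, 0.1], [0.9, 0.7]])  # (n x d)
X_new = np.array([[0.2, 0.5], [0.7, 0.4], [0.6, 0.2]])                # (m x d)

W_in = same_leaf_matrix(leaf_ids(TOY_TREE, X_train))  # 1. in-sample, (n x n)
W_new = same_leaf_matrix(leaf_ids(TOY_TREE, X_new))   # 2. new points, (m x m)
print(W_in)
print(W_new)
```

Both matrices come from the same helper; only the set of points routed through the tree differs. In a real implementation the leaf lookup would presumably reuse the traversal that predict already performs.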