sebp / scikit-survival

Survival analysis built on top of scikit-learn
GNU General Public License v3.0

RandomSurvivalForest Not Influenced By Class Weights #470

Closed marcoistasy closed 4 months ago

marcoistasy commented 4 months ago

(May be related to this and this issue.)

Passing sample weights to RandomSurvivalForest seems to have no influence on the model outcome. Altering sw in the code below does not change the model score.

import numpy as np

from sklearn import set_config
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OrdinalEncoder

from sksurv.datasets import load_gbsg2
from sksurv.preprocessing import OneHotEncoder
from sksurv.ensemble import RandomSurvivalForest

X, y = load_gbsg2()
grade_str = X.loc[:, "tgrade"].astype(object).values[:, np.newaxis]
grade_num = OrdinalEncoder(categories=[["I", "II", "III"]]).fit_transform(grade_str)
X_no_grade = X.drop("tgrade", axis=1)
Xt = OneHotEncoder().fit_transform(X_no_grade)
Xt.loc[:, "tgrade"] = grade_num
random_state = 20
X_train, X_test, y_train, y_test = train_test_split(Xt, y, test_size=0.25, random_state=random_state)

# Weight samples with an observed event 100x more than censored samples.
sw = np.where(y_train["cens"] == 1, 100, 1).astype(float)
rsf = RandomSurvivalForest(
    n_estimators=1000,
    min_samples_split=10,
    min_samples_leaf=15,
    n_jobs=-1,
    random_state=random_state,
)
rsf.fit(X_train, y_train, sample_weight=sw)
# Concordance index on the held-out test set.
rsf.score(X_test, y_test)

Expected Results

The model score should change when sw changes (for instance, 1 -> 100).

Actual Results

[image]
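A single test-set score can hide small differences, so a more sensitive check is to compare the raw predictions of two fits that differ only in their sample weights. The following is a minimal sketch of that idea, not part of the original report: the helper name `sample_weight_has_effect` and the `make_estimator` factory argument are hypothetical, and the helper only assumes an estimator whose `fit` accepts `sample_weight` and returns the fitted estimator.

```python
import numpy as np


def sample_weight_has_effect(make_estimator, X_train, y_train, X_test,
                             weights_a, weights_b):
    """Return True if two fits that differ only in sample_weight
    produce different predictions on X_test.

    make_estimator is a hypothetical zero-argument factory that returns
    a fresh, unfitted estimator. Use a fixed random_state inside it so
    that any difference can be attributed to the weights alone.
    """
    pred_a = make_estimator().fit(
        X_train, y_train, sample_weight=weights_a
    ).predict(X_test)
    pred_b = make_estimator().fit(
        X_train, y_train, sample_weight=weights_b
    ).predict(X_test)
    return not np.allclose(pred_a, pred_b)


# Hypothetical usage with the variables from the report above:
#
#   make_rsf = lambda: RandomSurvivalForest(
#       n_estimators=100, min_samples_split=10, min_samples_leaf=15,
#       random_state=random_state,
#   )
#   uniform = np.ones(len(y_train))
#   sample_weight_has_effect(make_rsf, X_train, y_train, X_test, uniform, sw)
```

If the helper returns False for clearly different weight vectors, the weights are most likely being ignored during fitting, which is the behavior described in this report.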

Versions


scikit-survival   : 0.22.2
scikit-learn      : 1.3.0
numpy             : 1.24.3
scipy             : 1.10.1
pandas            : 1.5.3
numexpr           : 2.8.4
ecos              : 2.0.14
osqp              : 0.6.7.post0
joblib            : 1.4.2
matplotlib        : 3.7.2
pytest            : None
sphinx            : None
Cython            : None
pip               : 24.0
setuptools        : 69.5.1

sebp commented 4 months ago

This has been fixed in version 0.23.0.

See https://github.com/sebp/scikit-survival/issues/464
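To see whether an environment already includes the fix, the installed version can be compared against 0.23.0, the release named in the reply above. The sketch below is one way to do that with the standard library only. The helper names are hypothetical, and the parsing is deliberately simple: it reads only the leading numeric components, so a pre-release such as `0.23.0rc1` would be treated like `0.23.0`.

```python
import re
from importlib import metadata

# Release named in the maintainer's reply as containing the fix.
FIXED_IN = (0, 23, 0)


def release_tuple(version):
    """Leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def has_sample_weight_fix(version):
    """True if the version string is at least FIXED_IN."""
    parts = release_tuple(version)
    # Pad short versions such as '0.23' so tuples compare component-wise.
    parts = parts + (0,) * (len(FIXED_IN) - len(parts))
    return parts >= FIXED_IN


def installed_version_has_fix(dist_name="scikit-survival"):
    """Check the installed distribution; None if it is not installed."""
    try:
        return has_sample_weight_fix(metadata.version(dist_name))
    except metadata.PackageNotFoundError:
        return None
```

With the version listed in the report, `has_sample_weight_fix("0.22.2")` is False, so upgrading scikit-survival is the first thing to try.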