General Inquiry - Githubissues

fabsig / GPBoost

Combining tree-boosting with Gaussian process and mixed effects models

Hi Fabio

I am currently very impressed with this lovely package you have created.

I have a few questions about running predictions with the following code:

gp_modelNestedBT = gpb.GPModel(group_data=group_data_train_BT, likelihood="binary")
gp_modelNestedBT.fit(y=y_train_df['Target'],
                     X=X_train2.drop(['patientID', 'btType'], axis=1),
                     params={'std_dev': True, "trace": "True", "optimizer_cov": "gradient_descent"})

pred_resp_BT = gp_modelNestedBT.predict(X_pred=X_test2.drop(['patientID', 'btType'], axis=1),
                                        group_data_pred=group_data_test_BT,
                                        predict_var=True,
                                        predict_response=True)

1) The predictions of mean and variance have the same length as the index of X_test2. Can I further assume that the indices are the same (i.e. X_test2.index == pred_resp_BT.index), or are there any mix-ups?

2) When I try to pass nthread to fit(...) or to GPModel(...), it is not accepted. Is that normal?

3) I would also like to ask (perhaps this has been answered elsewhere): are the mean and variance per row the predicted response and the uncertainty of that particular response? If so, can I create a very basic confidence interval for that row, or would I have to bootstrap to create confidence intervals for the prediction?

Thank you for your time

Thank you for your interest in GPBoost!

The predictions of mean and variance have the same length as the index of X_test2. Can I further assume that the indices are the same (i.e. X_test2.index == pred_resp_BT.index), or are there any mix-ups?

Yes, every row in X_pred corresponds to the same index in the predictions (and the order is not mixed up).
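Since the order is preserved, the original index can be reattached to the predictions by position. The following is a minimal sketch, assuming the prediction result is a dict holding plain NumPy arrays under 'mu' and 'var' (so it has no index of its own); the data values and the column names pred_mean and pred_var are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins: a test frame with a non-default index, and a dict
# shaped like the assumed predict() output ('mu' and 'var' as NumPy arrays).
X_test2 = pd.DataFrame({"x1": [0.2, 1.5, -0.7]}, index=[101, 205, 309])
pred_resp_BT = {
    "mu": np.array([0.81, 0.35, 0.10]),
    "var": np.array([0.15, 0.23, 0.09]),
}

# Row i of the predictions corresponds to row i of X_pred, so the index of
# the test frame can be attached positionally.
pred_df = pd.DataFrame(
    {"pred_mean": pred_resp_BT["mu"], "pred_var": pred_resp_BT["var"]},
    index=X_test2.index,
)
print(pred_df)
```

With this, pred_df.loc[205] gives the prediction for the row that has index 205 in X_test2.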

When I try to pass nthread to fit(...) or to GPModel(...), it is not accepted. Is that normal?

Neither function has an nthread argument. GPBoost parallelizes with OpenMP (OMP) and simply uses all available threads.
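If the thread count needs to be capped anyway, one possible workaround is the standard OpenMP environment variable OMP_NUM_THREADS. This is only a sketch: it assumes GPBoost's OpenMP runtime honors that variable, which is not confirmed in this thread, and the value 4 is arbitrary.

```python
import os

# Assumption: the cap has to be in place before the OpenMP runtime starts,
# so the variable is set before the library is imported.
os.environ["OMP_NUM_THREADS"] = "4"  # hypothetical cap of 4 threads

# import gpboost as gpb  # import only after the variable is set
```

Setting the variable in the shell before launching Python would have the same effect under the same assumption.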

I would also like to ask (perhaps this has been answered elsewhere): are the mean and variance per row the predicted response and the uncertainty of that particular response? If so, can I create a very basic confidence interval for that row, or would I have to bootstrap to create confidence intervals for the prediction?

Yes, exactly.
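The "very basic" interval from the question could be sketched as mean plus or minus z times the square root of the variance. This assumes a normal approximation, which can be crude for a binary response and may produce bounds outside [0, 1]. The helper name normal_interval and the input values are hypothetical.

```python
import math


def normal_interval(mean, var, z=1.96):
    """Return (lower, upper) for mean +/- z * sqrt(var).

    z=1.96 corresponds to roughly 95% coverage under a normal approximation.
    """
    half_width = z * math.sqrt(var)
    return mean - half_width, mean + half_width


# Hypothetical predicted mean and variance for a single row.
lower, upper = normal_interval(mean=0.5, var=0.01)
print(lower, upper)
```

Applied row by row to the predicted means and variances, this gives one interval per test observation without any bootstrapping.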

General Inquiry #139