Closed flippercy closed 3 years ago
Hi,
Thanks for the kind words, much appreciated!
Note 1) `max_nodes = max_leaves - 1`.
Note 2) For prediction, the arrays `bins` ([n_features x n_bins], float32) and the initial prediction (`yhat_0`, a float32 scalar) are also required. These are saved too when calling `model.save`.
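As a quick sanity check of Note 1, a minimal sketch (the function name is hypothetical, not part of PGBM's API):

```python
def max_nodes_for(max_leaves: int) -> int:
    # A full binary tree with L leaves has L - 1 internal (split) nodes.
    return max_leaves - 1

print(max_nodes_for(8))  # 7
```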
Hope this helps! I'll have a look at implementing 1) in the coming days.
Kind regards,
Olivier
@elephaint Thank you for the quick turnaround! Looking forward to the new features in the upgrade.
Hi,
You can use the `predict_dist` function. If you specify `yhat_dist = predict_dist(X_test, output_sample_statistics=True)`, the function will return a tuple `(forecasts, mean, variance)`, with the latter two being the learned mean and variance per sample, which can subsequently be used to specify a distribution of your choice. Hope this helps, let me know,
Olivier
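To illustrate the idea above, a hedged sketch: the forecasts array is fabricated here as a stand-in for what `model.predict_dist(X_test, output_sample_statistics=True)` would return, and the choice of a normal distribution is just an example.

```python
import numpy as np

# Fabricated per-sample forecasts standing in for PGBM's output,
# shaped [n_forecasts x n_samples].
rng = np.random.default_rng(0)
forecasts = rng.normal(size=(100, 5))
mean = forecasts.mean(axis=0)       # learned mean per sample
variance = forecasts.var(axis=0)    # learned variance per sample

# Use the statistics to parameterize a distribution of your choice,
# e.g. a normal, and draw fresh samples per test point.
samples = rng.normal(loc=mean, scale=np.sqrt(variance), size=(1000, 5))
print(samples.shape)  # (1000, 5)
```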
Thank you Olivier. Appreciate your help!
Great, happy to help!
Hi @elephaint:
Have the upgrades been implemented yet? I've upgraded my library to 1.0 but have seen no change.
```
yhat_dist_pgbm = model.predict_dist(data_val_B_X, n_forecasts=100, output_sample_statistics=True)
TypeError: predict_dist() got an unexpected keyword argument 'output_sample_statistics'

print(inspect.getargspec(model.predict_dist))
ArgSpec(args=['self', 'X', 'n_forecasts', 'parallel'], varargs=None, keywords=None, defaults=(100, True))

print(PGBM.__version__)
1
```
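As an aside, `inspect.getargspec` is deprecated; `inspect.signature` is the modern way to check whether a function accepts a keyword. A hedged sketch with a stand-in function (not PGBM's real method):

```python
import inspect

# Stand-in for pgbm's predict_dist, just to demonstrate the check;
# the parameter list here is assumed, not taken from the library.
def predict_dist(self, X, n_forecasts=100, output_sample_statistics=False, parallel=True):
    pass

sig = inspect.signature(predict_dist)
print("output_sample_statistics" in sig.parameters)  # True
```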
Thank you!
Hi,
How unfortunate, that is very strange. The `predict_dist` function should support an `output_sample_statistics` keyword argument; it is available in both the PyTorch and Numba versions (I've rechecked the source code and it should be in there...).
If you run `pip list` in the (virtual) Python environment where you installed PGBM, what version of PGBM is listed?
Did you make sure to force the upgrade, for example with `pip install pgbm --force-reinstall`?
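A hedged sketch of another way to check this, from inside the interpreter rather than via `pip list` (the helper name is hypothetical):

```python
import sys
from importlib import metadata

def pgbm_version() -> str:
    # Report the version the *running* interpreter sees, or a message
    # if the package is absent from this environment.
    try:
        return metadata.version("pgbm")
    except metadata.PackageNotFoundError:
        return "pgbm not installed in this environment"

print(sys.executable)   # which Python is actually running
print(pgbm_version())
```

If `sys.executable` points at a different interpreter than the one `pip` installed into, that would explain the mismatch.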
Thank you for the quick turnaround. That's what I got after running `pip install pgbm --force-reinstall`:

So it turns out the new argument is still missing.
Strange and frustrating! I've (i) (re)installed from PyPI in a new virtual environment, (ii) installed on a different PC with a new Python virtual environment, and (iii) run in Google Colab, and I still can't reproduce the issue. I've tested on Windows 10 (desktop), macOS (MacBook Pro), and Linux (Colab), for both the Numba and Torch versions.
Can you run this example in Google Colab? In that example, you should be able to call the `predict_dist` function with the `output_sample_statistics` argument, i.e. `output = model.predict_dist(X, output_sample_statistics=True)`.

I'm trying to think of what could go wrong... are you certain `!pip list` is executed from the same (virtual) environment as the one in which you execute PGBM? What kind of (Python) setup are you running? It feels as if a stale cached version is left somewhere in the environment and is what the code picks up when executing the `inspect` calls.
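One hedged way to test the stale-copy hypothesis is to check which file Python actually imported (numpy stands in here for pgbm, so the snippet stays self-contained):

```python
# If a cached copy is being picked up, __file__ will point at the
# unexpected location and __version__ at the old build.
import numpy

print(numpy.__file__)      # path of the module the interpreter loaded
print(numpy.__version__)   # version of that loaded copy
```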
It is weird. I reinstalled JupyterHub on my Linux server and it works now. Not sure what happened.
Thank you very much for your help!
Great! Python's package manager is a mystery sometimes....
Hi:
Thank you for the awesome library! I did some tests with it and have a few questions:
How can I pull the parameters, such as the mean and standard deviation, of the final fitted distribution for each leaf? Such information is extremely helpful when the result is presented and explained to stakeholders. Currently the model just returns numbers sampled from the distribution, but business users are likely to focus on the distribution itself.
Is there any way to dump the model's tree structure to a data frame, like what `get_dump()` does for xgboost?
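For context on question 2, a hypothetical sketch of what "tree structure to a data frame" could mean; the nested-dict tree here is invented for illustration and is not PGBM's actual internal format:

```python
import pandas as pd

# Invented example tree; PGBM stores its trees differently.
tree = {
    "id": 0, "feature": "x1", "threshold": 0.5,
    "left": {"id": 1, "leaf_value": -0.2},
    "right": {"id": 2, "leaf_value": 0.3},
}

def flatten(node, rows=None):
    # Depth-first walk collecting one row per node, dropping child links.
    rows = [] if rows is None else rows
    rows.append({k: v for k, v in node.items() if k not in ("left", "right")})
    for child in ("left", "right"):
        if child in node:
            flatten(node[child], rows)
    return rows

df = pd.DataFrame(flatten(tree))
print(df.shape)  # (3, 4): one row per node, columns id/feature/threshold/leaf_value
```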
Thank you!