hrkadkhodaei opened 2 years ago
I get the same error as above when fitting a 430k-row dataset with 31 columns, but the same dataset scaled down to 43k rows works.
I got the same error. I tried setting the `natural_gradient` parameter to `False` to skip the line `grad = np.linalg.solve(fisher_matrix, grad)` that causes it, but then these warnings appear:
```
C:\Users\scyperski\Anaconda3\envs\cost_prediction\lib\site-packages\xgboost_distribution\distributions\normal.py:65: RuntimeWarning: divide by zero encountered in divide
  grad[:, 0] = (loc - y) / var
C:\Users\scyperski\Anaconda3\envs\cost_prediction\lib\site-packages\xgboost_distribution\distributions\normal.py:66: RuntimeWarning: divide by zero encountered in divide
  grad[:, 1] = 1 - ((y - loc) ** 2) / var
C:\Users\scyperski\Anaconda3\envs\cost_prediction\lib\site-packages\xgboost_distribution\distributions\normal.py:78: RuntimeWarning: divide by zero encountered in divide
  hess[:, 0] = 1 / var
C:\Users\scyperski\Anaconda3\envs\cost_prediction\lib\site-packages\xgboost_distribution\distributions\normal.py:79: RuntimeWarning: divide by zero encountered in divide
  hess[:, 1] = 2 * ((y - loc) ** 2) / var
```
As a result, the predictions are full of NaNs. It seems to be caused by the `log_scale` array (in the `gradient_and_hessian` method): its elements are so negative that the variance underflows to 0 after:

```python
var = np.exp(2 * log_scale)
```
As a workaround I added this line before calculating the exponential:

```python
log_scale = np.clip(log_scale, -20, 20)
```

So far it works, even with the `natural_gradient` parameter set to `True`.
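The failure mode and the clipping workaround described above can be sketched in isolation (a minimal illustration with made-up values, not the library's actual code; the -20/20 bounds are the ones from the workaround):

```python
import numpy as np

# When the model drives log_scale very negative, exp(2 * log_scale)
# underflows to exactly 0.0; every subsequent division by var then
# produces inf/NaN, which is what the RuntimeWarnings report.
log_scale = np.array([-400.0, 0.0, 1.0])
var = np.exp(2 * log_scale)
print(var[0])  # 0.0 -- silent underflow

# Workaround: bound log_scale before exponentiating, so var stays
# strictly positive (and finite at the upper end).
safe_var = np.exp(2 * np.clip(log_scale, -20, 20))
print(np.all(safe_var > 0))  # True
```

Note that clipping only guards the gradient computation; it does not change what the model learns for well-behaved rows, since their `log_scale` values fall inside the clip range anyway.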
Hi, thanks for raising / debugging this. Does anyone have an example dataset / method of fitting where this happens?
I got permission from my workplace to share a sample dataset after anonymization. I also prepared a minimal code snippet to reproduce the problem. Please contact me at szymoncyperski@gmail.com (the dataset is quite heavy).
Thanks, appreciate this. I've got a slight preference for finding a public dataset, just so it's easier to add to the test suite, so I'll have a look at this first and get back to you if I can't reproduce.
Okay, I was able to reproduce the error with some datasets and merged a fix (#86), which is available in the latest release (`xgboost-distribution==0.2.7`). However, depending on the data, there could still be issues here, so please let me know if this error still occurs.
Still got the same issue with `negative-binomial`. If this is still being maintained, let me know and I'll put an MRE together.
I've similarly found that the size of the dataset makes a difference: up to about 40k rows is fine; above that, the error occurs. It doesn't seem related to the contents of the dataset (e.g. for a 1M-row dataset, all the 40k-row chunks are independently fine, but passed in together they cause the error).
Yes, it is still maintained. Do you have any details on the error you're seeing (or data for a reproducible example)? The above was related to numeric overflow errors, so if that's the issue, it may just need safer limits for `negative-binomial`.
When I run the following code snippet I get the error `numpy.linalg.LinAlgError: Singular matrix`:

```python
X_train, y_train, X_test, y_test = read_data(InEx)
model = XGBDistribution(distribution="normal", n_estimators=500)
model.fit(X_train, y_train, eval_set=[(X_test, y_test)], early_stopping_rounds=10)
```
The full error:

```
D:\Python37\lib\site-packages\xgboost_distribution\distributions\normal.py:89: RuntimeWarning: overflow encountered in exp
D:\Python37\lib\site-packages\xgboost_distribution\distributions\normal.py:61: RuntimeWarning: overflow encountered in exp
Traceback (most recent call last):
  File "D:\Python37\lib\contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "D:\Python37\lib\site-packages\xgboost\config.py", line 140, in config_context
    yield
  File "D:\Python37\lib\site-packages\xgboost_distribution\model.py", line 181, in fit
    callbacks=callbacks,
  File "D:\Python37\lib\site-packages\xgboost\training.py", line 196, in train
    early_stopping_rounds=early_stopping_rounds)
  File "D:\Python37\lib\site-packages\xgboost\training.py", line 81, in _train_internal
    bst.update(dtrain, i, obj)
  File "D:\Python37\lib\site-packages\xgboost\core.py", line 1685, in update
    grad, hess = fobj(pred, dtrain)
  File "D:\Python37\lib\site-packages\xgboost_distribution\model.py", line 254, in obj
    y=y, params=params, natural_gradient=self.natural_gradient
  File "D:\Python37\lib\site-packages\xgboost_distribution\distributions\normal.py", line 72, in gradient_and_hessian
    grad = np.linalg.solve(fisher_matrix, grad)
  File "<__array_function__ internals>", line 6, in solve
  File "D:\Python37\lib\site-packages\numpy\linalg\linalg.py", line 394, in solve
    r = gufunc(a, b, signature=signature, extobj=extobj)
  File "D:\Python37\lib\site-packages\numpy\linalg\linalg.py", line 88, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
numpy.linalg.LinAlgError: Singular matrix
```

Process finished with exit code 1

The training and test data contain 13 float features (X) and 1 integer target (y).
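For what it's worth, the `Singular matrix` error is consistent with the overflow warnings earlier in the traceback: once the variance overflows to `inf`, entries like `1 / var` collapse to zero and a row's Fisher matrix becomes singular, so `np.linalg.solve` raises. A minimal reproduction of just that numpy behaviour, with hypothetical values (the 2x2 shape and `1/var`, `2/var` entries mirror the diagonal terms quoted in the warnings above, not the library's exact code path):

```python
import numpy as np

# If var overflows to inf, 1/var and 2/var underflow to 0.0,
# and the per-row 2x2 Fisher matrix becomes all zeros, i.e. singular.
var = np.inf
fisher_matrix = np.array([[1.0 / var, 0.0],
                          [0.0, 2.0 / var]])  # == [[0, 0], [0, 0]]
grad = np.array([1.0, 1.0])

try:
    np.linalg.solve(fisher_matrix, grad)
except np.linalg.LinAlgError as e:
    print(e)  # Singular matrix
```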