When attempting to use a PiecewiseRegressor (from the piecewise_estimator module), predict consistently raises an error when the data are in the standard (n_samples, n_features) sklearn format. The boolean indexing on line 296 of the _apply_predict_method function appears to be causing this issue.
The following code reproduces the problem:
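import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeRegressor
from mlinsights.mlmodel import PiecewiseRegressor

# X_train, y_train and X_test are 2-D arrays; their shapes are listed in this report.
model = PiecewiseRegressor(verbose=True,
                           binner=DecisionTreeRegressor(min_samples_leaf=300))
model.fit(X_train, y_train)
vvc_predict = model.predict(X_test)  # the TypeError is raised here

plot_customer(customer1)  # user-defined plotting helper, not shown
plt.plot(X_test, vvc_predict, 'g.', label='VVC_predict', alpha=0.2)
plt.legend()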
This yields the following output and error:
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 60 out of 60 | elapsed: 0.0s finished
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-13-63d7c4526d72> in <module>
6
7 model.fit(X_train,y_train)
----> 8 vvc_predict = model.predict(X_test)
9
10 plot_customer(customer1)
~\anaconda3\lib\site-packages\mlinsights\mlmodel\piecewise_estimator.py in predict(self, X)
350 :return: predictions
351 """
--> 352 return self._apply_predict_method(
353 X, "predict", _predict_piecewise_estimator, self.dim_)
354
~\anaconda3\lib\site-packages\mlinsights\mlmodel\piecewise_estimator.py in _apply_predict_method(self, X, method, parallelized, dimout)
294 if ind is None:
295 continue
--> 296 pred[ind] = p
297 indall = numpy.logical_or(indall, ind) # pylint: disable=E1111
298
TypeError: NumPy boolean array indexing assignment requires a 0 or 1-dimensional input, input has 2 dimensions
Judging by the TypeError, NumPy requires a 0- or 1-dimensional value when assigning through a boolean index. However, reshaping the data to one dimension is incompatible with the mlinsights library.
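The assignment on line 296 of the traceback can be imitated with plain NumPy. The sketch below is only an illustration: the names pred, ind and p mirror the traceback, but the values are made up, and it assumes that pred is a 1-D buffer while the per-bucket prediction p arrives as a column of shape (k, 1). Under that assumption the 2-D value is rejected, whereas a 1-D value of the same length is accepted.

```python
import numpy as np

# Stand-ins for the variables on line 296 (values are invented for illustration).
pred = np.zeros(5)                                  # 1-D output buffer
ind = np.array([True, False, True, False, True])    # rows belonging to one bucket
p_column = np.ones((3, 1))                          # 2-D prediction, shape (3, 1)
p_flat = np.ones(3)                                 # 1-D prediction, shape (3,)

error = None
try:
    # Boolean-mask assignment into a 1-D array with a 2-D value.
    pred[ind] = p_column
except (TypeError, ValueError) as exc:
    error = exc

print(type(error).__name__, error)

# The same assignment with a 1-D value of matching length succeeds.
pred[ind] = p_flat
print(pred)
```

If this is what happens inside the library, the shape of p rather than the shape of X would be the thing to look at.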
For reference, the data have the following shapes:
print(X_train.shape, y_train.shape)
print(X_test.shape)
(23476, 1) (23476, 1)
(11564, 1)
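Note that y_train has shape (23476, 1), a column rather than a flat vector. A linear model fitted on a column target typically returns column-shaped predictions, which may be where the 2-D value in the traceback comes from. The sketch below shows the effect with ordinary least squares in plain NumPy; fit_predict is a hypothetical stand-in, not mlinsights code, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.uniform(size=(200, 1))    # (n_samples, n_features)
y_train = 3.0 * X_train + 0.5           # column target, shape (200, 1)


def fit_predict(X, y):
    """Least squares with an intercept (hypothetical stand-in for a linear model)."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef


pred_column = fit_predict(X_train, y_train)          # target kept as a column
pred_flat = fit_predict(X_train, y_train.ravel())    # target flattened to 1-D

# The prediction inherits the dimensionality of the target.
print(pred_column.shape, pred_flat.shape)
```

If the same thing happens here, passing y_train.ravel() to fit might be enough to avoid the error without reshaping X. This has not been checked against mlinsights itself.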
I have attempted to solve this using a mask in https://github.com/sdpython/mlinsights/pull/94, which lets me use PiecewiseRegressor successfully, but based on the checks it seems my contribution isn't correct.