Closed samyip123 closed 4 years ago
If I understand correctly, you are trying to pass a nested set of arrays?
This is not a tested feature, so I can't say for sure what's going on without a minimal example to debug. If you can create a simple toy example that replicates the bug, I'll take a look.
One option in the meantime is to simply flatten the data structure into 2-D arrays and use column selection to get the right data. It's no harder than this:
from mlens.preprocessing import Subset

class CustomSubset:
    """Select columns from both x and y."""

    def __init__(self, x_cols, y_cols):
        self.x_trans = Subset(x_cols)
        self.y_trans = Subset(y_cols)

    def fit(self, x, y):
        # Nothing to learn; column selection is stateless.
        return self

    def transform(self, x, y):
        return self.x_trans.transform(x), self.y_trans.transform(y)

    def fit_transform(self, x, y):
        return self.transform(x, y)

pipes = {"pipe-1": [CustomSubset([0, 3], [0, 8])], ...}
ests = {"pipe-1": [est_1, est_2, ...], ...}
ens.add(estimators=ests, preprocessing=pipes)
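The flattening step itself can be sketched with NumPy alone. This is only an illustration under assumptions: the helper name `flatten_groups` and the toy shapes (groups of 3 and 2 rows, 3 feature columns, 8 target columns) are hypothetical and chosen to mirror the shapes in the question, not taken from mlens.

```python
import numpy as np

def flatten_groups(groups):
    """Stack a sequence of (n_i, k) arrays into one (sum(n_i), k) array."""
    return np.vstack([np.asarray(g) for g in groups])

# Toy data (assumed shapes): two groups with 3 and 2 rows each.
x_groups = [np.zeros((3, 3)), np.ones((2, 3))]
y_groups = [np.zeros((3, 8)), np.ones((2, 8))]

X = flatten_groups(x_groups)  # regular 2-D array, shape (5, 3)
y = flatten_groups(y_groups)  # regular 2-D array, shape (5, 8)

# Plain column selection now works on the flattened arrays.
X_sub = X[:, [0, 2]]   # keep feature columns 0 and 2
y_first = y[:, 0]      # a single target column, 1-D
```

Once the data is a regular 2-D array, a column-selecting transformer such as the one above has well-defined columns to pick from.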
Closing due to inactivity.
I am trying to fit the ensemble on data with the following structure, where each instance is an ndarray:
X_Dataset : ndarray( [ndarray([3 columns],[3 columns],[3 columns]), ndarray([3 columns],[3 columns]),...])
y_Dataset : ndarray( [ndarray([8 columns],[8 columns],[8 columns]), ndarray([8 columns],[8 columns]),...])
Yet when I fit the model, I get an error when it tries to sort the y dataset:
File "/mlens/parallel/base.py", line 181, in _setup_2multiplier
    self.classes = y
File "/mlens/parallel/base.py", line 202, in classes_
    self._classes = np.unique(y).shape[0]
ValueError: operands could not be broadcast together with shapes (3,8) (2,8)
Is there any requirement that the y dataset be 1-D? Thanks.
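For anyone trying to reproduce this outside mlens, the traceback appears consistent with `np.unique` being called on a ragged object array. The sketch below is an assumption about the cause, not mlens code: the helper `count_unique` and the toy arrays are hypothetical, and whether the ragged call fails may depend on the NumPy version.

```python
import numpy as np

def count_unique(y):
    """Return the number of unique values in y, or None if np.unique fails."""
    try:
        return np.unique(y).shape[0]
    except Exception as exc:  # e.g. a broadcasting ValueError on ragged input
        print("np.unique failed:", exc)
        return None

# Ragged target (assumed shapes): an object array holding a (3, 8) and a (2, 8) block.
ragged = np.empty(2, dtype=object)
ragged[0] = np.zeros((3, 8))
ragged[1] = np.ones((2, 8))

# The same data stacked into one regular 2-D array of shape (5, 8).
stacked = np.vstack(list(ragged))

ragged_count = count_unique(ragged)    # may be None if sorting the blocks fails
stacked_count = count_unique(stacked)  # well defined on a regular array
```

Sorting the object array means comparing whole blocks of different shapes against each other, which is where a broadcasting error like the one above would come from.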