janvanrijn closed 6 years ago
Here's some Python code:
import sklearn.metrics
import sklearn.preprocessing
import sklearn.pipeline
import sklearn.ensemble
import openml

# Fetch the task and its full dataset
task = openml.tasks.get_task(146800)
X, y = task.get_X_and_y()

# Evaluate on each of the 10 folds defined by the task
for i in range(10):
    train_indices, test_indices = task.get_train_test_split_indices(fold=i)
    X_train = X[train_indices]
    y_train = y[train_indices]
    X_test = X[test_indices]
    y_test = y[test_indices]

    # Note: Imputer was removed in scikit-learn 0.22;
    # newer versions provide sklearn.impute.SimpleImputer instead.
    preproc = sklearn.preprocessing.Imputer()
    tree = sklearn.ensemble.RandomForestClassifier(n_estimators=512)
    pipeline = sklearn.pipeline.Pipeline([
        ('imputer', preproc), ('tree', tree),
    ])
    pipeline.fit(X_train, y_train)
    print(sklearn.metrics.accuracy_score(y_test, pipeline.predict(X_test)))
However, at the moment I cannot find a classifier that reaches 100% accuracy. Too bad that William does not provide hyperparameters.
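For anyone repeating this check with other classifiers, it may help to collect the per-fold accuracies rather than only printing them, so it is easy to see whether any fold gets to 1.0. Below is a minimal sketch using only the standard library. `summarize_folds` and `score_fold` are hypothetical names for illustration and are not part of openml-python or scikit-learn; `score_fold(i)` is assumed to fit a model on fold `i` and return its test accuracy.

```python
from statistics import mean, stdev
from typing import Callable, Dict, List


def summarize_folds(score_fold: Callable[[int], float],
                    n_folds: int = 10) -> Dict[str, float]:
    """Call score_fold(i) for every fold and summarise the accuracies.

    score_fold is a hypothetical callback: it is assumed to train on
    fold i and return the accuracy on that fold's test split.
    """
    scores: List[float] = [score_fold(i) for i in range(n_folds)]
    return {
        "mean": mean(scores),
        # stdev needs at least two values
        "std": stdev(scores) if len(scores) > 1 else 0.0,
        "min": min(scores),
        "max": max(scores),
    }
```

With the loop body above wrapped in a function that returns the accuracy for fold `i`, `summarize_folds(that_function)` would give the mean, spread, and best fold in one dictionary.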
Judging by flow names, @WilliamRaynaut does not use the vanilla Weka converter.
Closed according to Skype call 15/3/18 (@frank-hutter @giuseppec @mfeurer @janvanrijn)
As raised by @joaquinvanschoren
https://www.openml.org/t/146800