-
```
import os
import numpy as np
import sklearn
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.preprocessing import LabelEncoder
from sklearn.utils.multiclass import un…
```
-
## Description
If the training dataset was constructed with `free_raw_data=True`, it can be used only once. Trying to continue training (using the `init_model` parameter) leads to an error:
…
-
https://github.com/apache/arrow/pull/37797 added a Python layer for working with the C Data Interface through capsules and defined dunder methods, described at https://arrow.apache.org/docs/dev/format…
-
Getting this warning when training my model:
```
DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all…
```
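For context, pandas emits this warning when columns are added to a frame one at a time, and the usual remedy is to build the new columns first and attach them in a single `pd.concat(axis=1)` call. A minimal sketch of that pattern follows; the frame, the column names (`id`, `feat_0`, ...) and the sizes are made up for illustration and are not taken from the original report.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical starting frame; the real training data is not shown in the report.
base = pd.DataFrame({"id": np.arange(5)})

# Build all new columns up front instead of assigning them one by one,
# since repeated single-column inserts are what fragment the frame.
new_cols = {f"feat_{i}": rng.normal(size=len(base)) for i in range(200)}

# Attach every new column in one operation.
wide = pd.concat([base, pd.DataFrame(new_cols, index=base.index)], axis=1)

# For a frame that is already fragmented, copy() returns a consolidated frame.
wide = wide.copy()
```

Whether this applies here depends on where the columns are being added (user code or a library call), which the excerpt does not show.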
-
MLJ recently added a new API through which implementations can use their own optimised resampling methods, if available, rather than slicing into the tabular data, which can permit performance enhance…
-
Hi,
The Waterfall plot example from git does not work with RandomForestRegressor and throws this error:
`Exception: waterfall_plot requires a scalar base_values of the model output as the first …
-
TL;DR: The original list of yet-to-be-implemented FE algorithms is in https://github.com/parrt/random-forest-importances/issues/54
Seeing https://github.com/interpretml/interpret/issues/364 and https://g…
-
Hi all,
I am running the following config on Databricks:
Scala: 2.12
Spark: 3.1.2
I installed the following jars for mmlspark in Databricks:
- mmlspark-1.0.0-rc4.jar
- mmlspark-core-1.0.0-r…
-
**Describe the bug**
We are using the SynapseML package to apply the LightGBM algorithm in a Spark application running in a k8s environment. We have observed that even after the com…
-
Hello! I'm facing a situation where I have 20 million rows of data and 30,000 features participating in training, but I only have 50 trees. In this case, I've noticed that the training process is very…