-
Currently the website shows no information on nominal attributes in a dataset with a numeric target.
See: https://www.openml.org/d/41022
Season, Series, ... have no information on their distributio…
-
I have created a custom GroupContext extension to store some application-specific data. While building out the welcome flow, I noticed that the extension isn't present in the `VerifiableGroupInfo.exten…
-
Suppose I want an OpenML task with train/test splits that I have created myself according to some criteria. How would I go about doing that? The [relevant REST-API reference page](https://www.…
mb706 updated
5 years ago
-
### Issue Description
The background dataset always gets subsampled to 100 samples.
### Minimal Reproducible Example
```python
from sklearn.datasets import fetch_openml
import shap
import n…
```
-
I am not sure how our metafeature names were chosen. Perhaps they are based on the R script from which this project stemmed. Perhaps we should consider the literature and the usage by OpenML and D3M t…
-
#### Description
For a small set of flows, the `predictions.arff` files of some runs contain faulty entries. In these entries, the prediction does not correspond to the class with the highest confide…
-
Hi,
We need a feature selection method that is not manual. Gisele recommended this one:
https://spark.apache.org/docs/2.2.0/ml-features.html#chisqselector
The issue (mentioned by @waltersf) is…
-
It would be nice to have a function that extracts the hyperparameters for several run ids at once. The current way is to call getOMLRun() for each run id individually, which is really slow.
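
A minimal sketch of what such a helper could look like, written in Python rather than R and under stated assumptions: `collect_hyperparameters` and `fetch_run` are hypothetical names, and `fetch_run` stands in for whatever single-run lookup is available (it is not an OpenML function). The sketch only shows the idea of issuing the per-run requests concurrently and returning one row per run.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def collect_hyperparameters(
    run_ids: Iterable[int],
    fetch_run: Callable[[int], Dict[str, object]],
    max_workers: int = 8,
) -> List[Dict[str, object]]:
    """Fetch hyperparameters for many runs concurrently.

    `fetch_run` is a hypothetical callable (an assumption, not an OpenML API)
    that takes one run id and returns that run's hyperparameters as a dict.
    """
    ids = list(run_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as `ids`.
        settings = list(pool.map(fetch_run, ids))
    # One row per run, with the run id added to each row.
    return [{"run_id": rid, **params} for rid, params in zip(ids, settings)]
```

Threads suit this case because each lookup is expected to spend most of its time waiting on the network. A server-side batch endpoint, if one exists, would likely be faster still.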
-
Research whether categorical_encoding should be a parameter that is optimized in AutoML. I have found that sometimes setting this to a value other than AUTO improves results: https://github.com/h2oai/h2o-…
-
For multi-class classification tasks, measures such as _mean_weighted_area_under_roc_curve_, _mean_precision_, and _mean_weighted_precision_ are not computed. Instead, _area_under_roc_curve_ and _precisi…