-
In the documentation of KFold (with shuffle=True) and ShuffleSplit as CV generators, it's not really clear what the differences are.
In ShuffleSplit one is able to choose the number of splits and …
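
A minimal sketch of one difference that is easy to check, assuming scikit-learn and NumPy are installed. The toy array `X`, the split counts, and `random_state=0` are arbitrary choices for illustration and do not come from the issue. It counts how often each sample lands in a test set: `KFold` partitions the data, so every sample is tested exactly once, while `ShuffleSplit` draws each split independently, so a sample may be tested several times or never.

```python
import numpy as np
from sklearn.model_selection import KFold, ShuffleSplit

# Toy data (illustrative assumption): 10 samples, one feature.
X = np.arange(10).reshape(-1, 1)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
ss = ShuffleSplit(n_splits=5, test_size=0.2, random_state=0)

# Collect the test indices produced by each splitter.
kf_test = [test for _, test in kf.split(X)]
ss_test = [test for _, test in ss.split(X)]

# Count how many times each sample index is used for testing.
kf_counts = np.bincount(np.concatenate(kf_test), minlength=len(X))
ss_counts = np.bincount(np.concatenate(ss_test), minlength=len(X))

print("KFold test counts per sample:       ", kf_counts)
print("ShuffleSplit test counts per sample:", ss_counts)
```

With `KFold` the test-set size follows from `n_splits`, whereas with `ShuffleSplit` the number of splits and the test size are set independently of each other.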
-
Following the [sklearn interface](https://github.com/scikit-learn/scikit-learn/blob/0864c5896804bc1066c02bcb1443c962cabdc420/sklearn/model_selection/_split.py#L109):
* other KFold implementations: 'Grou…
-
The difference in accuracy between sklearn and cuml RF varies in the range of 3-7% (3% difference obtained after hyper-parameter tuning) for the below example. The base code for the example below is t…
-
Hi, I'm trying to calibrate a logistic regression classifier and I get the error `ValueError: could not convert string to float: 'OLIFE'`.
I did one-hot encode my categorical values using a pipeline, it works…
-
**Describe the use case**
Training multiple models with shuffled training data (k-folds) can reveal information about our data sets. For example, one fold may be much less accurate than other folds …
-
**Describe the bug**
I have been investigating the accuracy bug in cuML RF (#2518), and I managed to isolate the cause of the accuracy drop. **The bootstrapping option causes cuML RF to do worse than skl…
hcho3 updated
3 years ago
-
Scikit-learn 0.20 should allow you to reuse more of the existing BaseSearchCV infrastructure, by providing a protected method `_run_search`:
```py
def _run_search(self, evaluate_candidates):
…
-
IMO, there are some modules/packages that don't add any value to the project or are of low interest to English-speaking users. By removing them, the codebase gets cleaner and we no longer have to mai…
-
Hi Sarthak,
when I try to train on the toy dataset with the samples/config_classification.yaml I get the error `/bin/sh: 1: qsub: not found`. I believe this originates from 'parallel_compute_comman…
-
Hello, I would like to ask some questions about the dataset and training.
https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/2KI6IH
After I downloaded your dataset, there is a folder na…