iterative / example-repos-dev

Source code and generator scripts for example DVC projects
https://dvc.org/doc

get-started: use fractional min_split #119

Closed dberenbaum closed 2 years ago

dberenbaum commented 2 years ago

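Using a fractional value for min_split (see https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html) reduces overfitting and scales better to different sample sizes. scikit-learn treats a float min_samples_split as a fraction of the training samples, so the threshold grows with the dataset, whereas an integer such as 2 stays fixed and lets trees keep splitting down to tiny nodes.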

train.min_split=2

[Image: visualization(6)]

train.min_split=0.01

[Image: visualization(7)]
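
To illustrate the sample-size point, here is a minimal sketch of how the two settings translate into an actual per-node threshold. It follows the rule documented for scikit-learn's min_samples_split (an int is used as-is, a float is a fraction of n_samples, rounded up). The helper name `effective_min_split` and the sample counts are hypothetical and not part of the project, and the floor of 2 is an assumption based on a node needing at least two samples to be split.

```python
import math


def effective_min_split(min_split, n_samples):
    """Minimum number of samples a node needs before a split is attempted.

    Hypothetical helper mirroring the documented min_samples_split rule:
    an int is taken literally, a float is a fraction of n_samples (rounded up).
    """
    if isinstance(min_split, int):
        return min_split
    # Assumed floor of 2: a node with fewer samples cannot be split at all.
    return max(2, math.ceil(min_split * n_samples))


# Illustrative sample counts only; these are not the project's dataset sizes.
for n_samples in (100, 1_000, 20_000):
    fixed = effective_min_split(2, n_samples)
    fractional = effective_min_split(0.01, n_samples)
    print(f"n_samples={n_samples}: min_split=2 -> {fixed}, min_split=0.01 -> {fractional}")
```

With min_split=2 the threshold never changes, while with min_split=0.01 it tracks the size of the training set, which is why the fractional form carries over to differently sized datasets.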

shcheklein commented 2 years ago

@dberenbaum thanks, I'll regenerate the project with this change.