Closed bkowshik closed 7 years ago
What would it look like when attributes are added in order of importance for prediction instead of in the order they appear in the csv dataset?
The GradientBoostingClassifier exposes an attribute on the fitted model, model.feature_importances_, that gives a score for each feature: the higher the score, the more important the feature is for predictions.
Table with 10 attributes that have the highest importance scores
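For reference, a minimal sketch of how such a top-10 ranking could be produced. The data here is synthetic (from make_classification) and the attribute names are hypothetical placeholders, not the real dataset; only feature_importances_ comes from the discussion above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in for the real csv dataset (assumption for illustration).
X, y = make_classification(
    n_samples=300, n_features=15, n_informative=5, random_state=0
)
# Hypothetical attribute names; the real ones come from the csv header.
feature_names = [f"attribute_{i}" for i in range(X.shape[1])]

model = GradientBoostingClassifier(n_estimators=50, random_state=0)
model.fit(X, y)

# feature_importances_ is only available after fit(); one score per column.
importances = model.feature_importances_

# Column indices sorted from most to least important, keeping the top 10.
top_10_idx = np.argsort(importances)[::-1][:10]
top_10 = [(feature_names[i], float(importances[i])) for i in top_10_idx]

for name, score in top_10:
    print(f"{name}\t{score:.4f}")
```

The scores are relative to each other within one fitted model, so they are useful for ordering attributes rather than as absolute measures.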
Now, using the same workflow as above, we add one attribute at a time, starting with the most important ones, to get the graph below.
After increasing the dataset size, we still see the unusual dips. 🤔
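A rough sketch of the add-one-attribute-at-a-time workflow, in case it helps with reproducing the dips. It assumes synthetic data, 3-fold cross-validated accuracy as the metric, and small n_estimators for speed; the actual workflow and metric used for the graph may differ.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the real dataset (assumption for illustration).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, random_state=0
)

# Rank attributes once, using a model fitted on all of them.
ranker = GradientBoostingClassifier(n_estimators=30, random_state=0).fit(X, y)
order = np.argsort(ranker.feature_importances_)[::-1]  # most important first

# Retrain with the top-k attributes for k = 1..n and record the score.
scores = []
for k in range(1, len(order) + 1):
    cols = order[:k]
    model = GradientBoostingClassifier(n_estimators=30, random_state=0)
    score = cross_val_score(model, X[:, cols], y, cv=3).mean()
    scores.append(score)
    print(f"{k} attributes: mean accuracy {score:.3f}")
```

Note that the ranking here is computed on the full dataset before cross-validating, which can make the scores slightly optimistic. Averaging over several folds or seeds may also help tell real dips apart from noise.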
Similar to the work on training size, we have questions on the effect of the number of attributes on the model:
Workflow
Notes
cc: @anandthakker @batpad @geohacker