produvia / kryptos

Kryptos AI is a virtual investment assistant that manages your cryptocurrency portfolio

Feature Importance #83

Open slavakurilyak opened 6 years ago

slavakurilyak commented 6 years ago

Goals

As a machine learning developer, I want to explain the output of any machine learning model, so that I can better interpret predictions and classifications.

As a machine learning developer, I want to analyze feature importances, so that I can perform diagnostics on the machine learning model and better understand the degree to which a predictive model relies on a particular feature.

Consider

  1. Consider using SHAP (SHapley Additive exPlanations), whose authors claim "better agreement with human intuition through a user study, exponential improvements in run time, improved clustering performance, and better identification of influential features." (Arxiv, 2018)

  2. Consider using XGBoost's built-in plot_importance() method, which accepts an importance_type of "weight", "gain", or "cover", or its get_fscore() method (see the sketch after this list)

  3. Consider using the skater library, developed by datascienceinc, for model interpretation
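
For quick reference, here is a minimal sketch of options 1 and 2, assuming a fitted XGBoost classifier (the toy data and model below are placeholders, not the kryptos pipeline):

```python
import numpy as np
import xgboost as xgb
import shap                           # option 1: SHAP values
from xgboost import plot_importance   # option 2: built-in importance plot

# Toy data standing in for the real feature matrix
X = np.random.rand(500, 10)
y = (X[:, 0] + X[:, 1] > 1).astype(int)

model = xgb.XGBClassifier(n_estimators=50).fit(X, y)

# Option 1: SHAP values via TreeExplainer (the fast path for tree ensembles)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
shap.summary_plot(shap_values, X)     # global view of per-feature impact

# Option 2: XGBoost's built-in importance ("weight", "gain", or "cover")
plot_importance(model, importance_type="gain", max_num_features=10)

# Raw per-feature split counts, equivalent to importance_type="weight"
print(model.get_booster().get_fscore())
```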

Inspiration

(screenshot omitted; see source below)

Source: New frontiers: Marcos Lopez de Prado on Machine Learning for finance, 2018

bukosabino commented 6 years ago

Yeah, I have been thinking about this idea for the last few weeks.

But we are predicting with a different XGBoost model in every iteration (daily or minute), so we need to think about a strategy for ranking feature importances across models. Perhaps we could average the feature importances...?
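
A minimal sketch of that averaging idea, assuming we keep a reference to each iteration's trained Booster (the boosters argument is an assumption; nothing in the current pipeline collects them yet):

```python
from collections import defaultdict

import numpy as np


def average_feature_importance(boosters, importance_type="gain"):
    """Average one importance type across the boosters trained in each
    iteration (daily or minute)."""
    scores = defaultdict(list)
    for booster in boosters:
        for feature, value in booster.get_score(importance_type=importance_type).items():
            scores[feature].append(value)
    # A feature absent from a model's trees gets no entry for that model,
    # so the mean is taken only over the models that actually used it.
    return {feature: float(np.mean(values)) for feature, values in scores.items()}
```

Whether missing features should count as zero or be skipped is a design choice worth deciding explicitly; skipping them (as above) biases the ranking toward features that are only occasionally, but strongly, used.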

slavakurilyak commented 6 years ago

Here's a modelling idea to consider:

  1. Create an XGBoost model to get the most important features (say Top 50 features)
  2. Use hyperopt to tune xgboost (see #63)
  3. Use top 10 models from tuned XGBoosts to generate predictions
  4. Clip the predictions to [0,20] range
  5. Use the average of these 10 predictions

Inspiration: MLWiz, 2017
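
A rough sketch of steps 1 through 5, assuming a regression target and a mean-squared-error validation loss (the target, the hyperparameter ranges, and the [0, 20] clip range from the MLWiz post are placeholders, not kryptos code):

```python
import numpy as np
import xgboost as xgb
from hyperopt import STATUS_OK, Trials, fmin, hp, tpe
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split


def top_features(X, y, n=50):
    """Step 1: fit a baseline model and keep the n most important features."""
    base = xgb.XGBRegressor(n_estimators=100).fit(X, y)
    return np.argsort(base.feature_importances_)[::-1][:n]


def run_pipeline(X, y, n_features=50, max_evals=50, top_k=10, clip_range=(0, 20)):
    X = X[:, top_features(X, y, n_features)]
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)

    space = {
        "max_depth": hp.choice("max_depth", [3, 5, 7, 9]),
        "learning_rate": hp.uniform("learning_rate", 0.01, 0.3),
        "subsample": hp.uniform("subsample", 0.5, 1.0),
    }

    fitted = []  # (validation loss, model) pairs collected for step 3

    def objective(params):
        # Step 2: each hyperopt trial trains one candidate model
        model = xgb.XGBRegressor(n_estimators=200, **params)
        model.fit(X_train, y_train)
        loss = mean_squared_error(y_val, model.predict(X_val))
        fitted.append((loss, model))
        return {"loss": loss, "status": STATUS_OK}

    fmin(objective, space, algo=tpe.suggest, max_evals=max_evals, trials=Trials())

    # Step 3: keep the top_k models by validation loss
    best_models = [m for _, m in sorted(fitted, key=lambda t: t[0])[:top_k]]

    # Steps 4-5: clip each model's predictions, then average them
    preds = np.column_stack([np.clip(m.predict(X_val), *clip_range) for m in best_models])
    return preds.mean(axis=1)
```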

slavakurilyak commented 6 years ago

@bukosabino Let's analyze feature importances. Skater's implementation (see above) is based on "information theoretic criteria, measuring the entropy in the change of predictions, given a perturbation of a given feature. The intuition is that the more a model’s decision criteria depend on a feature, the more we’ll see predictions change as a function of perturbing a feature." (Skater, 2018)
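
For reference, a minimal sketch of Skater's perturbation-based feature importance, following the library's documented usage at the time (the exact API may differ between Skater versions; the data and model here are placeholders):

```python
import numpy as np
import xgboost as xgb
from skater.core.explanations import Interpretation
from skater.model import InMemoryModel

# Placeholder data and model standing in for the real pipeline
X = np.random.rand(500, 10)
y = (X[:, 0] + X[:, 1] > 1).astype(int)
feature_names = ["f%d" % i for i in range(X.shape[1])]
clf = xgb.XGBClassifier(n_estimators=50).fit(X, y)

# Interpretation holds the data to perturb; InMemoryModel wraps the prediction function
interpreter = Interpretation(X, feature_names=feature_names)
model = InMemoryModel(clf.predict_proba, examples=X[:100])

# Entropy-based importance: how much predictions change as each feature is perturbed
print(interpreter.feature_importance.feature_importance(model, ascending=False))
```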