ClimbsRocks / auto_ml

[UNMAINTAINED] Automated machine learning for analytics & production
http://auto-ml.readthedocs.io
MIT License

Questions #338

Closed onacrame closed 7 years ago

onacrame commented 7 years ago

I'm relatively new to GitHub, so apologies if this is posted in the wrong place.

  1. I'd assume not, but when feature learning is switched on, can those composite features be inspected (both the values themselves and what each feature is composed of)?

  2. Is it possible to inspect the transformed dataset after it's been scaled, categorical values have been dealt with and features are generated?

  3. The Calibrate Final Model option only works if a y_test is supplied. Where do I feed in a y_test? It's not clear from the API.

  4. Brier Loss Score seems to be the default scorer. If I wanted to use 'roc_auc' is that possible?

  5. How do I inspect the parameters of the model, e.g., for a logistic regression, the alpha (once calibrated), the intercept, etc.?

ClimbsRocks commented 7 years ago
  1. ha, you're diving into a long debate around the interpretability of deep learning algos. unfortunately, no, as an industry, we don't have any great way of understanding what makes up deep learning features. you can inspect the feature values yourself, though, if you clone the repo and modify the source code here to do some logging: https://github.com/ClimbsRocks/auto_ml/blob/master/auto_ml/predictor.py#L601

  2. ditto- modify the source code here to do more logging: https://github.com/ClimbsRocks/auto_ml/blob/master/auto_ml/utils_model_training.py#L61

  3. pass them in as params to .train: ml_predictor.train(data, X_test=some_data, y_test=some_data). good call on the missing documentation- i'll make that a todo.

  4. this is still a beta feature, but you should be able to pass in ml_predictor.train(data, scoring='roc_auc'). note that this is for scoring and not an objective function.

  5. a bit of a hack, but pass in ml_predictor.train(data, training_params={}). training_params overwrites our defaults with whatever you pass in, and then gives you the whole set with both our values and yours.
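
To illustrate the point in answer 4 about scoring versus calibration, here is a minimal pure-Python sketch of the two metrics mentioned (Brier score and ROC AUC). The function names and the toy data are assumptions made for illustration; this is not auto_ml's code. It shows that two sets of predictions with the same ranking get the same ROC AUC but very different Brier scores, which is why the choice of scorer matters when a calibrated model is the goal.

```python
def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and the 0/1 outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Chance that a random positive is scored above a random negative; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): both prediction sets rank the rows identically.
y_true = [0, 0, 1, 1]
confident = [0.1, 0.2, 0.8, 0.9]
squashed = [0.45, 0.46, 0.54, 0.55]

for name, probs in [("confident", confident), ("squashed", squashed)]:
    print(name, "brier:", brier_score(y_true, probs), "roc_auc:", roc_auc(y_true, probs))
```

ROC AUC only looks at ordering, so it cannot tell these two sets of predictions apart, while the Brier score rewards the probabilities that sit closer to the true outcomes.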
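
The "overwrite our defaults with whatever you pass in" behaviour described in answer 5 can be sketched as a plain dictionary merge. This is only an illustration of the idea, not auto_ml's actual implementation: the helper name `merge_training_params` and the default values below are hypothetical.

```python
def merge_training_params(defaults, overrides=None):
    """Return a new dict in which user-supplied values replace defaults with the same key."""
    merged = dict(defaults)          # copy so the defaults are never mutated
    merged.update(overrides or {})   # user values win on key collisions
    return merged


# Hypothetical defaults for illustration only; not auto_ml's real values.
hypothetical_defaults = {"C": 1.0, "penalty": "l2"}

# Passing an empty dict changes nothing, so the result is the full default set.
print(merge_training_params(hypothetical_defaults, {}))

# Passing one value overrides just that key and keeps the rest.
print(merge_training_params(hypothetical_defaults, {"C": 0.1}))
```

Under this reading, passing an empty training_params is a way to get the complete parameter set back without changing anything.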