ClimbsRocks / auto_ml

[UNMAINTAINED] Automated machine learning for analytics & production
http://auto-ml.readthedocs.io
MIT License
1.64k stars, 310 forks

Fix XGBoost error #356

Closed: a-holm closed this 7 years ago

a-holm commented 7 years ago

It appears that the XGBoost package currently installed by pip does not have the feature_importances_ attribute. As a result, if you install the package with pip install xgboost, you cannot extract feature importances from an XGBClassifier or XGBRegressor object.

My workaround first checks for feature_importances_ and only falls back when it is missing. The attribute works fine when the newest version of XGBoost is installed from source, so it will likely exist in future releases, but the version currently available through pip install xgboost does not provide it.
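
For readers who want to see the shape of such a check, here is a minimal sketch. It is not the code from this PR. The helper name get_feature_importances and the n_features parameter are made up for illustration. It assumes the attribute is spelled feature_importances_ as in scikit-learn, that older builds expose the booster through a booster() method while newer ones use get_booster(), and that default feature names look like f0, f1, and so on.

```python
def get_feature_importances(model, n_features):
    """Return one importance per feature as a list of floats.

    Hypothetical helper, not part of auto_ml or XGBoost.
    """
    # Preferred path: builds that expose the scikit-learn-style attribute.
    importances = getattr(model, "feature_importances_", None)
    if importances is not None:
        return [float(v) for v in importances]

    # Fallback: ask the underlying Booster for split counts.
    # Assumption: newer builds name the accessor get_booster(),
    # older ones name it booster().
    if hasattr(model, "get_booster"):
        booster = model.get_booster()
    else:
        booster = model.booster()

    # get_fscore() returns a dict such as {"f0": 12, "f3": 4};
    # features that were never used in a split are simply absent.
    fscore = booster.get_fscore()
    counts = [float(fscore.get("f%d" % i, 0)) for i in range(n_features)]

    # Normalize so the values sum to 1, matching the attribute's convention.
    total = sum(counts)
    if total == 0:
        return counts
    return [c / total for c in counts]
```

If the model was trained with custom feature names, the f0-style keys will not match and the lookup would need to use those names instead.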

coveralls commented 7 years ago

Coverage Status

Coverage decreased (-0.2%) to 80.06% when pulling 3d23699b0d9a1f3171665f8c5cffe1a3d50bbcdf on a-holm:patch-4 into 544d16bbc10479618d25d18751baafd0a5184dea on ClimbsRocks:master.

ClimbsRocks commented 7 years ago

Thanks for that! XGBoost has been a bit difficult to support for a while, with the outdated pip package, and some of the changes they've made on master since then.

Was this the only issue you encountered using XGBoost from pip? If so, I might switch the test suite back to installing xgboost from pip, which is much faster than manually compiling everything :)

ClimbsRocks commented 7 years ago

also let me know if you have any other usability thoughts on auto_ml. really appreciate the PRs

ClimbsRocks commented 7 years ago

just pushed a new release with your fixes to pypi- we're on 2.8.5 now

a-holm commented 7 years ago

This is a fun project. I have found some other issues with loading deep-learning h5 files, but I don't understand the cause yet. I will try to contribute a little once I understand the code better.

ClimbsRocks commented 7 years ago

thanks!

also, i just remembered part of why i moved away from the pip version of xgboost: they have an annoying bug around sparse matrices and categorical variables. if you are using xgboost, and you have categorical variables, i'd strongly recommend building from their master branch, rather than installing from pip.

thanks again for the contributions!