dask / dask-xgboost

BSD 3-Clause "New" or "Revised" License
162 stars 43 forks source link

fix nthread param typo #16

Closed evanfwelch closed 6 years ago

evanfwelch commented 6 years ago

I was having issues with distributed training where I was seeing a race condition develop. As soon as I manually implemented nthread= i realized that core.py improperly sets this parameter as nthreads and therefore xgboost ignores it and goes to the default behavior.

mrocklin commented 6 years ago

Confirmed. Thank you for finding this. I'm curious, do you know of any way to check the parameters we send to XGBoost before passing them? Is there anything we can do to avoid situations like this in the future?

evanfwelch commented 6 years ago

Thanks Matt. (By the way, the package is great and I love quickly you got it out the door). I dont know of anything on the XGBoost side that helps us check the parameters we're sending. But perhaps, given the fact that dask_xgboost supports a subset of full XGBoost functionality (e.g. you can't yet pass eval data to facilitate early stopping) we could build in a sort of definitive list of options and then check the keys of the params and kwargs against it. If we throw warning when users give unsupported params it might make the package a lot more usable. I had to do a bit of digging to figure out that adding evaluation data was not going to work (since you can't pickle a DMatrix).

mrocklin commented 6 years ago

OK, if there is no automated way to verify keywords then lets merge this for now. Thanks for finding and fixing this @evanfwelch