Closed: thesuperzapper closed this issue 4 years ago
I also have this problem
I don't see this on the Python side.
This is a tricky one, I don't think num_class should generally be set for binary classification. The reason is that the internal implementation for binary classification is specialised compared to the multiclass implementation. After setting the num_class parameter the internal implementation thinks you want to output predictions for each of your class labels, which is redundant in the binary case. For user experience I think we should log that this parameter should not be set with the binary classification objective.
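To illustrate why per-class output is redundant in the binary case: a two-class softmax over the raw scores (0, z) gives exactly the same positive-class probability as a sigmoid of z. The sketch below is plain Python, not XGBoost's internal code, and the function names are made up for illustration.

```python
import math

def sigmoid(z):
    """Binary case: one raw score in, one positive-class probability out."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Multiclass case: one raw score per class in, one probability per class out."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

z = 0.8  # arbitrary raw score for the positive class
p_binary = sigmoid(z)
p_two_class = softmax([0.0, z])

# The second softmax entry matches the sigmoid output; the first is just 1 - p.
print(p_binary, p_two_class)
```

So with two classes the second output column carries no extra information, which is why the binary objective only needs a single output.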
@RAMitchell Num class is quite difficult to handle, it ties to num_output_group, also objective. Do you think it's a good idea to remove all these auto configuration then just emit an error to tell users to set correct parameters? This way all these "heuristics" can be removed.
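A rough sketch of the "just emit an error" option, written as a standalone Python check rather than actual XGBoost code. The function name check_params and the error message are hypothetical; only the parameter names objective and num_class and the binary:/multi: objective prefixes come from XGBoost.

```python
def check_params(params):
    """Reject num_class when a binary objective is used (hypothetical helper)."""
    objective = params.get("objective", "")
    if objective.startswith("binary:") and "num_class" in params:
        raise ValueError(
            "num_class must not be set with objective=%s; "
            "remove it or use a multi: objective" % objective
        )
    return params

# Valid combinations pass through unchanged.
check_params({"objective": "binary:logistic"})
check_params({"objective": "multi:softprob", "num_class": 3})

# The combination from this issue is rejected up front instead of failing later.
try:
    check_params({"objective": "binary:logistic", "num_class": 2})
except ValueError as err:
    print(err)
```

Failing early like this would replace the auto-configuration with a single explicit rule, at the cost of breaking configurations that currently happen to work.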
@thesuperzapper
The error does not occur if no num_class is specified.
I hit the same error when using the xgb branch of your pyspark_api, which is good work. I hope it gets merged into master soon.
Thanks.
I ran into the same problem. Either switching to a multiclass objective or removing num_class works around it. I agree with @RAMitchell: the documentation should mention this.
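For anyone who needs the workaround right now, here is a small sketch of both options as a plain Python helper that rewrites a parameter dict before it is handed to the trainer. The helper name and the keep_multiclass flag are made up for this example; multi:softprob is a real XGBoost objective, but whether it suits your pipeline is an assumption.

```python
def apply_workaround(params, keep_multiclass=False):
    """Return a copy of params that avoids binary objective + num_class (hypothetical helper)."""
    fixed = dict(params)
    if fixed.get("objective", "").startswith("binary:") and "num_class" in fixed:
        if keep_multiclass:
            # Option 1: switch to a multiclass objective and keep num_class.
            fixed["objective"] = "multi:softprob"
        else:
            # Option 2: drop num_class and stay with the binary objective.
            del fixed["num_class"]
    return fixed

params = {"objective": "binary:logistic", "num_class": 2, "eta": 0.1}
print(apply_workaround(params))
print(apply_workaround(params, keep_multiclass=True))
```

Note that the multiclass route changes the shape of the predictions (one probability per class instead of a single value), so downstream code may need adjusting.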
xgboost version is 0.90, xgboost4j-spark is also 0.90
On the latest master (XGBoost-0.9 SNAPSHOT), I get an error when using objective=binary:logistic with num_class=2. In this case I am using the Spark API, but I think it might be caused by something independent of xgboost4j. The error does not occur if no num_class is specified.
Example:
Executor stack trace: