dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.readthedocs.io/en/stable/
Apache License 2.0

objective=binary:logistic with num_class=2 throws error #4552

Closed thesuperzapper closed 4 years ago

thesuperzapper commented 5 years ago

On the latest master (XGBoost-0.9 SNAPSHOT), I get an error when using objective=binary:logistic with num_class=2. In this case I am using the Spark API, but I think it might be caused by something independent of xgboost4j.

The error does not occur if num_class is not specified.

Example:

import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

val dataPath = "SPARK_HOME/data/mllib/sample_binary_classification_data.txt"

val data = spark.read.format("libsvm").option("vectorType", "dense").load(dataPath)
val dataSplit = data.randomSplit(Array(0.8, 0.2), seed = 1000)
val dataTrain = dataSplit(0)
val dataTest = dataSplit(1)

val paramMap = Map(
    "eta" -> 0.1f,
    "max_depth" -> 2,
    "objective" -> "binary:logistic",
    "num_class" -> 2,
    "num_round" -> 5,
    "num_workers" -> 2
)
val xgbClassifier = new XGBoostClassifier(paramMap)
    .setFeaturesCol("features")
    .setLabelCol("label")

val xgboostModel = xgbClassifier.fit(dataTrain)

Executor stack trace:

ml.dmlc.xgboost4j.java.XGBoostError: [17:12:54] /xgboost/src/objective/regression_obj.cu:55: Check failed: preds.Size() == info.labels_.Size() (78 vs. 39) : labels are not correctly providedpreds.size=78, label.size=39
Stack trace:
  [bt] (0) /data/disk9/yarn/container-logs/usercache/XXXXXXX/appcache/application_1557180119926_286753/container_e149_1557180119926_286753_01_000002/tmp/libxgboost4j3559256344967066582.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x35) [0x7f1dbb7a07f5]
  [bt] (1) /data/disk9/yarn/container-logs/usercache/XXXXXXX/appcache/application_1557180119926_286753/container_e149_1557180119926_286753_01_000002/tmp/libxgboost4j3559256344967066582.so(xgboost::obj::RegLossObj<xgboost::obj::LogisticClassification>::GetGradient(xgboost::HostDeviceVector<float> const&, xgboost::MetaInfo const&, int, xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*)+0x202) [0x7f1dbb8c39a2]
  [bt] (2) /data/disk9/yarn/container-logs/usercache/XXXXXXX/appcache/application_1557180119926_286753/container_e149_1557180119926_286753_01_000002/tmp/libxgboost4j3559256344967066582.so(xgboost::LearnerImpl::UpdateOneIter(int, xgboost::DMatrix*)+0x293) [0x7f1dbb8492a3]
  [bt] (3) /data/disk9/yarn/container-logs/usercache/XXXXXXX/appcache/application_1557180119926_286753/container_e149_1557180119926_286753_01_000002/tmp/libxgboost4j3559256344967066582.so(XGBoosterUpdateOneIter+0x35) [0x7f1dbb7a52e5]
  [bt] (4) [0x7f1df9017a34]

kaer1990 commented 5 years ago

I also have this problem.

trivialfis commented 5 years ago

I don't see this on Python side.

RAMitchell commented 5 years ago

This is a tricky one. I don't think num_class should generally be set for binary classification, because the internal implementation for binary classification is specialised compared to the multiclass one. Once num_class is set, the internal implementation thinks you want to output predictions for each of your class labels, which is redundant in the binary case. For a better user experience, I think we should log a warning that this parameter should not be set with the binary classification objective.
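This explains the numbers in the stack trace (78 vs. 39): with num_class=2 the learner allocates two prediction slots per row, while the binary objective expects exactly one per row. A minimal sketch of the effect, as plain Python rather than the actual XGBoost source:

```python
# Illustrative sketch (not XGBoost code): why num_class=2 combined with
# binary:logistic trips the size check in regression_obj.cu.

def check_gradient_sizes(num_rows, num_class=None):
    """Return (preds_size, labels_size) as the binary objective sees them,
    raising the same kind of mismatch error as the C++ CHECK."""
    # Setting num_class makes the learner emit one output per class per row;
    # the binary objective assumes a single output per row.
    output_groups = num_class if num_class else 1
    preds_size = num_rows * output_groups
    labels_size = num_rows
    if preds_size != labels_size:
        raise ValueError(
            f"labels are not correctly provided, "
            f"preds.size={preds_size}, label.size={labels_size}"
        )
    return preds_size, labels_size

# 39 rows, as in the failing partition from the stack trace:
check_gradient_sizes(39)                  # fine: 39 preds, 39 labels
# check_gradient_sizes(39, num_class=2)   # raises: preds.size=78, label.size=39
```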

trivialfis commented 5 years ago

@RAMitchell num_class is quite difficult to handle; it is tied to num_output_group and to the objective. Do you think it's a good idea to remove all this auto-configuration and just emit an error telling users to set the correct parameters? That way all these "heuristics" can be removed.

nanffiy commented 4 years ago

@thesuperzapper

> The error does not occur if num_class is not specified.

I hit the same error when using the xgb branch of your pyspark_api; that is good work, by the way. I hope your work gets merged into master soon.

Thanks.

bethunebtj commented 4 years ago

I ran into the same problem. Either switching to a multiclass objective or removing num_class works around it. I agree with @RAMitchell: the documentation should mention this.

XGBoost version is 0.90; xgboost4j-spark is also 0.90.
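The two workarounds mentioned in this thread can be sketched as parameter adjustments. The `fix_params` helper below is hypothetical (not part of XGBoost); the real fix is simply editing the Scala paramMap shown earlier:

```python
# Hypothetical helper illustrating the workarounds from this thread.

def fix_params(params):
    """Workaround 1: drop num_class when the objective is binary:logistic."""
    fixed = dict(params)
    if fixed.get("objective") == "binary:logistic":
        fixed.pop("num_class", None)  # binary objective needs no num_class
    return fixed

broken = {"objective": "binary:logistic", "num_class": 2, "num_round": 5}

print(fix_params(broken))  # {'objective': 'binary:logistic', 'num_round': 5}

# Workaround 2: keep num_class=2 but switch to a multiclass objective,
# accepting the redundant per-class outputs in the binary case.
multiclass = {**broken, "objective": "multi:softprob"}
```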