Open musram opened 2 years ago
@trivialfis could you please give me a hint on how to fix this issue?
Let me take a closer look.
We need to find a way to handle inf gracefully in the JVM package. Models like Poisson are prone to floating-point overflow and underflow.
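As a rough illustration of what "handling inf gracefully" could look like on the JVM side, here is a minimal sketch of a lenient float parser. It assumes the native library prints non-finite metric values as `inf`, `-inf` or `nan` (the `"inf"` case is the one seen in the stack trace in this thread; the others are assumptions). `EvalValueParser` and `parseLenient` are hypothetical names for illustration only and are not part of xgboost4j.

```java
import java.util.Locale;

// Hypothetical helper, not part of xgboost4j: parses metric values that may
// use C-style spellings of non-finite floats ("inf", "-inf", "nan").
final class EvalValueParser {
    private EvalValueParser() {}

    static float parseLenient(String raw) {
        String s = raw.trim().toLowerCase(Locale.ROOT);
        boolean negative = s.startsWith("-");
        // Drop a leading sign so the spelling can be matched on its own.
        String body = (s.startsWith("-") || s.startsWith("+")) ? s.substring(1) : s;
        switch (body) {
            case "inf":
            case "infinity":
                return negative ? Float.NEGATIVE_INFINITY : Float.POSITIVE_INFINITY;
            case "nan":
                return Float.NaN;
            default:
                // Ordinary values go through the standard parser, which still
                // throws NumberFormatException on malformed input.
                return Float.parseFloat(raw.trim());
        }
    }
}
```

With this sketch, `EvalValueParser.parseLenient("inf")` returns `Float.POSITIVE_INFINITY` instead of throwing, so the caller can decide how to report a diverged metric.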
@trivialfis yes, it happens in the JVM package. Since I am using a distributed environment, I have to use the JVM-based package.
@musram, could you share a minimal script and data to reproduce it?
I am hitting this issue with the objective "count:poisson", but training works fine for the objective function. I am using a Databricks environment and importing XGBoost with `import ml.dmlc.xgboost4j.scala.spark.{XGBoostRegressor}`. This issue looks the same as https://github.com/dmlc/xgboost/issues/4849
My environment is Databricks with Scala 2.12 and Spark 3.2.1.
Can anybody help?
java.lang.NumberFormatException: For input string: "inf"
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
    at sun.misc.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
    at java.lang.Float.parseFloat(Float.java:451)
    at java.lang.Float.valueOf(Float.java:416)
    at ml.dmlc.xgboost4j.java.Booster.evalSet(Booster.java:243)
    at ml.dmlc.xgboost4j.java.XGBoost.trainAndSaveCheckpoint(XGBoost.java:231)
    at ml.dmlc.xgboost4j.java.XGBoost.train(XGBoost.java:304)
    at ml.dmlc.xgboost4j.scala.XGBoost$.$anonfun$trainAndSaveCheckpoint$5(XGBoost.scala:66)
    at scala.Option.getOrElse(Option.scala:189)
    at ml.dmlc.xgboost4j.scala.XGBoost$.trainAndSaveCheckpoint(XGBoost.scala:62)
    at ml.dmlc.xgboost4j.scala.XGBoost$.train(XGBoost.scala:106)
    at ml.dmlc.xgboost4j.scala.spark.XGBoost$.buildDistributedBooster(XGBoost.scala:416)
    at ml.dmlc.xgboost4j.scala.spark.XGBoost$.$anonfun$trainForNonRanking$1(XGBoost.scala:499)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:868)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:868)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:380)
    at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:393)
    at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1486)
    at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1413)
    at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1477)
    at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1296)
    at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:391)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:342)
    at org.apache.spark.scheduler.ResultTask.$anonfun$runTask$3(ResultTask.scala:75)
    at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
    at org.apache.spark.scheduler.ResultTask.$anonfun$runTask$1(ResultTask.scala:75)
    at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:55)
    at org.apache.spark.scheduler.Task.doRunTask(Task.scala:153)
    at org.apache.spark.scheduler.Task.$anonfun$run$1(Task.scala:122)
    at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
    at org.apache.spark.scheduler.Task.run(Task.scala:93)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$13(Executor.scala:824)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1641)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:827)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:683)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
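The top frames of the trace can be reproduced without XGBoost or Spark. The standalone sketch below shows that `Float.parseFloat` rejects the string `"inf"`, while it accepts Java's own spellings `"Infinity"` and `"NaN"`. `InfParseDemo` and `parsesAsFloat` are made-up names for this illustration; that the string comes from the evaluation output is inferred from the `Booster.evalSet` frame, not confirmed here.

```java
// Standalone reproduction of the parsing failure; no XGBoost dependency.
final class InfParseDemo {
    private InfParseDemo() {}

    // Returns true if Java's standard float parser accepts the string.
    static boolean parsesAsFloat(String s) {
        try {
            Float.parseFloat(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "inf" is the spelling from the exception message; the others are
        // the spellings documented for Float.valueOf.
        for (String s : new String[] {"inf", "Infinity", "-Infinity", "NaN"}) {
            System.out.println(s + " -> " + parsesAsFloat(s));
        }
    }
}
```

So the exception is a mismatch in how the non-finite value is spelled, which suggests the metric itself overflowed during training with `count:poisson`.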