This issue concerns xgboost4j-spark hanging when training a model on Windows, with the jars built from source against Scala 2.12.4.
Environment
XGBoost4j and XGBoost4j-spark 0.90
Windows 10, 64 bit, Home
Python 3.6.3
Scala 2.12.4
Java 8
CMake 3.15.4
Windows 8.1 SDK (via Visual Studio Community Edition 2015)
Steps to reproduce
First I built the jars, following the instructions here (https://xgboost.readthedocs.io/en/latest/jvm/), after which I can import the resulting jars into my project as unmanaged jars or install them into my local Maven repo. Note that I had to make one change to create_jni.py: my Python installation directory contains spaces, so I worked around this by replacing sys.executable with "C:\Progra~1\Python36\python.exe".
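The workaround can be sketched as follows. This is an illustrative helper, not the actual create_jni.py code; the hard-coded short path is simply the one from my installation.

```python
# Illustrative sketch only -- not the actual create_jni.py code.
# create_jni.py runs build commands via sys.executable; when the
# interpreter lives under "C:\Program Files\...", the space in the
# path breaks the build, so the 8.3 short path is substituted.

def interpreter_path(executable: str) -> str:
    """Return a space-free interpreter path for build commands.

    Falls back to the 8.3 short name used in this report when the
    given path contains spaces; otherwise the path is kept as-is.
    """
    if " " in executable:
        return r"C:\Progra~1\Python36\python.exe"
    return executable
```

A general fix would derive the short name programmatically (e.g. via the Win32 GetShortPathName API) instead of hard-coding it; for a one-off local build the substitution above was enough.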
Next I ran the simple example code below, just to put XGBoost4j-spark through its paces:
import ml.dmlc.xgboost4j.scala.spark.{TrackerConf, XGBoostClassifier}
import org.apache.spark.ml.feature.{StringIndexer, VectorAssembler}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{DoubleType, StringType, StructField, StructType}

object SimpleXGBoost extends App {
  val spark = SparkSession
    .builder()
    .appName("SimpleXGBoost Application")
    .master("local[1]")
    .getOrCreate()

  spark.sparkContext.setLogLevel("INFO")

  val schema = new StructType(Array(
    StructField("sepal length", DoubleType, true),
    StructField("sepal width", DoubleType, true),
    StructField("petal length", DoubleType, true),
    StructField("petal width", DoubleType, true),
    StructField("class", StringType, true)))

  val rawInput = spark.read.schema(schema).csv("C:/Data/iris.data.csv")
  rawInput.show(10)

  val stringIndexer = new StringIndexer().
    setInputCol("class").
    setOutputCol("classIndex").
    fit(rawInput)
  val labelTransformed = stringIndexer.transform(rawInput).drop("class")

  val vectorAssembler = new VectorAssembler().
    setInputCols(Array("sepal length", "sepal width", "petal length", "petal width")).
    setOutputCol("features")
  val xgbInput = vectorAssembler.transform(labelTransformed).select("features", "classIndex")

  val tracker_conf = new TrackerConf(20 * 1000, "scala")

  val xgbParam = Map("eta" -> 0.1f,
    "objective" -> "multi:softprob",
    "num_class" -> 3,
    "num_round" -> 10,
    "num_workers" -> 1,
    "nthread" -> 1,
    "tracker_conf" -> tracker_conf,
    "verbosity" -> 3)

  val xgbClassifier = new XGBoostClassifier(xgbParam).
    setFeaturesCol("features").
    setLabelCol("classIndex")
  val xgbClassificationModel = xgbClassifier.fit(xgbInput)

  spark.stop()
}
On running this code, everything is fine until it reaches xgbClassifier.fit(xgbInput), where training hangs until the Scala tracker times out. This produces the following log output:
2019-11-09 21:51:33,567 | INFO | Logging | 54 | Running Spark version 2.4.0
2019-11-09 21:51:33,822 | WARN | NativeCodeLoader | 62 | Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-11-09 21:51:33,922 | INFO | Logging | 54 | Submitted application: SimpleXGBoost Application
2019-11-09 21:51:34,007 | INFO | Logging | 54 | Changing view acls to: simon
2019-11-09 21:51:34,007 | INFO | Logging | 54 | Changing modify acls to: simon
2019-11-09 21:51:34,008 | INFO | Logging | 54 | Changing view acls groups to:
2019-11-09 21:51:34,009 | INFO | Logging | 54 | Changing modify acls groups to:
2019-11-09 21:51:34,009 | INFO | Logging | 54 | SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(simon); groups with view permissions: Set(); users with modify permissions: Set(simon); groups with modify permissions: Set()
2019-11-09 21:51:35,097 | INFO | Logging | 54 | Successfully started service 'sparkDriver' on port 61123.
2019-11-09 21:51:35,121 | INFO | Logging | 54 | Registering MapOutputTracker
2019-11-09 21:51:35,145 | INFO | Logging | 54 | Registering BlockManagerMaster
2019-11-09 21:51:35,147 | INFO | Logging | 54 | Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2019-11-09 21:51:35,148 | INFO | Logging | 54 | BlockManagerMasterEndpoint up
2019-11-09 21:51:35,160 | INFO | Logging | 54 | Created local directory at C:\Users\simon\AppData\Local\Temp\blockmgr-2cfe480c-59d8-4167-8cce-3e31a33fad90
2019-11-09 21:51:35,189 | INFO | Logging | 54 | MemoryStore started with capacity 4.1 GB
2019-11-09 21:51:35,206 | INFO | Logging | 54 | Registering OutputCommitCoordinator
2019-11-09 21:51:35,296 | INFO | Log | 192 | Logging initialized @7614ms
2019-11-09 21:51:35,365 | INFO | Server | 351 | jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-11-09 21:51:35,380 | INFO | Server | 419 | Started @7698ms
2019-11-09 21:51:35,406 | INFO | AbstractConnector | 278 | Started ServerConnector@6d0b5baf{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2019-11-09 21:51:35,407 | INFO | Logging | 54 | Successfully started service 'SparkUI' on port 4040.
2019-11-09 21:51:35,428 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@76ba13c{/jobs,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,429 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@44c79f32{/jobs/json,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,429 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@7fcbe147{/jobs/job,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,430 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@743cb8e0{/jobs/job/json,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,431 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@c7a975a{/stages,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,431 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@2c1b9e4b{/stages/json,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,432 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@757d6814{/stages/stage,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,432 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@4c168660{/stages/stage/json,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,433 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@52b56a3e{/stages/pool,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,434 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@fd0e5b6{/stages/pool/json,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,434 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@4eed46ee{/storage,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,435 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@36b0fcd5{/storage/json,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,435 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@4fad94a7{/storage/rdd,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,436 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@475835b1{/storage/rdd/json,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,436 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@6326d182{/environment,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,437 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@5241cf67{/environment/json,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,438 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@716a7124{/executors,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,438 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@77192705{/executors/json,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,439 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@226642a5{/executors/threadDump,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,439 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@7e809b79{/executors/threadDump/json,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,445 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@5cc126dc{/static,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,446 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@64337702{/,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,447 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@2cf92cc7{/api,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,447 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@611df6e3{/jobs/job/kill,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,448 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@5f2f577{/stages/stage/kill,null,AVAILABLE,@Spark}
2019-11-09 21:51:35,449 | INFO | Logging | 54 | Bound SparkUI to 0.0.0.0, and started at http://DESKTOP-4869294:4040
2019-11-09 21:51:35,576 | INFO | Logging | 54 | Starting executor ID driver on host localhost
2019-11-09 21:51:35,644 | INFO | Logging | 54 | Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 61136.
2019-11-09 21:51:35,645 | INFO | Logging | 54 | Server created on DESKTOP-4869294:61136
2019-11-09 21:51:35,646 | INFO | Logging | 54 | Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2019-11-09 21:51:35,675 | INFO | Logging | 54 | Registering BlockManager BlockManagerId(driver, DESKTOP-4869294, 61136, None)
2019-11-09 21:51:35,680 | INFO | Logging | 54 | Registering block manager DESKTOP-4869294:61136 with 4.1 GB RAM, BlockManagerId(driver, DESKTOP-4869294, 61136, None)
2019-11-09 21:51:35,683 | INFO | Logging | 54 | Registered BlockManager BlockManagerId(driver, DESKTOP-4869294, 61136, None)
2019-11-09 21:51:35,684 | INFO | Logging | 54 | Initialized BlockManager: BlockManagerId(driver, DESKTOP-4869294, 61136, None)
2019-11-09 21:51:35,840 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@46b695ec{/metrics/json,null,AVAILABLE,@Spark}
2019-11-09 21:51:36,016 | INFO | Logging | 54 | Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/C:/GIT/blue-moon/spark-warehouse').
2019-11-09 21:51:36,017 | INFO | Logging | 54 | Warehouse path is 'file:/C:/GIT/blue-moon/spark-warehouse'.
2019-11-09 21:51:36,027 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@712ca57b{/SQL,null,AVAILABLE,@Spark}
2019-11-09 21:51:36,027 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@4564e94b{/SQL/json,null,AVAILABLE,@Spark}
2019-11-09 21:51:36,028 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@27abb83e{/SQL/execution,null,AVAILABLE,@Spark}
2019-11-09 21:51:36,029 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@69e308c6{/SQL/execution/json,null,AVAILABLE,@Spark}
2019-11-09 21:51:36,030 | INFO | ContextHandler | 781 | Started o.s.j.s.ServletContextHandler@1efdcd5{/static/sql,null,AVAILABLE,@Spark}
2019-11-09 21:51:36,662 | INFO | Logging | 54 | Registered StateStoreCoordinator endpoint
2019-11-09 21:51:38,579 | INFO | Logging | 54 | Pruning directories with:
2019-11-09 21:51:38,584 | INFO | Logging | 54 | Post-Scan Filters:
2019-11-09 21:51:38,587 | INFO | Logging | 54 | Output Data Schema: struct<sepal length: double, sepal width: double, petal length: double, petal width: double, class: string ... 3 more fields>
2019-11-09 21:51:38,593 | INFO | Logging | 54 | Pushed Filters:
2019-11-09 21:51:38,985 | INFO | Logging | 54 | Code generated in 203.6344 ms
2019-11-09 21:51:39,221 | INFO | Logging | 54 | Code generated in 28.7679 ms
2019-11-09 21:51:39,309 | INFO | Logging | 54 | Block broadcast_0 stored as values in memory (estimated size 220.3 KB, free 4.1 GB)
2019-11-09 21:51:40,436 | INFO | Logging | 54 | Block broadcast_0_piece0 stored as bytes in memory (estimated size 20.6 KB, free 4.1 GB)
2019-11-09 21:51:40,439 | INFO | Logging | 54 | Added broadcast_0_piece0 in memory on DESKTOP-4869294:61136 (size: 20.6 KB, free: 4.1 GB)
2019-11-09 21:51:40,441 | INFO | Logging | 54 | Created broadcast 0 from show at SimpleXGBoost.scala:24
2019-11-09 21:51:40,470 | INFO | Logging | 54 | Planning scan with bin packing, max size: 4198855 bytes, open cost is considered as scanning 4194304 bytes.
2019-11-09 21:51:40,599 | INFO | Logging | 54 | Starting job: show at SimpleXGBoost.scala:24
2019-11-09 21:51:40,622 | INFO | Logging | 54 | Got job 0 (show at SimpleXGBoost.scala:24) with 1 output partitions
2019-11-09 21:51:40,623 | INFO | Logging | 54 | Final stage: ResultStage 0 (show at SimpleXGBoost.scala:24)
2019-11-09 21:51:40,623 | INFO | Logging | 54 | Parents of final stage: List()
2019-11-09 21:51:40,624 | INFO | Logging | 54 | Missing parents: List()
2019-11-09 21:51:40,628 | INFO | Logging | 54 | Submitting ResultStage 0 (MapPartitionsRDD[3] at show at SimpleXGBoost.scala:24), which has no missing parents
2019-11-09 21:51:40,728 | INFO | Logging | 54 | Block broadcast_1 stored as values in memory (estimated size 15.1 KB, free 4.1 GB)
2019-11-09 21:51:40,730 | INFO | Logging | 54 | Block broadcast_1_piece0 stored as bytes in memory (estimated size 7.8 KB, free 4.1 GB)
2019-11-09 21:51:40,731 | INFO | Logging | 54 | Added broadcast_1_piece0 in memory on DESKTOP-4869294:61136 (size: 7.8 KB, free: 4.1 GB)
2019-11-09 21:51:40,732 | INFO | Logging | 54 | Created broadcast 1 from broadcast at DAGScheduler.scala:1161
2019-11-09 21:51:40,747 | INFO | Logging | 54 | Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[3] at show at SimpleXGBoost.scala:24) (first 15 tasks are for partitions Vector(0))
2019-11-09 21:51:40,748 | INFO | Logging | 54 | Adding task set 0.0 with 1 tasks
2019-11-09 21:51:40,809 | INFO | Logging | 54 | Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 7786 bytes)
2019-11-09 21:51:40,819 | INFO | Logging | 54 | Running task 0.0 in stage 0.0 (TID 0)
2019-11-09 21:51:41,117 | INFO | Logging | 54 | Reading File path: file:///C:/Data/iris.data.csv, range: 0-4551, partition values: [empty row]
2019-11-09 21:51:41,135 | INFO | Logging | 54 | Code generated in 11.3314 ms
2019-11-09 21:51:41,201 | INFO | Logging | 54 | Finished task 0.0 in stage 0.0 (TID 0). 1378 bytes result sent to driver
2019-11-09 21:51:41,209 | INFO | Logging | 54 | Finished task 0.0 in stage 0.0 (TID 0) in 420 ms on localhost (executor driver) (1/1)
2019-11-09 21:51:41,211 | INFO | Logging | 54 | Removed TaskSet 0.0, whose tasks have all completed, from pool
2019-11-09 21:51:41,217 | INFO | Logging | 54 | ResultStage 0 (show at SimpleXGBoost.scala:24) finished in 0.575 s
2019-11-09 21:51:41,221 | INFO | Logging | 54 | Job 0 finished: show at SimpleXGBoost.scala:24, took 0.621541 s
+------------+-----------+------------+-----------+-----------+
|sepal length|sepal width|petal length|petal width| class|
+------------+-----------+------------+-----------+-----------+
| 5.1| 3.5| 1.4| 0.2|Iris-setosa|
| 4.9| 3.0| 1.4| 0.2|Iris-setosa|
| 4.7| 3.2| 1.3| 0.2|Iris-setosa|
| 4.6| 3.1| 1.5| 0.2|Iris-setosa|
| 5.0| 3.6| 1.4| 0.2|Iris-setosa|
| 5.4| 3.9| 1.7| 0.4|Iris-setosa|
| 4.6| 3.4| 1.4| 0.3|Iris-setosa|
| 5.0| 3.4| 1.5| 0.2|Iris-setosa|
| 4.4| 2.9| 1.4| 0.2|Iris-setosa|
| 4.9| 3.1| 1.5| 0.1|Iris-setosa|
+------------+-----------+------------+-----------+-----------+
only showing top 10 rows
2019-11-09 21:51:41,350 | INFO | Logging | 54 | Pruning directories with:
2019-11-09 21:51:41,351 | INFO | Logging | 54 | Post-Scan Filters: AtLeastNNulls(n, class#4)
2019-11-09 21:51:41,352 | INFO | Logging | 54 | Output Data Schema: struct<class: string>
2019-11-09 21:51:41,354 | INFO | Logging | 54 | Pushed Filters:
2019-11-09 21:51:41,380 | INFO | Logging | 54 | Code generated in 12.924 ms
2019-11-09 21:51:41,406 | INFO | Logging | 54 | Block broadcast_2 stored as values in memory (estimated size 220.3 KB, free 4.1 GB)
2019-11-09 21:51:41,420 | INFO | Logging | 54 | Block broadcast_2_piece0 stored as bytes in memory (estimated size 20.6 KB, free 4.1 GB)
2019-11-09 21:51:41,422 | INFO | Logging | 54 | Added broadcast_2_piece0 in memory on DESKTOP-4869294:61136 (size: 20.6 KB, free: 4.1 GB)
2019-11-09 21:51:41,424 | INFO | Logging | 54 | Created broadcast 2 from rdd at StringIndexer.scala:137
2019-11-09 21:51:41,425 | INFO | Logging | 54 | Planning scan with bin packing, max size: 4198855 bytes, open cost is considered as scanning 4194304 bytes.
2019-11-09 21:51:41,516 | INFO | Logging | 54 | Starting job: countByValue at StringIndexer.scala:140
2019-11-09 21:51:41,793 | INFO | Logging | 54 | Registering RDD 10 (countByValue at StringIndexer.scala:140)
2019-11-09 21:51:41,796 | INFO | Logging | 54 | Got job 1 (countByValue at StringIndexer.scala:140) with 1 output partitions
2019-11-09 21:51:41,796 | INFO | Logging | 54 | Final stage: ResultStage 2 (countByValue at StringIndexer.scala:140)
2019-11-09 21:51:41,796 | INFO | Logging | 54 | Parents of final stage: List(ShuffleMapStage 1)
2019-11-09 21:51:41,798 | INFO | Logging | 54 | Missing parents: List(ShuffleMapStage 1)
2019-11-09 21:51:41,799 | INFO | Logging | 54 | Submitting ShuffleMapStage 1 (MapPartitionsRDD[10] at countByValue at StringIndexer.scala:140), which has no missing parents
2019-11-09 21:51:41,826 | INFO | Logging | 54 | Block broadcast_3 stored as values in memory (estimated size 19.3 KB, free 4.1 GB)
2019-11-09 21:51:41,856 | INFO | Logging | 54 | Block broadcast_3_piece0 stored as bytes in memory (estimated size 10.3 KB, free 4.1 GB)
2019-11-09 21:51:41,856 | INFO | Logging | 54 | Added broadcast_3_piece0 in memory on DESKTOP-4869294:61136 (size: 10.3 KB, free: 4.1 GB)
2019-11-09 21:51:41,857 | INFO | Logging | 54 | Created broadcast 3 from broadcast at DAGScheduler.scala:1161
2019-11-09 21:51:41,860 | INFO | Logging | 54 | Submitting 1 missing tasks from ShuffleMapStage 1 (MapPartitionsRDD[10] at countByValue at StringIndexer.scala:140) (first 15 tasks are for partitions Vector(0))
2019-11-09 21:51:41,860 | INFO | Logging | 54 | Adding task set 1.0 with 1 tasks
2019-11-09 21:51:41,863 | INFO | Logging | 54 | Starting task 0.0 in stage 1.0 (TID 1, localhost, executor driver, partition 0, PROCESS_LOCAL, 7775 bytes)
2019-11-09 21:51:41,863 | INFO | Logging | 54 | Running task 0.0 in stage 1.0 (TID 1)
2019-11-09 21:51:41,948 | INFO | Logging | 54 | Removed broadcast_0_piece0 on DESKTOP-4869294:61136 in memory (size: 20.6 KB, free: 4.1 GB)
2019-11-09 21:51:41,954 | INFO | Logging | 54 | Cleaned accumulator 20
2019-11-09 21:51:41,954 | INFO | Logging | 54 | Cleaned accumulator 16
2019-11-09 21:51:41,954 | INFO | Logging | 54 | Cleaned accumulator 22
2019-11-09 21:51:41,954 | INFO | Logging | 54 | Cleaned accumulator 12
2019-11-09 21:51:41,955 | INFO | Logging | 54 | Cleaned accumulator 28
2019-11-09 21:51:41,955 | INFO | Logging | 54 | Cleaned accumulator 10
2019-11-09 21:51:41,955 | INFO | Logging | 54 | Cleaned accumulator 29
2019-11-09 21:51:41,955 | INFO | Logging | 54 | Cleaned accumulator 14
2019-11-09 21:51:41,955 | INFO | Logging | 54 | Cleaned accumulator 21
2019-11-09 21:51:41,955 | INFO | Logging | 54 | Cleaned accumulator 4
2019-11-09 21:51:41,955 | INFO | Logging | 54 | Cleaned accumulator 11
2019-11-09 21:51:41,957 | INFO | Logging | 54 | Removed broadcast_1_piece0 on DESKTOP-4869294:61136 in memory (size: 7.8 KB, free: 4.1 GB)
2019-11-09 21:51:41,959 | INFO | Logging | 54 | Cleaned accumulator 18
2019-11-09 21:51:41,959 | INFO | Logging | 54 | Cleaned accumulator 5
2019-11-09 21:51:41,959 | INFO | Logging | 54 | Cleaned accumulator 17
2019-11-09 21:51:41,959 | INFO | Logging | 54 | Cleaned accumulator 26
2019-11-09 21:51:41,959 | INFO | Logging | 54 | Cleaned accumulator 9
2019-11-09 21:51:41,959 | INFO | Logging | 54 | Cleaned accumulator 25
2019-11-09 21:51:41,960 | INFO | Logging | 54 | Cleaned accumulator 1
2019-11-09 21:51:41,960 | INFO | Logging | 54 | Cleaned accumulator 7
2019-11-09 21:51:41,960 | INFO | Logging | 54 | Cleaned accumulator 6
2019-11-09 21:51:41,960 | INFO | Logging | 54 | Cleaned accumulator 13
2019-11-09 21:51:41,960 | INFO | Logging | 54 | Cleaned accumulator 8
2019-11-09 21:51:41,960 | INFO | Logging | 54 | Cleaned accumulator 27
2019-11-09 21:51:41,960 | INFO | Logging | 54 | Cleaned accumulator 24
2019-11-09 21:51:41,960 | INFO | Logging | 54 | Cleaned accumulator 15
2019-11-09 21:51:41,960 | INFO | Logging | 54 | Cleaned accumulator 19
2019-11-09 21:51:41,960 | INFO | Logging | 54 | Cleaned accumulator 3
2019-11-09 21:51:41,960 | INFO | Logging | 54 | Cleaned accumulator 0
2019-11-09 21:51:41,961 | INFO | Logging | 54 | Cleaned accumulator 2
2019-11-09 21:51:41,961 | INFO | Logging | 54 | Cleaned accumulator 23
2019-11-09 21:51:41,984 | INFO | Logging | 54 | Code generated in 8.0033 ms
2019-11-09 21:51:42,010 | INFO | Logging | 54 | Reading File path: file:///C:/Data/iris.data.csv, range: 0-4551, partition values: [empty row]
2019-11-09 21:51:42,018 | INFO | Logging | 54 | Code generated in 6.4778 ms
2019-11-09 21:51:42,066 | INFO | Logging | 54 | Finished task 0.0 in stage 1.0 (TID 1). 1828 bytes result sent to driver
2019-11-09 21:51:42,068 | INFO | Logging | 54 | Finished task 0.0 in stage 1.0 (TID 1) in 206 ms on localhost (executor driver) (1/1)
2019-11-09 21:51:42,069 | INFO | Logging | 54 | Removed TaskSet 1.0, whose tasks have all completed, from pool
2019-11-09 21:51:42,070 | INFO | Logging | 54 | ShuffleMapStage 1 (countByValue at StringIndexer.scala:140) finished in 0.267 s
2019-11-09 21:51:42,070 | INFO | Logging | 54 | looking for newly runnable stages
2019-11-09 21:51:42,071 | INFO | Logging | 54 | running: Set()
2019-11-09 21:51:42,071 | INFO | Logging | 54 | waiting: Set(ResultStage 2)
2019-11-09 21:51:42,071 | INFO | Logging | 54 | failed: Set()
2019-11-09 21:51:42,076 | INFO | Logging | 54 | Submitting ResultStage 2 (ShuffledRDD[11] at countByValue at StringIndexer.scala:140), which has no missing parents
2019-11-09 21:51:42,082 | INFO | Logging | 54 | Block broadcast_4 stored as values in memory (estimated size 3.9 KB, free 4.1 GB)
2019-11-09 21:51:42,084 | INFO | Logging | 54 | Block broadcast_4_piece0 stored as bytes in memory (estimated size 2.4 KB, free 4.1 GB)
2019-11-09 21:51:42,086 | INFO | Logging | 54 | Added broadcast_4_piece0 in memory on DESKTOP-4869294:61136 (size: 2.4 KB, free: 4.1 GB)
2019-11-09 21:51:42,086 | INFO | Logging | 54 | Created broadcast 4 from broadcast at DAGScheduler.scala:1161
2019-11-09 21:51:42,087 | INFO | Logging | 54 | Submitting 1 missing tasks from ResultStage 2 (ShuffledRDD[11] at countByValue at StringIndexer.scala:140) (first 15 tasks are for partitions Vector(0))
2019-11-09 21:51:42,087 | INFO | Logging | 54 | Adding task set 2.0 with 1 tasks
2019-11-09 21:51:42,093 | INFO | Logging | 54 | Starting task 0.0 in stage 2.0 (TID 2, localhost, executor driver, partition 0, ANY, 7141 bytes)
2019-11-09 21:51:42,093 | INFO | Logging | 54 | Running task 0.0 in stage 2.0 (TID 2)
2019-11-09 21:51:42,110 | INFO | Logging | 54 | Getting 1 non-empty blocks including 1 local blocks and 0 remote blocks
2019-11-09 21:51:42,111 | INFO | Logging | 54 | Started 0 remote fetches in 3 ms
2019-11-09 21:51:42,141 | INFO | Logging | 54 | Finished task 0.0 in stage 2.0 (TID 2). 1314 bytes result sent to driver
2019-11-09 21:51:42,143 | INFO | Logging | 54 | Finished task 0.0 in stage 2.0 (TID 2) in 51 ms on localhost (executor driver) (1/1)
2019-11-09 21:51:42,143 | INFO | Logging | 54 | Removed TaskSet 2.0, whose tasks have all completed, from pool
2019-11-09 21:51:42,144 | INFO | Logging | 54 | ResultStage 2 (countByValue at StringIndexer.scala:140) finished in 0.063 s
2019-11-09 21:51:42,144 | INFO | Logging | 54 | Job 1 finished: countByValue at StringIndexer.scala:140, took 0.628652 s
2019-11-09 21:51:42,521 | INFO | Logging | 54 | Pruning directories with:
2019-11-09 21:51:42,522 | INFO | Logging | 54 | Post-Scan Filters:
2019-11-09 21:51:42,522 | INFO | Logging | 54 | Output Data Schema: struct<sepal length: double, sepal width: double, petal length: double, petal width: double, class: string ... 3 more fields>
2019-11-09 21:51:42,522 | INFO | Logging | 54 | Pushed Filters:
2019-11-09 21:51:42,571 | INFO | Logging | 54 | Code generated in 28.7664 ms
2019-11-09 21:51:42,591 | INFO | Logging | 54 | Block broadcast_5 stored as values in memory (estimated size 220.3 KB, free 4.1 GB)
2019-11-09 21:51:42,602 | INFO | Logging | 54 | Block broadcast_5_piece0 stored as bytes in memory (estimated size 20.6 KB, free 4.1 GB)
2019-11-09 21:51:42,603 | INFO | Logging | 54 | Added broadcast_5_piece0 in memory on DESKTOP-4869294:61136 (size: 20.6 KB, free: 4.1 GB)
2019-11-09 21:51:42,604 | INFO | Logging | 54 | Created broadcast 5 from rdd at DataUtils.scala:162
2019-11-09 21:51:42,605 | INFO | Logging | 54 | Planning scan with bin packing, max size: 4198855 bytes, open cost is considered as scanning 4194304 bytes.
2019-11-09 21:51:42,643 | INFO | XGBoost$ | 492 | Running XGBoost 1.0.0-SNAPSHOT with parameters:
alpha -> 0.0
min_child_weight -> 1.0
sample_type -> uniform
base_score -> 0.5
colsample_bylevel -> 1.0
grow_policy -> depthwise
skip_drop -> 0.0
lambda_bias -> 0.0
silent -> 0
scale_pos_weight -> 1.0
seed -> 0
cache_training_set -> false
features_col -> features
num_early_stopping_rounds -> 0
label_col -> classIndex
num_workers -> 1
subsample -> 1.0
lambda -> 1.0
max_depth -> 6
probability_col -> probability
raw_prediction_col -> rawPrediction
tree_limit -> 0
num_class -> 3
custom_eval -> null
rate_drop -> 0.0
max_bin -> 16
train_test_ratio -> 1.0
use_external_memory -> false
objective -> multi:softprob
eval_metric -> merror
num_round -> 10
timeout_request_workers -> 1800000
missing -> NaN
checkpoint_path ->
tracker_conf -> TrackerConf(20000,scala)
tree_method -> auto
max_delta_step -> 0.0
eta -> 0.1
verbosity -> 3
colsample_bytree -> 1.0
normalize_type -> tree
custom_obj -> null
gamma -> 0.0
sketch_eps -> 0.03
nthread -> 1
prediction_col -> prediction
checkpoint_interval -> -1
2019-11-09 21:51:42,650 | WARN | ecutionParamsFactory | 175 | train_test_ratio is deprecated since XGBoost 0.82, we recommend to explicitly pass a training and multiple evaluation datasets by passing 'eval_sets' and 'eval_set_names'
2019-11-09 21:51:42,948 | INFO | Logging | 54 | Cleaned accumulator 54
2019-11-09 21:51:42,950 | INFO | Logging | 54 | Cleaned accumulator 68
2019-11-09 21:51:42,950 | INFO | Logging | 54 | Cleaned accumulator 34
2019-11-09 21:51:42,950 | INFO | Logging | 54 | Cleaned accumulator 76
2019-11-09 21:51:42,950 | INFO | Logging | 54 | Cleaned accumulator 51
2019-11-09 21:51:42,951 | INFO | Logging | 54 | Cleaned accumulator 64
2019-11-09 21:51:42,951 | INFO | Logging | 54 | Cleaned accumulator 46
2019-11-09 21:51:42,951 | INFO | Logging | 54 | Cleaned accumulator 74
2019-11-09 21:51:42,951 | INFO | Logging | 54 | Cleaned accumulator 77
2019-11-09 21:51:42,951 | INFO | Logging | 54 | Cleaned accumulator 30
2019-11-09 21:51:42,952 | INFO | Logging | 54 | Removed broadcast_3_piece0 on DESKTOP-4869294:61136 in memory (size: 10.3 KB, free: 4.1 GB)
2019-11-09 21:51:42,954 | INFO | Logging | 54 | Cleaned accumulator 37
2019-11-09 21:51:42,955 | INFO | Logging | 54 | Cleaned accumulator 70
2019-11-09 21:51:42,958 | INFO | Logging | 54 | Cleaned shuffle 0
2019-11-09 21:51:42,958 | INFO | Logging | 54 | Cleaned accumulator 55
2019-11-09 21:51:42,958 | INFO | Logging | 54 | Cleaned accumulator 67
2019-11-09 21:51:42,958 | INFO | Logging | 54 | Cleaned accumulator 43
2019-11-09 21:51:42,959 | INFO | Logging | 54 | Cleaned accumulator 33
2019-11-09 21:51:42,959 | INFO | Logging | 54 | Cleaned accumulator 72
2019-11-09 21:51:42,959 | INFO | Logging | 54 | Cleaned accumulator 31
2019-11-09 21:51:42,959 | INFO | Logging | 54 | Cleaned accumulator 42
2019-11-09 21:51:42,959 | INFO | Logging | 54 | Cleaned accumulator 75
2019-11-09 21:51:42,959 | INFO | Logging | 54 | Cleaned accumulator 81
2019-11-09 21:51:42,959 | INFO | Logging | 54 | Cleaned accumulator 48
2019-11-09 21:51:42,959 | INFO | Logging | 54 | Cleaned accumulator 40
2019-11-09 21:51:42,959 | INFO | Logging | 54 | Cleaned accumulator 36
2019-11-09 21:51:42,960 | INFO | Logging | 54 | Cleaned accumulator 47
2019-11-09 21:51:42,960 | INFO | Logging | 54 | Cleaned accumulator 49
2019-11-09 21:51:42,960 | INFO | Logging | 54 | Cleaned accumulator 73
2019-11-09 21:51:42,960 | INFO | Logging | 54 | Cleaned accumulator 32
2019-11-09 21:51:42,960 | INFO | Logging | 54 | Cleaned accumulator 41
2019-11-09 21:51:42,960 | INFO | Logging | 54 | Cleaned accumulator 58
2019-11-09 21:51:42,960 | INFO | Logging | 54 | Cleaned accumulator 84
2019-11-09 21:51:42,960 | INFO | Logging | 54 | Cleaned accumulator 45
2019-11-09 21:51:42,961 | INFO | Logging | 54 | Cleaned accumulator 52
2019-11-09 21:51:42,962 | INFO | Logging | 54 | Removed broadcast_2_piece0 on DESKTOP-4869294:61136 in memory (size: 20.6 KB, free: 4.1 GB)
2019-11-09 21:51:42,963 | INFO | Logging | 54 | Cleaned accumulator 39
2019-11-09 21:51:42,963 | INFO | Logging | 54 | Cleaned accumulator 56
2019-11-09 21:51:42,963 | INFO | Logging | 54 | Cleaned accumulator 85
2019-11-09 21:51:42,963 | INFO | Logging | 54 | Cleaned accumulator 60
2019-11-09 21:51:42,963 | INFO | Logging | 54 | Cleaned accumulator 80
2019-11-09 21:51:42,963 | INFO | Logging | 54 | Cleaned accumulator 62
2019-11-09 21:51:42,964 | INFO | Logging | 54 | Cleaned accumulator 35
2019-11-09 21:51:42,964 | INFO | Logging | 54 | Cleaned accumulator 61
2019-11-09 21:51:42,964 | INFO | Logging | 54 | Cleaned accumulator 69
2019-11-09 21:51:42,964 | INFO | Logging | 54 | Cleaned accumulator 79
2019-11-09 21:51:42,964 | INFO | Logging | 54 | Cleaned accumulator 63
2019-11-09 21:51:42,964 | INFO | Logging | 54 | Cleaned accumulator 83
2019-11-09 21:51:42,964 | INFO | Logging | 54 | Cleaned accumulator 38
2019-11-09 21:51:42,964 | INFO | Logging | 54 | Cleaned accumulator 57
2019-11-09 21:51:42,964 | INFO | Logging | 54 | Cleaned accumulator 59
2019-11-09 21:51:42,964 | INFO | Logging | 54 | Cleaned accumulator 65
2019-11-09 21:51:42,964 | INFO | Logging | 54 | Cleaned accumulator 71
2019-11-09 21:51:42,964 | INFO | Logging | 54 | Cleaned accumulator 78
2019-11-09 21:51:42,964 | INFO | Logging | 54 | Cleaned accumulator 82
2019-11-09 21:51:42,965 | INFO | Logging | 54 | Cleaned accumulator 50
2019-11-09 21:51:42,965 | INFO | Logging | 54 | Cleaned accumulator 66
2019-11-09 21:51:42,965 | INFO | Logging | 54 | Cleaned accumulator 53
2019-11-09 21:51:42,966 | INFO | Logging | 54 | Removed broadcast_4_piece0 on DESKTOP-4869294:61136 in memory (size: 2.4 KB, free: 4.1 GB)
2019-11-09 21:51:42,967 | INFO | Logging | 54 | Cleaned accumulator 44
[INFO] [11/09/2019 21:06:26.730] [RabitTracker-akka.actor.default-dispatcher-6] [akka://RabitTracker/user/Handler] Tracker listening @ 192.168.1.69:51236
[INFO] [11/09/2019 21:06:26.731] [RabitTracker-akka.actor.default-dispatcher-6] [akka://RabitTracker/user/Handler] Worker connection timeout is 20 seconds.
2019-11-09 21:06:26,746 | INFO | rkParallelismTracker | 84 | starting training with timeout set as 1800000 ms for waiting for resources
2019-11-09 21:06:26,763 | INFO | Logging | 54 | Starting job: foreachPartition at XGBoost.scala:524
2019-11-09 21:06:26,765 | INFO | Logging | 54 | Got job 2 (foreachPartition at XGBoost.scala:524) with 1 output partitions
2019-11-09 21:06:26,765 | INFO | Logging | 54 | Final stage: ResultStage 3 (foreachPartition at XGBoost.scala:524)
2019-11-09 21:06:26,765 | INFO | Logging | 54 | Parents of final stage: List()
2019-11-09 21:06:26,768 | INFO | Logging | 54 | Missing parents: List()
2019-11-09 21:06:26,769 | INFO | Logging | 54 | Submitting ResultStage 3 (MapPartitionsRDD[18] at mapPartitions at XGBoost.scala:409), which has no missing parents
2019-11-09 21:06:26,808 | INFO | Logging | 54 | Block broadcast_6 stored as values in memory (estimated size 40.3 KB, free 4.1 GB)
2019-11-09 21:06:26,811 | INFO | Logging | 54 | Block broadcast_6_piece0 stored as bytes in memory (estimated size 18.9 KB, free 4.1 GB)
2019-11-09 21:06:26,812 | INFO | Logging | 54 | Added broadcast_6_piece0 in memory on DESKTOP-4869294:60665 (size: 18.9 KB, free: 4.1 GB)
2019-11-09 21:06:26,813 | INFO | Logging | 54 | Created broadcast 6 from broadcast at DAGScheduler.scala:1161
2019-11-09 21:06:26,814 | INFO | Logging | 54 | Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[18] at mapPartitions at XGBoost.scala:409) (first 15 tasks are for partitions Vector(0))
2019-11-09 21:06:26,814 | INFO | Logging | 54 | Adding task set 3.0 with 1 tasks
2019-11-09 21:06:26,816 | INFO | Logging | 54 | Starting task 0.0 in stage 3.0 (TID 3, localhost, executor driver, partition 0, PROCESS_LOCAL, 7786 bytes)
2019-11-09 21:06:26,816 | INFO | Logging | 54 | Running task 0.0 in stage 3.0 (TID 3)
2019-11-09 21:06:26,905 | INFO | Logging | 54 | Code generated in 12.1707 ms
2019-11-09 21:06:27,118 | INFO | Logging | 54 | Reading File path: file:///C:/Data/iris.data.csv, range: 0-4551, partition values: [empty row]
[21:06:27] DEBUG: C:\GIT\xgboost\src\gbm\gbtree.cc:82: Using updaters: grow_colmaker,prune
[21:06:27] DEBUG: C:\GIT\xgboost\src\gbm\gbtree.cc:138: Using tree method: 2
[0] train-merror:0.020000
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=4
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 4 extra nodes, 0 pruned nodes, max_depth=2
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=4
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 10 extra nodes, 0 pruned nodes, max_depth=3
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[1] train-merror:0.020000
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=4
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=4
[2] train-merror:0.020000
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[3] train-merror:0.020000
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=4
[4] train-merror:0.020000
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=4
[5] train-merror:0.020000
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=4
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=4
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=4
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=4
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=5
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=4
[6] train-merror:0.020000
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[7] train-merror:0.020000
[8] train-merror:0.020000
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=5
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 10 extra nodes, 0 pruned nodes, max_depth=3
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=4
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 10 extra nodes, 0 pruned nodes, max_depth=3
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 12 extra nodes, 0 pruned nodes, max_depth=4
[21:06:27] INFO: C:\GIT\xgboost\src\tree\updater_prune.cc:78: tree pruning end, 1 roots, 10 extra nodes, 0 pruned nodes, max_depth=3
2019-11-09 21:06:27,204 | INFO | Logging | 54 | Block rdd_18_0 stored as values in memory (estimated size 224.0 B, free 4.1 GB)
2019-11-09 21:06:27,205 | INFO | Logging | 54 | Added rdd_18_0 in memory on DESKTOP-4869294:60665 (size: 224.0 B, free: 4.1 GB)
2019-11-09 21:06:27,220 | INFO | Logging | 54 | 1 block locks were not released by TID = 3:
[rdd_18_0]
2019-11-09 21:06:27,222 | INFO | Logging | 54 | Finished task 0.0 in stage 3.0 (TID 3). 1361 bytes result sent to driver
2019-11-09 21:06:27,223 | INFO | Logging | 54 | Finished task 0.0 in stage 3.0 (TID 3) in 407 ms on localhost (executor driver) (1/1)
2019-11-09 21:06:27,223 | INFO | Logging | 54 | Removed TaskSet 3.0, whose tasks have all completed, from pool
2019-11-09 21:06:27,224 | INFO | Logging | 54 | ResultStage 3 (foreachPartition at XGBoost.scala:524) finished in 0.453 s
2019-11-09 21:06:27,224 | INFO | Logging | 54 | Job 2 finished: foreachPartition at XGBoost.scala:524, took 0.461351 s
2019-11-09 21:06:46,794 | INFO | XGBoost$ | 530 | Rabit returns with exit code 3
2019-11-09 21:06:46,795 | ERROR | XGBoost$ | 550 | the job was aborted due to
ml.dmlc.xgboost4j.java.XGBoostError: XGBoostModel training failed
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.postTrackerReturnProcessing(XGBoost.scala:655)
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.$anonfun$trainDistributed$1(XGBoost.scala:532)
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.$anonfun$trainDistributed$1$adapted(XGBoost.scala:507)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:389)
at scala.collection.TraversableLike.map(TraversableLike.scala:234)
at scala.collection.TraversableLike.map$(TraversableLike.scala:227)
at scala.collection.immutable.List.map(List.scala:295)
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.trainDistributed(XGBoost.scala:506)
at ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier.train(XGBoostClassifier.scala:191)
at ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier.train(XGBoostClassifier.scala:40)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:118)
at SimpleXGBoost$.delayedEndpoint$SimpleXGBoost$1(SimpleXGBoost.scala:49)
at SimpleXGBoost$delayedInit$body.apply(SimpleXGBoost.scala:6)
at scala.Function0.apply$mcV$sp(Function0.scala:34)
at scala.Function0.apply$mcV$sp$(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App.$anonfun$main$1$adapted(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:389)
at scala.App.main(App.scala:76)
at scala.App.main$(App.scala:74)
at SimpleXGBoost$.main(SimpleXGBoost.scala:6)
at SimpleXGBoost.main(SimpleXGBoost.scala)
2019-11-09 21:06:46,807 | INFO | AbstractConnector | 318 | Stopped Spark@6d0b5baf{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2019-11-09 21:06:46,809 | INFO | Logging | 54 | Stopped Spark web UI at http://DESKTOP-4869294:4040
2019-11-09 21:06:46,828 | INFO | Logging | 54 | MapOutputTrackerMasterEndpoint stopped!
2019-11-09 21:06:46,860 | INFO | Logging | 54 | MemoryStore cleared
2019-11-09 21:06:46,861 | INFO | Logging | 54 | BlockManager stopped
2019-11-09 21:06:46,862 | INFO | Logging | 54 | BlockManagerMaster stopped
2019-11-09 21:06:46,865 | INFO | Logging | 54 | OutputCommitCoordinator stopped!
2019-11-09 21:06:46,868 | INFO | Logging | 54 | Successfully stopped SparkContext
Exception in thread "main" ml.dmlc.xgboost4j.java.XGBoostError: XGBoostModel training failed
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.postTrackerReturnProcessing(XGBoost.scala:655)
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.$anonfun$trainDistributed$1(XGBoost.scala:532)
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.$anonfun$trainDistributed$1$adapted(XGBoost.scala:507)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:389)
at scala.collection.TraversableLike.map(TraversableLike.scala:234)
at scala.collection.TraversableLike.map$(TraversableLike.scala:227)
at scala.collection.immutable.List.map(List.scala:295)
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.trainDistributed(XGBoost.scala:506)
at ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier.train(XGBoostClassifier.scala:191)
at ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier.train(XGBoostClassifier.scala:40)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:118)
at SimpleXGBoost$.delayedEndpoint$SimpleXGBoost$1(SimpleXGBoost.scala:49)
at SimpleXGBoost$delayedInit$body.apply(SimpleXGBoost.scala:6)
at scala.Function0.apply$mcV$sp(Function0.scala:34)
at scala.Function0.apply$mcV$sp$(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App.$anonfun$main$1$adapted(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:389)
at scala.App.main(App.scala:76)
at scala.App.main$(App.scala:74)
at SimpleXGBoost$.main(SimpleXGBoost.scala:6)
at SimpleXGBoost.main(SimpleXGBoost.scala)
2019-11-09 21:06:46,871 | INFO | Logging | 54 | Shutdown hook called
2019-11-09 21:06:46,872 | INFO | Logging | 54 | Deleting directory C:\Users\simon\AppData\Local\Temp\spark-3c885b32-7eac-4905-8ba5-ef9777f61e23
[9] train-merror:0.020000
Process finished with exit code 1
Everything is fine until the call to xgbClassifier.fit(xgbInput), where it hangs until the Scala tracker times out. Specifically, it begins to hang after the lines:
2019-11-09 21:06:27,224 | INFO | Logging | 54 | ResultStage 3 (foreachPartition at XGBoost.scala:524) finished in 0.453 s
2019-11-09 21:06:27,224 | INFO | Logging | 54 | Job 2 finished: foreachPartition at XGBoost.scala:524, took 0.461351 s
Then once the tracker times out, it logs the rest of what you can see in the stack trace.
What have you tried?
I looked through a number of related issues, specifically these two, which seem closest to my problem:
From these issues I have tried a number of suggestions:
Using both the Python and Scala trackers.
Both publishing the jars to my local Maven repository and adding them as unmanaged jars.
Training on the suggested Iris data set, as well as my own data sets.
Adding an xgboost-tracker.properties file under resources, with a line specifying host-ip as localhost, 127.0.0.1, my computer name, my computer's IP, etc.
Switching off anti-virus/firewall software.
Mixing and matching different Scala (2.12.4 through 2.12.8) and Spark (2.4.0 through 2.4.4) versions in build.sbt.
Tinkering around with Spark and XGBoost parameters, namely:
a) Setting Spark's master to local[s] and XGBoost's num_workers=w, ensuring that s >= w, and
b) Setting the XGBoost parameters num_early_stopping_rounds=5, disable_default_eval_metric=1, and maximize_evaluation_metrics=true to force an early end to the training rounds.
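For reference, the parameter tinkering in (a) and (b) looked roughly like the sketch below. This is against the xgboost4j-spark 0.90 API; the specific values (learning rate, timeout, worker count) are illustrative, not the exact ones used in every run:

```scala
import ml.dmlc.xgboost4j.scala.spark.{TrackerConf, XGBoostClassifier}

// Sketch of the combinations tried (values illustrative):
val xgbParam = Map(
  "eta" -> 0.1f,
  "objective" -> "multi:softprob",
  "num_class" -> 3,
  "num_round" -> 100,
  // (a) keep num_workers = w at or below the s in .master("local[s]")
  "num_workers" -> 1,
  // switch between the Python and Scala trackers; first argument is the
  // worker connection timeout in milliseconds
  "tracker_conf" -> TrackerConf(60 * 1000L, "python"),
  // (b) try to force an early end to the rounds
  "num_early_stopping_rounds" -> 5,
  "disable_default_eval_metric" -> 1,
  "maximize_evaluation_metrics" -> true
)
val xgbClassifier = new XGBoostClassifier(xgbParam)
  .setFeaturesCol("features")
  .setLabelCol("classIndex")
```

In every combination the behaviour was the same: training output appears, then the job hangs until the tracker's timeout fires.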
None of the above worked. However, if I build the jars on Linux against Scala 2.12.4 and run the exact same code, everything works fine, so this issue appears to be Windows-specific, relating to communication with Spark running locally.
Is there something else I could try, or something I have overlooked, to get around this issue? Also, are there any plans to build XGBoost4j(-spark) against Scala 2.12 at https://github.com/criteo-forks/xgboost-jars/releases in the near future?
Many thanks in advance, please let me know if I can provide more info/context.
Thanks for the detailed report. For distributed training on Windows, please try the latest master branch. In general, though, most of our testing is done on Linux.