Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Flink and DataFlow
Adds new way of defining DMatrix using off-heap memory populated in Java #51
We can avoid the "2D" DMatrix constructors by using off-heap memory directly.
The 2D constructors incur significant overhead from the flatten operation.
For large matrices this is a problem: the matrix ends up held twice in native memory
and two more times in Java memory (in the DKV and in arrays).
This change allows H2O to hold the matrix just once in native memory and once in the DKV.
Example use: https://github.com/h2oai/h2o-3/pull/2822
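
To illustrate the general idea, here is a minimal sketch of populating a row-major float matrix directly in off-heap memory from Java, so that no intermediate `float[][]` or flattened `float[]` copy is needed on the Java heap. It uses only standard `java.nio` direct buffers. The class `OffHeapMatrixSketch` and its methods `allocate`, `putRow` and `get` are hypothetical names made up for this illustration; they are not the API added by this PR, and the actual DMatrix constructor and its memory layout may differ.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical illustration only: not the DMatrix API introduced by this PR.
public class OffHeapMatrixSketch {

    // Allocate a rows x cols float matrix outside the Java heap (row-major).
    // Assumption: the matrix fits in one direct buffer (less than 2 GB).
    public static FloatBuffer allocate(int rows, int cols) {
        long bytes = (long) rows * cols * Float.BYTES;
        if (bytes > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("matrix too large for one direct buffer");
        }
        return ByteBuffer.allocateDirect((int) bytes)
                .order(ByteOrder.nativeOrder()) // match the byte order native code expects
                .asFloatBuffer();
    }

    // Write one row straight into off-heap memory; no flattened copy is built.
    public static void putRow(FloatBuffer matrix, int cols, int row, float[] values) {
        int base = row * cols;
        for (int j = 0; j < cols; j++) {
            matrix.put(base + j, values[j]); // absolute put, leaves the buffer position untouched
        }
    }

    // Read a single cell back from the off-heap matrix.
    public static float get(FloatBuffer matrix, int cols, int row, int col) {
        return matrix.get(row * cols + col);
    }

    public static void main(String[] args) {
        int rows = 3, cols = 2;
        FloatBuffer matrix = allocate(rows, cols);
        for (int i = 0; i < rows; i++) {
            putRow(matrix, cols, i, new float[] {i, i + 0.5f});
        }
        System.out.println("direct=" + matrix.isDirect() + " last=" + get(matrix, cols, 2, 1));
    }
}
```

In a design like this, the rows are written once into memory that native code can read in place, which is the kind of duplication the description above aims to remove. How the buffer address is handed to the native library is outside the scope of this sketch.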