Open thvasilo opened 2 years ago
Is it the case that multi-threading is only relevant when there's more than one row in the input DMatrix?
Yes, currently multi-threading is only useful when you have multiple rows in the input DMatrix. The rows of the DMatrix get distributed equally across worker threads.
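To illustrate why a single-row DMatrix cannot benefit from extra worker threads, here is a minimal sketch of splitting rows into near-equal contiguous ranges, one per thread. It is not taken from the Treelite source; the class `RowPartition` and its `split` method are hypothetical names used only for this illustration, and the real scheduler may assign rows differently.

```java
import java.util.ArrayList;
import java.util.List;

public class RowPartition {
    /**
     * Splits rows [0, numRows) into contiguous [begin, end) ranges of
     * near-equal size, one per worker thread. Threads that would receive
     * no rows get no range at all. Assumes numThreads > 0.
     */
    public static List<int[]> split(int numRows, int numThreads) {
        List<int[]> ranges = new ArrayList<>();
        int base = numRows / numThreads;   // rows every thread gets
        int extra = numRows % numThreads;  // first `extra` threads get one more
        int begin = 0;
        for (int t = 0; t < numThreads && begin < numRows; t++) {
            int size = base + (t < extra ? 1 : 0);
            ranges.add(new int[] {begin, begin + size});
            begin += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // With a single row, only one range exists no matter how many threads are requested.
        System.out.println("ranges for 1 row, 4 threads: " + split(1, 4).size());
        System.out.println("ranges for 10 rows, 4 threads: " + split(10, 4).size());
    }
}
```

Under this scheme, one row yields exactly one range, so any additional worker threads have nothing to do.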
I am trying to call predict from multiple threads (i.e., multiple caller threads each invoking predict, rather than multiple worker threads inside the predictor), so I set the number of threads to 1 so that the caller threads are not blocked by synchronization. However, I found that the JavaCPP library used by ND4J doesn't allow multi-threading either, see here https://github.com/bytedeco/javacpp/blob/d23879af7a03a04c12b2374ae9d0850b9dda9d96/src/main/java/org/bytedeco/javacpp/Pointer.java#L699
Any particular reason that we need to use INDArray from ND4J?
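For reference, the caller-side threading pattern described above could look roughly like the sketch below. It uses only the standard `java.util.concurrent` API; the `singleThreadedPredict` function is a hypothetical stand-in for a predictor created with one worker thread, not the treelite4j API, and the sketch assumes that function is safe to call concurrently (which, per the comment above, may not hold when the call goes through ND4J/JavaCPP).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToDoubleFunction;

public class CallerThreadedPredict {
    /**
     * Submits one prediction task per row to a fixed pool of caller threads
     * and collects the results in input order.
     * `singleThreadedPredict` is a hypothetical placeholder for the real call.
     */
    public static double[] predictAll(List<float[]> rows,
                                      ToDoubleFunction<float[]> singleThreadedPredict,
                                      int callerThreads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(callerThreads);
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (float[] row : rows) {
                Callable<Double> task = () -> singleThreadedPredict.applyAsDouble(row);
                futures.add(pool.submit(task));
            }
            double[] out = new double[rows.size()];
            for (int i = 0; i < out.length; i++) {
                out[i] = futures.get(i).get();  // blocks until task i is done
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }
}
```

If the underlying call takes a global lock, the tasks would still run one at a time even though they are submitted from several threads.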
Hello,
I'm running some benchmarks for treelite4j, testing out different batch sizes (splitting up a dataset into batches and predicting for each batch in sequence) and the number of threads passed to the Predictor object.
One thing I'm observing is that the number of threads set in the Predictor only seems to matter when my batch size is larger than 1, i.e. if I create a DMatrix with only a single row and call Predict on it, the number of threads the Predictor object was created with doesn't seem to matter.
Also, batch size doesn't seem to have a large effect when prediction is single threaded, is that expected as well?
Is it the case that multi-threading is only relevant when there's more than one row in the input DMatrix?
Would it be possible to use multi-threading for single-instance prediction as well, using each thread to predict for a single tree and merging the result in the end?
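A rough sketch of what this per-tree idea might look like, written against plain Java rather than Treelite: each tree is modelled as a hypothetical `ToDoubleFunction<float[]>`, the trees are evaluated in parallel for one row, and the outputs are merged by summation. The assumption that the ensemble output is a plain sum of tree outputs is mine; other models (for example averaged forests) would need a different merge step.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

public class PerTreeParallel {
    /**
     * Evaluates every tree on the same row in parallel and sums the outputs.
     * Each element of `trees` is a hypothetical stand-in for one decision tree.
     * The parallel stream runs on the common ForkJoinPool.
     */
    public static double predictRow(List<ToDoubleFunction<float[]>> trees, float[] row) {
        return trees.parallelStream()
                    .mapToDouble(tree -> tree.applyAsDouble(row))
                    .sum();
    }
}
```

Two caveats with this design: the summation order is not fixed in a parallel stream, so results can differ in the last floating-point digits between runs, and evaluating a single tree is usually cheap, so the scheduling overhead may outweigh any gain unless the ensemble is large.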
JMH results:
Some example code: