dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.readthedocs.io/en/stable/
Apache License 2.0

Inplace Predict Java API (xgboost4j) #5951

Closed: viswanathk closed this issue 1 year ago

viswanathk commented 4 years ago

Hello.

I see in the changelog for 1.1.0 that a new API was added for thread-safe prediction (inplace_predict). The current predict method exposed through xgboost4j is not thread safe, and is therefore declared synchronized.

Given that inplace_predict can now do thread-safe prediction, are there plans to add it to the Java API? We use this library in a multithreaded Java application, and this would help boost performance.
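To make the bottleneck concrete, here is a minimal sketch of what a synchronized predict method does to a multithreaded caller. The class `SynchronizedPredictSketch` and its methods are hypothetical stand-ins, not the real xgboost4j `Booster`; the summation is only a placeholder for the native call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a model wrapper; NOT the real xgboost4j Booster.
class SynchronizedPredictSketch {
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger maxInFlight = new AtomicInteger();

    // Holds the object's monitor for the whole call, so calls cannot overlap.
    synchronized float predict(float[] row) {
        int now = inFlight.incrementAndGet();
        maxInFlight.accumulateAndGet(now, Math::max);
        float sum = 0f;
        for (float v : row) {
            sum += v; // placeholder for the native prediction call
        }
        inFlight.decrementAndGet();
        return sum;
    }

    int maxConcurrentCalls() {
        return maxInFlight.get();
    }

    // Submits `calls` predictions to `threads` workers and returns the
    // largest number of predict calls that were ever in progress at once.
    static int run(int threads, int calls) {
        SynchronizedPredictSketch model = new SynchronizedPredictSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Float>> futures = new ArrayList<>();
            for (int i = 0; i < calls; i++) {
                Callable<Float> task = () -> model.predict(new float[] {1f, 2f, 3f});
                futures.add(pool.submit(task));
            }
            for (Future<Float> f : futures) {
                f.get(); // wait for every prediction to finish
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return model.maxConcurrentCalls();
    }

    public static void main(String[] args) {
        // prints "peak concurrent predict calls: 1" regardless of pool size
        System.out.println("peak concurrent predict calls: " + run(8, 1000));
    }
}
```

However many worker threads are used, only one prediction runs at a time on a given model object, which is why a thread-safe entry point matters for a prediction service.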

Thanks, Viswanath.

trivialfis commented 4 years ago

Would you like to open a PR for it? Inplace prediction means XGBoost has to digest external data directly. I'm not familiar with the ecosystem of Java so I don't know what data types to support.
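As a rough illustration of what "digesting external data directly" could look like on the JVM side, the sketch below reads a caller-owned, row-major `float[]` without first copying it into a library-owned matrix. `DenseRows`, `InplaceInputSketch`, and the linear "model" are all hypothetical and are not part of xgboost4j; which input types a real binding should accept is exactly the open question here.

```java
import java.util.Arrays;

// Hypothetical view over caller-owned data; not an xgboost4j type.
final class DenseRows {
    final float[] values; // row-major layout: values[row * numCols + col]
    final int numRows;
    final int numCols;

    DenseRows(float[] values, int numRows, int numCols) {
        if (values.length != numRows * numCols) {
            throw new IllegalArgumentException("values.length must equal numRows * numCols");
        }
        this.values = values; // kept by reference, no copy is made
        this.numRows = numRows;
        this.numCols = numCols;
    }

    float get(int row, int col) {
        return values[row * numCols + col];
    }
}

class InplaceInputSketch {
    // Toy linear scorer standing in for a real model; reads the buffer in place.
    static float[] predict(DenseRows data, float[] weights) {
        if (weights.length != data.numCols) {
            throw new IllegalArgumentException("one weight per column is required");
        }
        float[] out = new float[data.numRows];
        for (int r = 0; r < data.numRows; r++) {
            float score = 0f;
            for (int c = 0; c < data.numCols; c++) {
                score += weights[c] * data.get(r, c);
            }
            out[r] = score;
        }
        return out;
    }

    public static void main(String[] args) {
        DenseRows data = new DenseRows(new float[] {1f, 2f, 3f, 4f, 5f, 6f}, 2, 3);
        float[] scores = predict(data, new float[] {1f, 0f, -1f});
        System.out.println(Arrays.toString(scores)); // prints [-2.0, -2.0]
    }
}
```

A flat primitive array plus explicit dimensions is only one candidate; direct `ByteBuffer`s or columnar formats would be other options a PR might consider.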

trivialfis commented 4 years ago

Also, after https://github.com/dmlc/xgboost/pull/5853, the normal predict function should also be thread safe in XGBoost's native code.

viswanathk commented 4 years ago

Good to know about #5853, but it looks like it might take a while before we see it in a release, and inplace_predict already does the job. Do you think it would be faster to add a Java wrapper for inplace_predict than to wait for #5853 to make it into a release?

> Would you like to open a PR for it? Inplace prediction means XGBoost has to digest external data directly. I'm not familiar with the ecosystem of Java so I don't know what data types to support.

I will go over the required changes, although I am not sure where the potential gotchas are.

trivialfis commented 4 years ago

It's a blocking PR for the next release and we are now squashing the next release, so I would say pretty soon.

> I will go over the required changes

I can help with issues in C++. @CodingCat expressed interest in inplace prediction before. So maybe @CodingCat can provide some suggestions on related topics around JVM bindings.

dozaza commented 4 years ago

Hi guys,

I'm working on an online XGBoost prediction service, and want to use Java multithreading for better performance.

Since the C++ code is now thread safe, would simply removing the "synchronized" keyword from the Java "predict" method enable multithreaded prediction in a Java environment?

trivialfis commented 4 years ago

The C API is thread safe now. You need to check whether the Java layer holds any additional state.
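One way to frame that check: an unsynchronized predict is only safe if the prediction path reads immutable state and keeps all temporaries local. The sketch below uses a hypothetical `StatelessPredictor` (not xgboost4j code) with a single final field, and compares sequential and parallel results as a sanity check.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical predictor; NOT xgboost4j code.
final class StatelessPredictor {
    private final float[] weights; // written once in the constructor, never mutated

    StatelessPredictor(float[] weights) {
        this.weights = weights.clone(); // defensive copy so callers cannot mutate it
    }

    // Not synchronized: reads only final state and uses only local variables.
    float predict(float[] row) {
        float score = 0f;
        for (int i = 0; i < weights.length; i++) {
            score += weights[i] * row[i];
        }
        return score;
    }
}

class UnsynchronizedPredictSketch {
    // Scores every row, optionally on the common fork-join pool.
    static float[] predictAll(StatelessPredictor model, float[][] rows, boolean parallel) {
        float[] out = new float[rows.length];
        IntStream indices = IntStream.range(0, rows.length);
        if (parallel) {
            indices = indices.parallel();
        }
        // Each index is written exactly once, by exactly one thread.
        indices.forEach(i -> out[i] = model.predict(rows[i]));
        return out;
    }

    static float[][] sampleRows(int n) {
        float[][] rows = new float[n][];
        for (int i = 0; i < n; i++) {
            rows[i] = new float[] {i, i + 1f, i + 2f};
        }
        return rows;
    }

    public static void main(String[] args) {
        StatelessPredictor model = new StatelessPredictor(new float[] {0.5f, -1f, 2f});
        float[][] rows = sampleRows(10_000);
        float[] sequential = predictAll(model, rows, false);
        float[] concurrent = predictAll(model, rows, true);
        System.out.println("results match: " + Arrays.equals(sequential, concurrent));
    }
}
```

If the wrapper instead cached a scratch buffer or lazily updated a field inside predict, dropping `synchronized` would introduce a data race even with a thread-safe native layer underneath.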

viswanathk commented 4 years ago

The Java layer doesn't seem to maintain any state that gets updated in the predict flow. I raised the PR https://github.com/dmlc/xgboost/pull/6021

Who's the maintainer / reviewer for the Java package?

fangpings commented 3 years ago

Any update on this issue?

sihanwang94 commented 3 years ago

Need this feature +1