rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

How to get libcuml.so and call the library through java to use ML algorithm? #3269

Open Matrix-World opened 3 years ago

Matrix-World commented 3 years ago

What is your question? Hello, I am new to GitHub and to the cuML project, and I have a question. In https://github.com/rapidsai/cuml, some algorithms in the python module do not seem to use the files under the cpp directory; for example, cuml/python/cuml/linear_model/logistic_regression.pyx instead uses the cupy module to interact with CUDA directly to achieve GPU acceleration, right? So is it fair to regard the python module as the Python interface of cuML, and the cpp module as its C++/CUDA interface? I want to call some of the ML algorithms in the cpp module from Java, so I should first compile the cpp module into libcuml.so and then call it through JNI, right? Can I get libcuml.so by following the steps in cuml/BUILD.md? I also don't quite understand the difference between libcuml++ and libcuml.so in the cpp module.

teju85 commented 3 years ago

That's correct. For algos that we want implemented quickly and/or that are not perf-critical, we implement them directly in the Python layer using cupy. For the rest, we implement them against the C++/CUDA interface, expose them in libcuml++.so, and then wrap them with Cython .pyx files.

If you want to call the ML algos that are exposed in libcuml++.so, then yes, you'll first have to build/download libcuml++.so and then wrap them with JNI calls (similar to how we currently wrap them in Cython). libcuml++.so is our C++ interface, while libcuml.so is the C interface; libcuml.so is just a thin wrapper around libcuml++.so. However, please do note that libcuml.so currently does NOT have wrappers for all the algos that are exposed in the C++ interface!
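As a rough illustration of what the Java side of such a JNI binding might look like (a minimal sketch only; the class name, the native method signature, and the wrapper library name cuml_jni below are hypothetical, not part of cuML):

```java
// Hypothetical Java-side JNI binding for a cuML algorithm exposed by libcuml++.so.
// The native method would be implemented in a small C++ wrapper library (here called
// "cuml_jni") that includes the cuML headers and links against libcuml++.so.
public class CumlNative {
    static {
        // Loads libcuml_jni.so from java.library.path; that wrapper in turn pulls in
        // libcuml++.so and its CUDA runtime dependencies.
        System.loadLibrary("cuml_jni");
    }

    // Illustrative signature: fit a logistic regression on row-major host data;
    // the C++ side would copy it to the GPU and call the cuML C++ API.
    public static native void logisticRegressionFit(
            float[] data, int nRows, int nCols, float[] labels, float[] coefOut);
}
```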

And yes, you'll also get libcuml.so by following the build instructions in BUILD.md.

@Matrix-World thank you for considering writing a JNI-wrapper for cuml! If you are able to do this, we'd love to hear your feedback on our C++ interface too.

Tagging @JohnZed , JFYI.

Matrix-World commented 3 years ago


Thanks! I'd like to know: if I call the libcuml++.so library from Java, do I also need to use libcuml.so, or can the two be used separately? Is librmm.so needed as well? Is there a website where I can download prebuilt libcuml++.so and libcuml.so?

teju85 commented 3 years ago

libcuml.so is a wrapper over libcuml++.so. So, if you want to wrap libcuml++.so from your JNI wrapper, then you do NOT need libcuml.so. For now, I'd say that the easiest way for you to get started with writing this wrapper would be as follows:

1. Install [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/).
2. Then install the cuML package following the instructions from [this page](https://rapids.ai/start.html#get-rapids).
3. Assuming you have conda installed at `/opt/conda` and you are working out of its default environment, you should be able to find the header files at `/opt/conda/include/cuml` and the shared object at `/opt/conda/lib/libcuml++.so` (how those pieces come together at run time is sketched below).
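Once those files are in place, the main thing the Java side needs is to be able to find them at run time. A minimal sketch of that wiring, assuming the /opt/conda layout above and a hypothetical JNI wrapper library libcuml_jni.so built against those headers (the path to the wrapper is a placeholder):

```java
// Hypothetical run-time wiring. The JNI wrapper (libcuml_jni.so, built against the
// headers in /opt/conda/include/cuml) is loaded explicitly; its libcuml++.so dependency
// is resolved by the dynamic linker, so /opt/conda/lib must be on LD_LIBRARY_PATH
// (or baked into the wrapper's rpath).
public class CumlLoader {
    private static boolean loaded = false;

    public static synchronized void ensureLoaded() {
        if (!loaded) {
            System.load("/path/to/libcuml_jni.so"); // absolute path to the JNI wrapper
            loaded = true;
        }
    }
}
```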

Matrix-World commented 3 years ago


Thanks, I get it. I have another question: the kmeans algorithm in cuml/cpp/src/kmeans is implemented in CUDA, right? And the kmeans in cuml/cpp/examples/kmeans is implemented in C++/CUDA, right? Why doesn't kmeans_example use cuml/cpp/src/kmeans directly? If I get libcuml++.so and call kmeans from Java to get GPU acceleration, the corresponding kmeans algorithm should be the one in cuml/cpp/src/kmeans? I don't quite understand the role of the various files in that directory.

github-actions[bot] commented 3 years ago

This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.

teju85 commented 3 years ago

Sorry for the delayed response. (I somehow missed your last message @Matrix-World !)

> The kmeans algorithm in cuml/cpp/src/kmeans is implemented in CUDA, right?

Yes, that's correct

> The kmeans in cuml/cpp/examples/kmeans is implemented in C++/CUDA, right? Why doesn't kmeans_example use cuml/cpp/src/kmeans directly?

This is sample code showing how to use the C++ API exposed by libcuml++.so. In other words, while reading through kmeans_example, assume that this code lives outside of the cuML repo and that all you have access to are the header files in /opt/conda/include/cuml and the shared library at /opt/conda/lib/libcuml++.so. This means you cannot directly use the files under cuml/cpp/src/kmeans. Hope this helps.
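Carrying that over to the JNI discussion above: a Java binding would be in the same position as kmeans_example, going only through the public headers and libcuml++.so and never through the sources under cuml/cpp/src/kmeans. A hypothetical Java-side counterpart (all names and the signature are illustrative, not cuML's actual API):

```java
// Hypothetical Java-side binding for cuML k-means. The native implementation would live
// in a small C++ wrapper that includes the public k-means header from
// /opt/conda/include/cuml and links against /opt/conda/lib/libcuml++.so; it never
// touches the internal sources under cuml/cpp/src/kmeans.
public class KMeansNative {
    static {
        System.loadLibrary("cuml_jni"); // same hypothetical wrapper library as above
    }

    /** Runs k-means on row-major host data and writes nClusters x nCols centroids into centroidsOut. */
    public static native void fit(float[] data, int nRows, int nCols,
                                  int nClusters, int maxIterations, float[] centroidsOut);

    public static void main(String[] args) {
        int nRows = 4, nCols = 2, nClusters = 2;
        float[] data = {1f, 1f, 1.1f, 1.2f, 8f, 8f, 8.2f, 7.9f};
        float[] centroids = new float[nClusters * nCols];
        fit(data, nRows, nCols, nClusters, 300, centroids);
        System.out.println(java.util.Arrays.toString(centroids));
    }
}
```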

github-actions[bot] commented 3 years ago

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

nguyenvietyen commented 1 year ago

Another approach is to build a bridge using the JavaCPP project: https://github.com/bytedeco/javacpp
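A JavaCPP binding is typically driven by a small preset/config class like the sketch below; the header name, include/link paths, link target, and target package here are assumptions about cuML's installed layout, not verified against the project:

```java
// Minimal JavaCPP preset sketch for generating cuML bindings from the public headers.
// The header name, paths, and "cuml++" link target are assumptions; adjust them to
// match what is actually installed under /opt/conda/include/cuml and /opt/conda/lib.
import org.bytedeco.javacpp.annotation.Platform;
import org.bytedeco.javacpp.annotation.Properties;
import org.bytedeco.javacpp.tools.Info;
import org.bytedeco.javacpp.tools.InfoMap;
import org.bytedeco.javacpp.tools.InfoMapper;

@Properties(
    value = @Platform(
        include = {"cuml/cluster/kmeans.hpp"},   // assumed public header path
        includepath = {"/opt/conda/include"},
        link = {"cuml++"},
        linkpath = {"/opt/conda/lib"}),
    target = "com.example.cuml"                  // package for the generated classes
)
public class CumlPresets implements InfoMapper {
    @Override
    public void map(InfoMap infoMap) {
        // Skip internal symbols the parser cannot or should not process.
        infoMap.put(new Info("ML::detail").skip());
    }
}
```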