cnuernber opened this issue 7 years ago
GLEW is an interesting parallel here, but I think it solves a harder problem than the one we have, and it does so in an environment (GL extensions) that was designed to be helpful in solving that problem.
I'd advocate at least trying smarter Java stuff before going into a full cross-platform wrapper library mode. Though, I also 100% agree that the problem could be thoroughly solved in that way (at some cost).
It does have to solve the extension problem, but it also has to solve the "which symbols are in the library" problem. For example, there are different symbols available in GL3 than in GL4.
In the sense that it needs to do dynamic symbol resolution at runtime, after loading an indeterminate version of a shared library, it is the same problem.
So, there are a couple of dynamic Java options, but the one that seems most promising is below. I agree that understanding this route may help, and it would avoid the need for a cross-platform build system, especially considering the CUDA bindings are all C interfaces:
https://github.com/jnr/jnr-ffi
On the other hand, it isn't nearly as general: it will have limitations w/r/t the types of headers it understands, and most likely with binding to C++. So in the long run a cross-platform build system would allow a higher-quality ecosystem of bindings, assuming someone wants to build and maintain it.
Sounds right. I know @charlesg3 messed with jnr a bit when doing XGBoost; not sure what the results there were.
jnr does allow dynamic binding to a library and doesn't require any information about headers... as such it sort of keeps the external library outside of your concern (just that it needs to exist on the LD_LIBRARY_PATH of the system)... which is super nice / minimal. The downside is that jnr has a bit of a low-level feel to it and the documentation is a bit sparse.
In case it's relevant for this discussion, there was a talk this year at Conj (2016) that focused pretty heavily on using jnr from Clojure.
This has bothered me for some time and there isn't too much I can do about it but here we go:
The runtime dependency on the cuda libraries is not ideal the way it is structured.
What people have done for many years with OpenGL is bind to the actual shared library dynamically. They then look for the symbols they need in the shared library, and those symbols, along with the version of OpenGL detected (via an API call from the library), dictate the path forward. They dynamically switch rendering paths depending on the feature set available in OpenGL, and often on the specific hardware features available on the card.
Because the binding is dynamic, the program will still start if OpenGL isn't present, and can exit with a nice error message. Also, because the binding is dynamic and they search for specific symbols in the shared library, they can have one wrapper library that binds to several versions of OpenGL and simply exposes the symbols it finds.
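The load-and-probe flow described above can be sketched in C with the POSIX dl* API. This is a sketch, not the actual cortex loader; the candidate soname list in the usage example below is illustrative:

```c
#include <dlfcn.h>
#include <stddef.h>

/* Try a NULL-terminated list of candidate sonames in order and return
 * the first handle that loads, or NULL if none is present.  The caller
 * decides whether a missing library is fatal, so the program can start
 * and print a friendly message instead of dying before main(). */
void *load_any(const char *const names[]) {
    for (int i = 0; names[i] != NULL; i++) {
        void *handle = dlopen(names[i], RTLD_NOW | RTLD_GLOBAL);
        if (handle != NULL)
            return handle;
    }
    return NULL;
}
```

Something like `load_any((const char *const[]){"libcudart.so.8.0", "libcudart.so.7.5", NULL})` would then bind to whichever runtime is installed, and a NULL result means "continue without GPU support" rather than "crash on startup".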
This is the ideal situation. Currently in cortex, for instance, you have to change the project.clj in order to bind to a different version of CUDA, despite the fact that we aren't using any new features from that version; from a dynamic-linking perspective the rebinding is unnecessary. This is incidental complexity that will come back to bite us at some point.
The right answer here is to use an intermediate library that can do dynamic loading across the different platforms and find the symbols. You then set a global pointer to each symbol's address if it is found, and leave it null if it is not (see the GL wrangler, GLEW: http://glew.sourceforge.net/).
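GLEW's pattern, applied to our case, would look roughly like the following. `cudaRuntimeGetVersion` is a real libcudart symbol, but the wrapper itself is a sketch of the technique, not an existing library:

```c
#include <dlfcn.h>
#include <stddef.h>

/* One global function pointer per API entry point, GLEW-style.  After
 * wrangling, a NULL pointer means "this symbol is not in the library
 * that was loaded", and callers branch on that at runtime instead of
 * failing at link time. */
typedef int (*cudaRuntimeGetVersion_fn)(int *version);
cudaRuntimeGetVersion_fn p_cudaRuntimeGetVersion = NULL;

/* Look up one symbol; dlsym returns NULL when the symbol is absent. */
void *resolve(void *handle, const char *name) {
    return dlsym(handle, name);
}

void wrangle_cuda(void *handle) {
    p_cudaRuntimeGetVersion =
        (cudaRuntimeGetVersion_fn)resolve(handle, "cudaRuntimeGetVersion");
}
```

With this in place, one wrapper binary can sit in front of several CUDA versions and expose only the symbols it actually finds, which is exactly what the OpenGL wranglers do.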
Then we at least allow the program to decide if CUDA is a necessary dependency, and furthermore if particular versions of CUDA (and cudnn, npp, cublas) are necessary dependencies. What is stopping me from going there is the need for a proper cross-platform build system where I can build a library for at least Linux, Mac, and Windows. That, and the time required to actually do this.
There may be a solution in the dynamic linking facilities now present in Java, but that path needs to be researched. To do this with javacpp we would need to build a small wrapper library that does the dynamic binding to the shared libraries and their symbols.
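The platform-specific surface of such a wrapper library is actually quite small. A hedged sketch (lib_open/lib_sym are illustrative names, not an existing API):

```c
/* The platform-specific part of a dynamic-loading wrapper is tiny:
 * open a library by name, look up a symbol by name.  Everything else
 * (symbol tables, version probing, the JVM-facing interface) can be
 * portable C layered on top of these two calls. */
#ifdef _WIN32
#include <windows.h>
typedef HMODULE lib_handle;
static lib_handle lib_open(const char *name) { return LoadLibraryA(name); }
static void *lib_sym(lib_handle h, const char *sym) {
    return (void *)GetProcAddress(h, sym);
}
#else
#include <dlfcn.h>
typedef void *lib_handle;
static lib_handle lib_open(const char *name) { return dlopen(name, RTLD_NOW); }
static void *lib_sym(lib_handle h, const char *sym) { return dlsym(h, sym); }
#endif
```

The build-system cost the text mentions is real, but it is the cost of compiling this one small file per platform, not of porting the whole binding layer.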
In any case, a best-in-class CUDA development system would not have this issue. I suspect the same type of issue would be present should we decide to put effort into OpenCL.