bioimage-io / JDLL

The Java library to run Deep Learning models
https://github.com/bioimage-io/JDLL/wiki
Apache License 2.0

Backend of the model-runner tensors #1

Closed carlosuc3m closed 1 year ago

carlosuc3m commented 2 years ago

Hello everyone, in this issue I want to propose the Nd4j library as the new backend for the model runner tensors, at least temporarily. As we discussed previously, for convenience the backend was going to be the NDArrays from DJL. However, I found that these NDArrays need an underlying native library (or 'engine', as they call it) to work. Currently only 2 engines are capable of supporting NDArrays.

This limitation might cause conflicts because the native library for those two engines will always have to be loaded to use these NDArrays. This is why I would recommend using another backend. For now I have continued developing the library on another branch using a different library, Nd4j, as the backend.

This library works in a similar manner to DJL. It again uses JavaCPP as the backend to load C++ native libraries (OpenBLAS, for example), and its arrays (called INDArrays) have predefined operations such as mean and allow accessing positions via indexing, similar to NumPy arrays.
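
To make the kind of operations meant here concrete (index-based access and a built-in mean over an n-dimensional array), here is a minimal pure-Java sketch. It is not the Nd4j or DJL API: the class `FlatTensor` and its methods are hypothetical names chosen for illustration, and the sketch assumes a row-major layout over a plain `double[]`.

```java
import java.util.Arrays;

// Hypothetical illustration only: this is NOT the Nd4j or DJL API.
// An n-dimensional array stored in a flat, row-major double[].
final class FlatTensor {
    private final int[] shape;
    private final int[] strides;
    private final double[] data;

    FlatTensor(int... shape) {
        this.shape = shape.clone();
        this.strides = new int[shape.length];
        int size = 1;
        // Row-major strides: the last dimension varies fastest.
        for (int d = shape.length - 1; d >= 0; d--) {
            if (shape[d] <= 0) {
                throw new IllegalArgumentException("Dimensions must be positive");
            }
            strides[d] = size;
            size *= shape[d];
        }
        this.data = new double[size];
    }

    // Translate an n-dimensional index into a position in the flat array.
    private int offset(int... index) {
        if (index.length != shape.length) {
            throw new IllegalArgumentException("Expected " + shape.length + " indices");
        }
        int off = 0;
        for (int d = 0; d < index.length; d++) {
            if (index[d] < 0 || index[d] >= shape[d]) {
                throw new IndexOutOfBoundsException("Index out of range in dimension " + d);
            }
            off += index[d] * strides[d];
        }
        return off;
    }

    double get(int... index) {
        return data[offset(index)];
    }

    void set(double value, int... index) {
        data[offset(index)] = value;
    }

    // Mean over all elements, comparable in spirit to a NumPy-style mean().
    double mean() {
        return Arrays.stream(data).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        FlatTensor t = new FlatTensor(2, 3);
        // Fill the 2x3 array with the values 1..6.
        for (int r = 0; r < 2; r++) {
            for (int c = 0; c < 3; c++) {
                t.set(r * 3 + c + 1, r, c);
            }
        }
        System.out.println(t.get(1, 2)); // element at row 1, column 2
        System.out.println(t.mean());    // mean of all six elements
    }
}
```

Because the data in this sketch lives in an ordinary Java array, the garbage collector reclaims it. Libraries that keep array data in native memory need explicit care on that point instead.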

On the other hand, the memory management of these INDArrays is not the best, and we should be quite careful with it.

If you agree with this, I can merge the branch into the main one, and we can keep this solution at least temporarily. I think that at some point we should also move away from this library, both because it is quite heavy and because of its memory management, but I think that it is the fastest and simplest transition at the moment.

I also looked at a JNI for NumPy, which seems quite nice (also using JavaCPP), but it is almost like writing C++ in Java. It is not simple at all. Regards, Carlos

@Stephane-D @tinevez @tomburke-rse @petebankhead @KateMoreva @xion16lm

constantinpape commented 2 years ago

This sounds good from my side; my only concern is whether this would currently cause any issues for @KateMoreva, who is building on the code here and is wrapping up her thesis on this, so it would be nice to avoid any major dependency changes that cause extra work for her right now.

tomburke-rse commented 2 years ago

I'm not sure if I like a memory-inefficient/bad library for a memory-heavy task like this in Java, BUT I haven't truly checked it out yet, so it's just a subjective opinion at this point. I'll look deeper into it in 2 weeks and give an updated, constructive reply then.

carlosuc3m commented 2 years ago

Hello again everyone,

Even though I am going to explain it in more detail in the meeting on the 28th, I just wanted to let you know that I decided to change the backend of the tensors to ImgLib2, because managing the dependencies of the numerous OS-dependent native libraries was too complex. I honestly should have looked at this before, because it works great and there are no problems with the memory management, as everything is in Java.

The code is under the "imglib2" branch, and unless any of you objects to ImgLib2, I will continue working with it. Regards, Carlos

@Stephane-D @tinevez @tomburke-rse @petebankhead @KateMoreva @xion16lm

constantinpape commented 2 years ago

Hi Carlos, from our end this is great since all our functionality is based on imglib2 already, so this will make integration of your backend much easier.

carlosuc3m commented 1 year ago

Tensors are based on ImgLib2