Support for rtx 3xxx, 4xxx

Kotlin / kotlindl

High-level Deep Learning Framework written in Kotlin and inspired by Keras

Apache License 2.0


Support for rtx 3xxx, 4xxx #537

Open rychuelektryk opened 1 year ago

rychuelektryk commented 1 year ago

Hi,

It seems that, because of the dependency on TensorFlow 1.15, it is impossible to use KotlinDL with GPU support on NVIDIA RTX 3xxx and 4xxx GPUs. From the information I've gathered, this is because those GPUs require CUDA 11, and TensorFlow 1.15 is not compatible with CUDA 11. I also found that NVIDIA has a fork of TensorFlow 1.15 that supports newer CUDA versions, but if I'm correct it's only available for the Python ecosystem.

Is there any way to use KotlinDL with 3xxx and 4xxx GPUs? If not, when are you planning to add such support? These GPUs are great for machine learning, and it's a shame that they cannot be used with KotlinDL.

zaleslaw commented 1 year ago

Hi, we depend on the underlying runtime here and have no plans to adopt new TF versions in the next 6 months, but we will check this for the ONNX runtime and Android devices.

I don't agree that this is a shame; it is an existing limitation.

rychuelektryk commented 1 year ago

> I don't agree that this is a shame; it is an existing limitation.

Don't get me wrong. I highly appreciate your work, and I'm just eager to try KotlinDL with my new GPU. I guess I simply chose the wrong words to express it.

litclimbing commented 1 year ago

I spent about half a day trying to get it to work on my 4xxx GPU, without success. This library looks awesome and I'd love to get it working in some capacity (with TF 2 or some other workaround).

RaphaelTarita commented 1 year ago

I would also like to express the wish for a TensorFlow upgrade. As far as I've understood it, the currently-used TF version will not receive updates and will eventually not be maintained anymore (or isn't maintained anymore already? I don't know). It only makes sense to upgrade to the new TensorFlow Java impl because the entire project builds on it.

I came across this problem quite a while ago already, and to be honest I think it's quite shocking that it still hasn't been addressed or isn't planned in the near future. Without the upgrade, KotlinDL is essentially outdated and no other development really makes sense. When the old TensorFlow goes EOL, what will happen to KotlinDL?

zaleslaw commented 1 year ago

We experimented with migrating to the new TF version, but it would take a lot of time and resources for now, which is why it's not on the roadmap.

On the other hand, TF Java doesn't solve all the problems of the old version, which is why it's frozen indefinitely.

I participated in TF Java development a little, and I know the strengths and weaknesses of this solution.

cromefire commented 1 year ago

This also most likely makes support for AMD's 7000 series more difficult or impossible. Those cards most likely don't run on such an old TF version: it doesn't even seem to run on ROCm 5, and tensorflow-rocm 1.15 seems to have last been updated 2 years ago. (This concerns training, not ONNX inference.)

alchitry commented 7 months ago

I've come across a way to run TensorFlow 1.15.x on 40-series cards using Docker and NGC: https://forums.developer.nvidia.com/t/can-nvidia-tensorflow-1-x-be-used-with-rtx-4090/241211

Would this make it possible to get GPU acceleration with KotlinDL, or does the reliance on old CUDA and other libraries prevent that?