Open jgman86 opened 3 months ago
Thanks for reaching out! If I remember correctly, I have encountered those warnings/errors, but could use the GPU anyway. The last line says it is able to create a GPU device. What is the output of tf.config.list_physical_devices('GPU')? Does a GPU show up there, or is it an empty list?
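For reference, the check asked for above could be run as a small script. This is only a sketch: the helper name gpu_report is made up for illustration, and it assumes nothing beyond the public calls tf.config.list_physical_devices and tf.test.is_built_with_cuda. It does not tell you whether cuDNN, cuFFT, or cuBLAS were registered without errors, only whether TensorFlow sees a GPU at all.

```python
import importlib.util


def gpu_report():
    """Return a small dict describing TensorFlow's view of the GPUs.

    gpu_report is a hypothetical helper name, not part of TensorFlow or BayesFlow.
    """
    # Avoid an ImportError when TensorFlow is not installed in this environment.
    if importlib.util.find_spec("tensorflow") is None:
        return {"tensorflow": None, "cuda_build": None, "gpus": []}

    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    return {
        "tensorflow": tf.__version__,
        # True if this TensorFlow binary was compiled with CUDA support.
        "cuda_build": tf.test.is_built_with_cuda(),
        # An empty list means TensorFlow cannot see any GPU.
        "gpus": [gpu.name for gpu in gpus],
    }


if __name__ == "__main__":
    print(gpu_report())
```

Running it once before and once after installing BayesFlow makes it easy to see whether the TensorFlow version or the visible devices changed.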
Regarding TensorFlow 2.16: We are currently adapting BF to Keras 3, which will also enable using the newest TensorFlow version. As this is a bigger change, it will need a bit more time. You can track the progress in PR #159.
Hey Valentin, yes, it finds the GPU. But from my searches, the errors at the beginning indicate that the libs are not used, which results in slower training. It is also suspicious that those errors are absent before the BayesFlow installation! Yes, I already track the streamline-backend branch and can't wait to finally use torch as a backend ;-)
Ahh, ok. Is the error absent when you install TF 2.15.0 without BayesFlow? Or is there a version change with the BF installation?
As this seems to be a common problem according to this issue, we probably have to find out which TF-related dependency changes and introduces the problem, though I don't know whether we can resolve this. Could you try the following?
# Before installing BayesFlow:
conda list --explicit > env-pre-bf.txt
# After installing BayesFlow:
conda list --explicit > env-post-bf.txt
# Compare the two snapshots:
diff env-pre-bf.txt env-post-bf.txt
The resulting diff may tell us something about the changes BF introduces to the TensorFlow installation, which might give us a lead on what we need to fix.
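If BayesFlow is installed with pip into the conda environment, a Python-level comparison may also be useful, since it sees every distribution visible to the interpreter. The following is a minimal sketch using only the standard library's importlib.metadata; the function names snapshot and diff_snapshots are made up for illustration.

```python
from importlib import metadata


def snapshot():
    """Map each installed distribution name (lowercased) to its version."""
    versions = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip entries with broken or missing metadata
            versions[name.lower()] = dist.version
    return versions


def diff_snapshots(before, after):
    """Return {name: (old_version, new_version)} for every change.

    None marks a package that is absent on that side.
    """
    changes = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


if __name__ == "__main__":
    # Print the current environment; save the output before and after
    # installing BayesFlow, then compare the two files.
    for name, version in sorted(snapshot().items()):
        print(f"{name}=={version}")
```

To compare within one session, call snapshot() before the installation, keep the result (for example by dumping it to JSON), call snapshot() again afterwards, and pass both dicts to diff_snapshots.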
I completely understand it is challenging, but it would be great if the new BF release could support tensorflow-cpu 2.16 too.
Absolutely. The new release will support all recent tensorflow, pytorch, and jax versions.
Hey guys,
we are currently working (or at least trying to work) with BayesFlow on our HPC cluster, which uses virtualization. While setting up the images, we noticed that an existing TensorFlow installation, which loads all CUDA modules as intended, is overwritten by a subsequent installation of TensorFlow when installing BF. This seems intended if the version dependency for TensorFlow is not met. However, something seems to happen during the installation that breaks the CUDA libraries, even if a vanilla version of TensorFlow that meets the version dependency was previously installed:
I have the same problem on Ubuntu 22.04 LTS, where I can't use BayesFlow with the cuDNN, cuFFT, and cuBLAS libs. I thought this was due to the buggy nature of TF, but now I think it is related to the version dependencies. Is it possible to include the latest TF version, 2.16.1, in the BF dependencies? Or is it not yet compatible? Additionally, there is now a tensorflow[and-cuda] package available via pip, which includes all necessary CUDA libs. It would be great to include such versions in the BF dependencies if possible, to streamline the installation process. In our case, usage on a cluster is tricky, as we had to find by trial and error which BF and TF versions actually work with the provided infrastructure!
Has anybody experienced the same issues?