Hi, I encountered similar headaches and some of the packages mentioned here do not work anymore. Here is a script I used, which worked as of 2023/07/17:
#!/bin/bash
# Load required software modules
module load python/3.10.9-fasrc01 cuda/11.8.0-fasrc01 cudnn/8.9.2.26_cuda11-fasrc01
# Create a new conda environment with Python and some additional packages
mamba create -n gpuenv python=3.10 pip numpy scipy pandas matplotlib seaborn h5py jupyter jupyterlab joblib tqdm statsmodels
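# Activate the new conda environment
# (if this runs as a non-interactive script, conda's shell hook may need to be
# loaded first, e.g. with: eval "$(conda shell.bash hook)")
conda activate gpuenv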
# Configure the system paths
mkdir -p "$CONDA_PREFIX/etc/conda/activate.d"
echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> "$CONDA_PREFIX/etc/conda/activate.d/env_vars.sh"
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib' >> "$CONDA_PREFIX/etc/conda/activate.d/env_vars.sh"
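# Re-activate the new conda environment so that env_vars.sh is sourced
# (nvidia.cudnn may not be importable until the pip installs below have run,
# so CUDNN_PATH may only resolve on a later activation)
conda activate gpuenv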
# Install TensorFlow
pip install --upgrade tensorflow==2.13
# Install JAX
pip install --upgrade "jax[cuda11_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
# Test TensorFlow
python -c "import tensorflow as tf; print('TensorFlow version:', tf.__version__); print('Devices:', tf.config.list_physical_devices('GPU'))"
# Test JAX
python -c "import jax; print('JAX version:', jax.__version__); print('Devices:', jax.devices())"
Good luck!
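A side note on the `env_vars.sh` lines above: they append to `LD_LIBRARY_PATH` on every activation, so activating the environment repeatedly in the same shell can add the same directories more than once. A minimal sketch of a guard, assuming a helper named `append_ld_path` (a hypothetical name, not part of the original script) and placeholder directories:

```shell
#!/bin/bash
# Hypothetical helper (not from the original script): append a directory to
# LD_LIBRARY_PATH only if it is not already listed.
append_ld_path() {
    local dir="$1"
    # Ignore empty arguments, e.g. when CUDNN_PATH could not be resolved.
    if [ -z "$dir" ]; then
        return 0
    fi
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*)
            # Already present, nothing to do.
            ;;
        *)
            # Add a separating colon only when the variable is non-empty.
            LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir"
            ;;
    esac
    export LD_LIBRARY_PATH
}

# Demo in a subshell with placeholder paths, so the caller's environment
# is left untouched.
(
    LD_LIBRARY_PATH=""
    append_ld_path "/opt/example/lib"
    append_ld_path "/opt/example/cudnn/lib"
    append_ld_path "/opt/example/lib"   # duplicate, ignored
    echo "$LD_LIBRARY_PATH"
)
```

If something like this were used, the function definition would go into `env_vars.sh` and be called with `$CONDA_PREFIX/lib` and `$CUDNN_PATH/lib` in place of the placeholder paths.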
I see. After the cluster update, I think some of the packages are no longer available.
Thanks a lot for sharing the process that worked for you ❤️, I'll update the post with a link to this comment!
Thank you for this super useful explainer! 🙏
How to setup a JAX/Tensorflow 1.15 environment in the FASRC Cluster
https://ramith.fyi/how-to-setup-a-tensorflow-1-15-environment-in-the-fasrc-cluster/