spoonsso / dannce

MIT License

Support for CUDA 11 #101

Closed xmikezheng20 closed 2 years ago

xmikezheng20 commented 2 years ago

Hi,

We're trying to run dannce on a new RTX A4000 GPU, which works with CUDA 11. We're having some issues getting the card to work with CUDA 10.1, the version specified for dannce v1.2.0. But updating CUDA to 11 by itself causes other installation issues with cudnn, tensorflow, and pytorch.

I emailed Diego and he said these dependency issues will be solved in dannce v1.3 soon, and he suggested I open an issue here.

Thank you!

diegoaldarondo commented 2 years ago

Hey!

I will reply here again when I know the release date for the next dannce version.

The release_development branch should support everything you need in the meantime. We run this branch on many different gpus, including A-series, so it should work fine. From the dannce repository, access it with git fetch; git checkout release_development.

I would do a clean install of your dannce environment. You might also need to update your conda.

xmikezheng20 commented 2 years ago

I did a clean install of the release_development branch, then fine-tuned the 3-cam AVG model using 424 labeled frames. com-train, com-predict, dannce-train, and dannce-predict all ran to completion. However, while CUDA is being set up at the start of each command, the two messages below are printed many times. Everything ran fine, but I'm wondering whether these are known issues with dependency versions and whether they will affect performance.

2022-04-01 10:10:22.218621: E tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2022-04-01 10:10:22.343304: W tensorflow/stream_executor/gpu/asm_compiler.cc:235] Your CUDA software stack is old. We fallback to the NVIDIA driver for some compilation. Update your CUDA version to get the best performance. The ptxas error was: ptxas fatal   : Value 'sm_86' is not defined for option 'gpu-name'

Below are the packages in my conda environment:

# packages in environment at /home/xizheng/.conda/envs/dannce:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
absl-py                   0.15.0                   pypi_0    pypi
aom                       3.3.0                h27087fc_1    conda-forge
astunparse                1.6.3                    pypi_0    pypi
attr                      0.3.1                    pypi_0    pypi
attrs                     21.4.0                   pypi_0    pypi
blas                      1.0                         mkl
bzip2                     1.0.8                h7f98852_4    conda-forge
ca-certificates           2022.3.18            h06a4308_0
cachetools                5.0.0                    pypi_0    pypi
certifi                   2021.10.8                pypi_0    pypi
charset-normalizer        2.0.12                   pypi_0    pypi
cudatoolkit               11.1.74              h6bb024c_0    nvidia
cudnn                     8.1.0.77             h90431f1_0    conda-forge
cycler                    0.11.0                   pypi_0    pypi
dannce                    1.3.0.post0               dev_0    <develop>
dill                      0.3.4                    pypi_0    pypi
ffmpeg                    5.0.0                h594f047_1    conda-forge
flatbuffers               1.12                     pypi_0    pypi
fonttools                 4.31.2                   pypi_0    pypi
freetype                  2.10.4               h0708190_1    conda-forge
gast                      0.3.3                    pypi_0    pypi
gmp                       6.2.1                h58526e2_0    conda-forge
gnutls                    3.6.13               h85f3911_1    conda-forge
google-auth               2.6.2                    pypi_0    pypi
google-auth-oauthlib      0.4.6                    pypi_0    pypi
google-pasta              0.2.0                    pypi_0    pypi
grpcio                    1.32.0                   pypi_0    pypi
h5py                      2.10.0                   pypi_0    pypi
icu                       69.1                 h9c3ff4c_0    conda-forge
idna                      3.3                      pypi_0    pypi
imageio                   2.8.0                    pypi_0    pypi
imageio-ffmpeg            0.4.5                    pypi_0    pypi
importlib-metadata        4.11.3                   pypi_0    pypi
intel-openmp              2022.0.1          h06a4308_3633
keras-preprocessing       1.1.2                    pypi_0    pypi
kiwisolver                1.4.0                    pypi_0    pypi
lame                      3.100             h7f98852_1001    conda-forge
ld_impl_linux-64          2.36.1               hea4e1c9_2    conda-forge
libdrm                    2.4.109              h7f98852_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 11.2.0              h1d223b6_14    conda-forge
libgomp                   11.2.0              h1d223b6_14    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libpciaccess              0.16                 h516909a_0    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libstdcxx-ng              11.2.0              he4da1e4_14    conda-forge
libuv                     1.40.0               h7b6447c_0
libva                     2.14.0               h7f98852_0    conda-forge
libvpx                    1.11.0               h9c3ff4c_3    conda-forge
libxcb                    1.13              h7f98852_1004    conda-forge
libxml2                   2.9.12               h885dcf4_1    conda-forge
libzlib                   1.2.11            h36c2ea0_1013    conda-forge
markdown                  3.3.6                    pypi_0    pypi
matplotlib                3.5.1                    pypi_0    pypi
mkl                       2022.0.1           h06a4308_117
multiprocess              0.70.12.2                pypi_0    pypi
ncurses                   6.3                  h9c3ff4c_0    conda-forge
nettle                    3.6                  he412f7d_0    conda-forge
networkx                  2.6.3                    pypi_0    pypi
ninja                     1.10.2           py37hd09550d_3
numpy                     1.19.5                   pypi_0    pypi
oauthlib                  3.2.0                    pypi_0    pypi
opencv-python             4.5.5.64                 pypi_0    pypi
openh264                  2.1.1                h780b84a_0    conda-forge
openssl                   3.0.2                h166bdaf_1    conda-forge
opt-einsum                3.3.0                    pypi_0    pypi
packaging                 21.3                     pypi_0    pypi
pillow                    9.0.1                    pypi_0    pypi
pip                       22.0.4             pyhd8ed1ab_0    conda-forge
protobuf                  3.19.4                   pypi_0    pypi
psutil                    5.9.0                    pypi_0    pypi
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
pyasn1                    0.4.8                    pypi_0    pypi
pyasn1-modules            0.2.8                    pypi_0    pypi
pyparsing                 3.0.7                    pypi_0    pypi
python                    3.7.12          hf930737_100_cpython    conda-forge
python-dateutil           2.8.2                    pypi_0    pypi
python_abi                3.7                     2_cp37m    conda-forge
pytorch                   1.9.1           py3.7_cuda11.1_cudnn8.0.5_0    pytorch
pywavelets                1.3.0                    pypi_0    pypi
pyyaml                    6.0                      pypi_0    pypi
readline                  8.1                  h46c0cb4_0    conda-forge
requests                  2.27.1                   pypi_0    pypi
requests-oauthlib         1.3.1                    pypi_0    pypi
rsa                       4.8                      pypi_0    pypi
scikit-image              0.19.2                   pypi_0    pypi
scipy                     1.7.3                    pypi_0    pypi
setuptools                61.0.0                   pypi_0    pypi
six                       1.15.0                   pypi_0    pypi
sqlite                    3.37.1               h4ff8645_0    conda-forge
svt-av1                   0.9.1                h27087fc_0    conda-forge
tensorboard               2.8.0                    pypi_0    pypi
tensorboard-data-server   0.6.1                    pypi_0    pypi
tensorboard-plugin-wit    1.8.1                    pypi_0    pypi
tensorflow                2.4.0                    pypi_0    pypi
tensorflow-estimator      2.4.0                    pypi_0    pypi
termcolor                 1.1.0                    pypi_0    pypi
tifffile                  2021.11.2                pypi_0    pypi
tk                        8.6.12               h27826a3_0    conda-forge
typing-extensions         3.7.4.3                  pypi_0    pypi
urllib3                   1.26.9                   pypi_0    pypi
werkzeug                  2.0.3                    pypi_0    pypi
wheel                     0.37.1             pyhd8ed1ab_0    conda-forge
wrapt                     1.12.1                   pypi_0    pypi
x264                      1!161.3030           h7f98852_1    conda-forge
x265                      3.5                  h924138e_3    conda-forge
xorg-fixesproto           5.0               h7f98852_1002    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libx11               1.7.2                h7f98852_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h7f98852_1    conda-forge
xorg-libxfixes            5.0.3             h7f98852_1004    conda-forge
xorg-xextproto            7.3.0             h7f98852_1002    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zipp                      3.7.0                    pypi_0    pypi
zlib                      1.2.11            h36c2ea0_1013    conda-forge
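
For readers comparing their own environment against the listing above, here is a minimal sketch that pulls only the GPU-related rows out of `conda list` text. The helper name `gpu_stack_versions` and the choice of package names are assumptions made for illustration; neither is part of dannce.

```python
# Hypothetical helper, not part of dannce: filter `conda list` text
# down to the packages that make up the GPU software stack.
GPU_PACKAGES = {"cudatoolkit", "cudnn", "tensorflow", "pytorch"}


def gpu_stack_versions(conda_list_text, names=GPU_PACKAGES):
    """Return {package: version} for the named packages in `conda list` output."""
    versions = {}
    for line in conda_list_text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#'-prefixed header rows.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Column 0 is the package name, column 1 is its version.
        if len(fields) >= 2 and fields[0] in names:
            versions[fields[0]] = fields[1]
    return versions


# A few rows copied from the listing above.
sample = """\
# Name                    Version                   Build  Channel
cudatoolkit               11.1.74              h6bb024c_0    nvidia
cudnn                     8.1.0.77             h90431f1_0    conda-forge
numpy                     1.19.5                   pypi_0    pypi
pytorch                   1.9.1           py3.7_cuda11.1_cudnn8.0.5_0    pytorch
tensorflow                2.4.0                    pypi_0    pypi
"""

print(gpu_stack_versions(sample))
```

In this environment the conda packages are at CUDA 11.1 and cuDNN 8.1, while tensorflow 2.4.0 comes from pip. As far as I know, the official TensorFlow 2.4 builds were tested against CUDA 11.0 and cuDNN 8.0, so a mismatch there is one thing worth ruling out.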

spoonsso commented 2 years ago

I have never seen this fallback error before. It would only affect the speed of training/prediction. How many seconds did it take to run 10 batches in dannce-predict, and how large was your batch size? This time should be printed out while running dannce-predict.

xmikezheng20 commented 2 years ago

Currently it takes ~8 seconds to run 10 batches. My batch size is 4.

spoonsso commented 2 years ago

What GPU are you using? On a Titan V it takes us ~4 seconds for the same.

xmikezheng20 commented 2 years ago

We're using an RTX A4000, which is why I wanted to try this branch with newer CUDA support. I have 3 camera views, and I used nvox=80, as suggested by Diego. It would be great if the speed could improve a bit, but it's fine as it stands. I just wanted to raise this issue in case you've encountered it before.

diegoaldarondo commented 2 years ago

Do you have another, older CUDA installed on your system?

xmikezheng20 commented 2 years ago

At the system level, we installed CUDA 11.4 for running MATLAB. Apart from that, I previously made some other conda environments with older CUDA versions. Do you think they might interfere with the CUDA 11.1 in the dannce conda environment?
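
One way to check for this kind of interference is to list every `ptxas` binary on the search path and see which CUDA release each one reports. The sketch below does that with the standard library only. The function names are hypothetical. It rests on two assumptions that this thread does not confirm: that TensorFlow may pick up a `ptxas` from outside the conda environment, and that `ptxas --version` prints a line containing `release X.Y`. Support for `sm_86` was added in CUDA 11.1, so an older `ptxas` found first could plausibly produce the warning quoted earlier.

```python
import os
import re
import shutil
import subprocess


def parse_release(version_text):
    """Extract (major, minor) from text containing 'release X.Y', else None."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def find_ptxas(path=None):
    """Return each distinct ptxas executable on the search path, in lookup order."""
    search_path = os.environ.get("PATH", "") if path is None else path
    found = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        # Restrict the lookup to one directory at a time to preserve ordering.
        candidate = shutil.which("ptxas", path=directory)
        if candidate and candidate not in found:
            found.append(candidate)
    return found


if __name__ == "__main__":
    for exe in find_ptxas():
        result = subprocess.run([exe, "--version"], capture_output=True, text=True)
        # A release below (11, 1) would not know about sm_86.
        print(exe, parse_release(result.stdout))
```

Running this inside the activated dannce environment shows the order in which the binaries would be found. If nothing is printed, no `ptxas` is on the path at all.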

spoonsso commented 2 years ago

Right, my 4 second benchmark was nvox=64, so your speed actually doesn't surprise me. @diego what kind of speed do you normally see for nvox=80?
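
A rough check of that comparison, as a sketch. It assumes that the time per batch scales linearly with the number of voxels, nvox**3. The thread does not state this, and it ignores the difference between a Titan V and an RTX A4000. The function name is hypothetical.

```python
# Assumption (not stated in the thread): time per batch scales linearly
# with the voxel-grid volume, nvox**3. GPU differences are ignored.
def scaled_seconds(ref_seconds, ref_nvox, new_nvox):
    """Scale a reference timing by the ratio of voxel-grid volumes."""
    return ref_seconds * (new_nvox / ref_nvox) ** 3


# Volume ratio between an 80-voxel and a 64-voxel grid.
ratio = (80 / 64) ** 3

# The ~4 s reported for 10 batches at nvox=64, rescaled to nvox=80.
estimate = scaled_seconds(4.0, ref_nvox=64, new_nvox=80)

print(f"volume ratio: {ratio:.3f}")
print(f"estimated seconds for 10 batches at nvox=80: {estimate:.1f}")
```

Under that assumption the 4 second benchmark at nvox=64 corresponds to roughly 7.8 seconds at nvox=80, which is in line with the ~8 seconds reported above.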
