Closed xmikezheng20 closed 2 years ago
Hey!
I will reply here again when I know the release date for the next dannce version.
The release_development branch should support everything you need in the meantime. We run this branch on many different GPUs, including A-series, so it should work fine. From the dannce repository, access it with git fetch; git checkout release_development.
I would do a clean install of your dannce environment. You might also need to update your conda.
I did a clean install of the release_development branch. I then fine-tuned the 3-cam AVG model using 424 labeled frames. com-train, com-predict, dannce-train, and dannce-predict all ran through. However, while CUDA is being set up at the start of each of these commands, the two messages below are printed many times. The programs ran through just fine, but I’m wondering whether these messages are known issues with dependency versions, and whether they will affect performance.
2022-04-01 10:10:22.218621: E tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2022-04-01 10:10:22.343304: W tensorflow/stream_executor/gpu/asm_compiler.cc:235] Your CUDA software stack is old. We fallback to the NVIDIA driver for some compilation. Update your CUDA version to get the best performance. The ptxas error was: ptxas fatal : Value 'sm_86' is not defined for option 'gpu-name'
Below are the packages in my conda environment:
# packages in environment at /home/xizheng/.conda/envs/dannce:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
absl-py 0.15.0 pypi_0 pypi
aom 3.3.0 h27087fc_1 conda-forge
astunparse 1.6.3 pypi_0 pypi
attr 0.3.1 pypi_0 pypi
attrs 21.4.0 pypi_0 pypi
blas 1.0 mkl
bzip2 1.0.8 h7f98852_4 conda-forge
ca-certificates 2022.3.18 h06a4308_0
cachetools 5.0.0 pypi_0 pypi
certifi 2021.10.8 pypi_0 pypi
charset-normalizer 2.0.12 pypi_0 pypi
cudatoolkit 11.1.74 h6bb024c_0 nvidia
cudnn 8.1.0.77 h90431f1_0 conda-forge
cycler 0.11.0 pypi_0 pypi
dannce 1.3.0.post0 dev_0 <develop>
dill 0.3.4 pypi_0 pypi
ffmpeg 5.0.0 h594f047_1 conda-forge
flatbuffers 1.12 pypi_0 pypi
fonttools 4.31.2 pypi_0 pypi
freetype 2.10.4 h0708190_1 conda-forge
gast 0.3.3 pypi_0 pypi
gmp 6.2.1 h58526e2_0 conda-forge
gnutls 3.6.13 h85f3911_1 conda-forge
google-auth 2.6.2 pypi_0 pypi
google-auth-oauthlib 0.4.6 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
grpcio 1.32.0 pypi_0 pypi
h5py 2.10.0 pypi_0 pypi
icu 69.1 h9c3ff4c_0 conda-forge
idna 3.3 pypi_0 pypi
imageio 2.8.0 pypi_0 pypi
imageio-ffmpeg 0.4.5 pypi_0 pypi
importlib-metadata 4.11.3 pypi_0 pypi
intel-openmp 2022.0.1 h06a4308_3633
keras-preprocessing 1.1.2 pypi_0 pypi
kiwisolver 1.4.0 pypi_0 pypi
lame 3.100 h7f98852_1001 conda-forge
ld_impl_linux-64 2.36.1 hea4e1c9_2 conda-forge
libdrm 2.4.109 h7f98852_0 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 11.2.0 h1d223b6_14 conda-forge
libgomp 11.2.0 h1d223b6_14 conda-forge
libiconv 1.16 h516909a_0 conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libpciaccess 0.16 h516909a_0 conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
libstdcxx-ng 11.2.0 he4da1e4_14 conda-forge
libuv 1.40.0 h7b6447c_0
libva 2.14.0 h7f98852_0 conda-forge
libvpx 1.11.0 h9c3ff4c_3 conda-forge
libxcb 1.13 h7f98852_1004 conda-forge
libxml2 2.9.12 h885dcf4_1 conda-forge
libzlib 1.2.11 h36c2ea0_1013 conda-forge
markdown 3.3.6 pypi_0 pypi
matplotlib 3.5.1 pypi_0 pypi
mkl 2022.0.1 h06a4308_117
multiprocess 0.70.12.2 pypi_0 pypi
ncurses 6.3 h9c3ff4c_0 conda-forge
nettle 3.6 he412f7d_0 conda-forge
networkx 2.6.3 pypi_0 pypi
ninja 1.10.2 py37hd09550d_3
numpy 1.19.5 pypi_0 pypi
oauthlib 3.2.0 pypi_0 pypi
opencv-python 4.5.5.64 pypi_0 pypi
openh264 2.1.1 h780b84a_0 conda-forge
openssl 3.0.2 h166bdaf_1 conda-forge
opt-einsum 3.3.0 pypi_0 pypi
packaging 21.3 pypi_0 pypi
pillow 9.0.1 pypi_0 pypi
pip 22.0.4 pyhd8ed1ab_0 conda-forge
protobuf 3.19.4 pypi_0 pypi
psutil 5.9.0 pypi_0 pypi
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pyparsing 3.0.7 pypi_0 pypi
python 3.7.12 hf930737_100_cpython conda-forge
python-dateutil 2.8.2 pypi_0 pypi
python_abi 3.7 2_cp37m conda-forge
pytorch 1.9.1 py3.7_cuda11.1_cudnn8.0.5_0 pytorch
pywavelets 1.3.0 pypi_0 pypi
pyyaml 6.0 pypi_0 pypi
readline 8.1 h46c0cb4_0 conda-forge
requests 2.27.1 pypi_0 pypi
requests-oauthlib 1.3.1 pypi_0 pypi
rsa 4.8 pypi_0 pypi
scikit-image 0.19.2 pypi_0 pypi
scipy 1.7.3 pypi_0 pypi
setuptools 61.0.0 pypi_0 pypi
six 1.15.0 pypi_0 pypi
sqlite 3.37.1 h4ff8645_0 conda-forge
svt-av1 0.9.1 h27087fc_0 conda-forge
tensorboard 2.8.0 pypi_0 pypi
tensorboard-data-server 0.6.1 pypi_0 pypi
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
tensorflow 2.4.0 pypi_0 pypi
tensorflow-estimator 2.4.0 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
tifffile 2021.11.2 pypi_0 pypi
tk 8.6.12 h27826a3_0 conda-forge
typing-extensions 3.7.4.3 pypi_0 pypi
urllib3 1.26.9 pypi_0 pypi
werkzeug 2.0.3 pypi_0 pypi
wheel 0.37.1 pyhd8ed1ab_0 conda-forge
wrapt 1.12.1 pypi_0 pypi
x264 1!161.3030 h7f98852_1 conda-forge
x265 3.5 h924138e_3 conda-forge
xorg-fixesproto 5.0 h7f98852_1002 conda-forge
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libx11 1.7.2 h7f98852_0 conda-forge
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h7f98852_1 conda-forge
xorg-libxfixes 5.0.3 h7f98852_1004 conda-forge
xorg-xextproto 7.3.0 h7f98852_1002 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.5 h516909a_1 conda-forge
zipp 3.7.0 pypi_0 pypi
zlib 1.2.11 h36c2ea0_1013 conda-forge
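Since the same two messages repeat many times, it can help to reduce a captured log to one line per distinct source and to pull out the architecture name that ptxas rejected. The following is a minimal sketch using only the Python standard library. The regular expressions are assumptions based on the two lines quoted above, not a general parser for TensorFlow logs, and the names summarize, LINE_RE, and ARCH_RE are hypothetical helpers, not part of dannce or TensorFlow.

```python
import re
from collections import Counter

# The two log lines quoted in the comment above, copied verbatim.
SAMPLE_LOG = """\
2022-04-01 10:10:22.218621: E tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2022-04-01 10:10:22.343304: W tensorflow/stream_executor/gpu/asm_compiler.cc:235] Your CUDA software stack is old. We fallback to the NVIDIA driver for some compilation. Update your CUDA version to get the best performance. The ptxas error was: ptxas fatal : Value 'sm_86' is not defined for option 'gpu-name'
"""

# Assumed layout: "<date> <time>: <LEVEL> <source file>:<line>] <message>"
LINE_RE = re.compile(
    r"^\S+ \S+: (?P<level>[A-Z]) (?P<source>\S+):(?P<lineno>\d+)\] (?P<msg>.*)$"
)
# Matches the architecture name inside the ptxas error text.
ARCH_RE = re.compile(r"Value '(sm_\d+)' is not defined")


def summarize(log_text):
    """Count messages per (level, source file) and collect rejected GPU architectures."""
    counts = Counter()
    rejected_archs = set()
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        counts[(match.group("level"), match.group("source"))] += 1
        rejected_archs.update(ARCH_RE.findall(match.group("msg")))
    return counts, rejected_archs


if __name__ == "__main__":
    counts, rejected_archs = summarize(SAMPLE_LOG)
    for (level, source), n in counts.most_common():
        print(level, source, n)
    print("architectures rejected by ptxas:", sorted(rejected_archs))
```

To use it on a real run, pass the captured stderr text to summarize instead of SAMPLE_LOG. A rejected sm_86 indicates that the ptxas being invoked predates support for that architecture, which is useful to know when checking which CUDA installation is being picked up.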
I have never seen this fallback error before. It would only affect the speed of training/prediction. How many seconds did it take to run 10 batches in dannce-predict, and how large was your batch size? This time should be printed out while running dannce-predict.
Currently it takes ~8 seconds to run 10 batches. My batch size is 4.
What GPU are you using? On a Titan V it takes us ~4 seconds for the same.
We're using an RTX A4000, which is why I wanted to try this branch with newer CUDA support. I have 3 camera views, and I used nvox=80, as suggested by Diego. It would be great if the speed could improve a bit, but it's fine as it stands. I just wanted to raise this issue in case you've encountered it before.
Do you have another, older CUDA version installed on your system?
At the system level, we installed CUDA 11.4 for running MATLAB. I also previously made some other conda environments with older CUDA versions. Do you think they might interfere with the CUDA 11.1 in the dannce conda environment?
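One way to check for interference between several CUDA installations is to look at what the active shell actually exposes: which PATH and LD_LIBRARY_PATH entries mention CUDA, and which ptxas binary would be found first. The sketch below uses only the Python standard library. The function name cuda_path_report is hypothetical, and filtering on the substring "cuda" is a simplifying assumption that will miss installations in directories named differently.

```python
import os
import shutil


def cuda_path_report(environ=None):
    """Summarize CUDA-related entries in an environment mapping (defaults to os.environ)."""
    env = os.environ if environ is None else environ
    report = {}
    for var in ("PATH", "LD_LIBRARY_PATH"):
        entries = [p for p in env.get(var, "").split(os.pathsep) if p]
        # Assumption: CUDA directories contain "cuda" somewhere in their path.
        report[var] = [p for p in entries if "cuda" in p.lower()]
    report["CUDA_HOME"] = env.get("CUDA_HOME")
    report["CONDA_PREFIX"] = env.get("CONDA_PREFIX")
    # First ptxas on the given PATH, or None if there is none.
    report["ptxas"] = shutil.which("ptxas", path=env.get("PATH", ""))
    return report


if __name__ == "__main__":
    for key, value in cuda_path_report().items():
        print(f"{key}: {value}")
```

Run it inside the activated dannce environment. If the reported ptxas belongs to an older toolkit than the one the environment is meant to use, that would be consistent with the sm_86 message, although this check alone does not prove it is the cause.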
Right, my 4 second benchmark was nvox=64, so your speed actually doesn't surprise me. @diego what kind of speed do you normally see for nvox=80?
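The arithmetic behind this comparison can be written out. The sketch below assumes that nvox is the number of voxels along each side of a cubic volume and that run time scales roughly linearly with the total voxel count. Both are simplifying assumptions, and the two timings come from different GPUs (RTX A4000 and Titan V), so this is only a rough sanity check. The function names are hypothetical.

```python
def voxel_ratio(nvox_a, nvox_b):
    """Ratio of total voxel counts for two cubic grids with the given side lengths."""
    return (nvox_a ** 3) / (nvox_b ** 3)


def seconds_per_batch(total_seconds, n_batches):
    """Average wall-clock time per batch."""
    return total_seconds / n_batches


# Timings reported in this thread: ~8 s per 10 batches at nvox=80,
# ~4 s per 10 batches at nvox=64.
time_nvox80 = seconds_per_batch(8, 10)
time_nvox64 = seconds_per_batch(4, 10)

if __name__ == "__main__":
    print("voxel count ratio (80 vs 64):", voxel_ratio(80, 64))
    print("observed time ratio:", time_nvox80 / time_nvox64)
```

The voxel count ratio is 80^3 / 64^3 = 1.953125, close to the observed 2x difference in time per batch, which is consistent with the larger volume accounting for most of the difference.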
Hi,
We're trying to run dannce on a new RTX A4000 GPU, which works with CUDA 11. We're having some issues getting the card to work with CUDA 10.1, the version specified for dannce v1.2.0. But updating CUDA to 11 by itself causes other installation issues with cudnn, tensorflow, and pytorch.
I emailed Diego and he said these dependency issues will be solved in dannce v1.3 soon, and he suggested I open an issue here.
Thank you!