3rang closed this issue 3 years ago
2021-05-28 10:43:11.222119: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
It looks like it's failing to find some CUDA libs. I always got this error when training on a V100 instance unless I pasted this into the terminal:
export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
You might have to replace the path depending on where your libs are located, but normally it should work as is. Paste and execute the command, then try training again.
@3rang I saw this bug in your log:
Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
Can you fix this and run it again?
When I train on the GPU, I get an error. Is my graphics card too low-end? The error is as follows.
My setup: GTX 1650, Python 3.7, CUDA 10.1, cuDNN 7.6.5, tensorflow-gpu 2.3.
After that, I searched online for solutions but did not find one that works. So I set the "batch_size" parameter as small as possible. Training then starts, but after a while it reports an error and stops. The error is as follows.
@ruianfrp What is the total memory of your GPU?
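One way to answer that is to ask `nvidia-smi` directly. The snippet below is a rough sketch that assumes `nvidia-smi` is installed and on your `PATH`; the helper names `parse_memory_csv` and `query_gpu_memory` are made up for illustration.

```python
import subprocess


def parse_memory_csv(text):
    """Parse lines of 'total, used' (MiB, one GPU per line) into (total, used) int pairs."""
    rows = []
    for line in text.strip().splitlines():
        total, used = (int(field.strip()) for field in line.split(","))
        rows.append((total, used))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for the total and used memory (MiB) of each GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total,memory.used",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_memory_csv(result.stdout)


if __name__ == "__main__":
    try:
        for index, (total, used) in enumerate(query_gpu_memory()):
            print(f"GPU {index}: {used} MiB used of {total} MiB")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        # nvidia-smi is missing or failed, e.g. no NVIDIA driver on this machine.
        print(f"Could not query nvidia-smi: {exc}")
```

Running it once before training and once while training is in progress should show how close the job gets to the card's limit.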
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
I followed the commands in the README, but I'm still facing some issues, possibly because I don't have a graphics card in my system. I have installed all dependencies as documented. Please find the error log below.
The same issue is reported at this link. I tried the suggestions from its comments as well, but they didn't work for me.
Error log