Closed srcolinas closed 6 years ago
This is usually caused by a mismatching version of libcudnn, and it shouldn't happen in the docker container. Are you using the latest version of the master branch?
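For anyone who wants to check the libcudnn point above, here is a minimal sketch that asks the dynamic loader which cuDNN it resolves and prints the version. It assumes a Linux system where `libcudnn` is on the loader path. `cudnnGetVersion()` is a real cuDNN function, but the helper names below are made up for illustration, and the decoding assumes the pre-9 cuDNN encoding (major * 1000 + minor * 100 + patch).

```python
import ctypes
import ctypes.util


def decode_cudnn_version(raw):
    """Split the integer from cudnnGetVersion() into (major, minor, patch).

    Assumption: pre-9 cuDNN encoding, major * 1000 + minor * 100 + patch.
    """
    return raw // 1000, (raw % 1000) // 100, raw % 100


def loaded_cudnn_version(lib_name="cudnn"):
    """Return the cuDNN version the dynamic loader resolves, or None."""
    path = ctypes.util.find_library(lib_name)
    if path is None:
        # No library with that name is visible to the loader.
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        # The library was located but could not be loaded.
        return None
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return decode_cudnn_version(lib.cudnnGetVersion())


if __name__ == "__main__":
    print("cuDNN found by the loader:", loaded_cudnn_version())
```

If this prints `None`, or a version different from the one your tensorflow-gpu build expects, that would be consistent with the mismatch described above.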
I am going to close this due to lack of activity, please reopen if you are still having problems.
Hi,
I'm also getting a similar error. I'm using the latest version of the master branch and the provided Docker image. At first I got a GPU issue like the one in #25, which I solved by upgrading tensorflow. After that, I got this error while trying to train:
Training model
Training network 1000 epochs (1000 iterations at batch size 10)
Decaying learn rate by 1.500000 every 10 epochs (10 steps)
2019-07-26 11:44:00.238161: F tensorflow/core/kernels/conv_ops.cc:712] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo
Thanks a lot
I found a solution: I installed version 1.12 of tensorflow:
sudo -H pip3 install tensorflow-gpu==1.12
Best regards
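Since the fix above depends on which tensorflow-gpu release is actually installed, a small check like the following may help confirm that the pin took effect inside the container. This is only a sketch: the function names are hypothetical, and the default `"1.12"` is simply the version reported to work in this thread.

```python
def matches_release(installed, wanted="1.12"):
    """True if `installed` (e.g. "1.12.0") is in the `wanted` major.minor series."""
    return installed.split(".")[:2] == wanted.split(".")[:2]


def installed_tensorflow_version():
    """Return tf.__version__, or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__


if __name__ == "__main__":
    version = installed_tensorflow_version()
    if version is None:
        print("TensorFlow is not importable in this environment")
    else:
        print("TensorFlow", version, "matches 1.12:", matches_release(version))
```

Each prebuilt tensorflow-gpu release is built against particular CUDA and cuDNN versions, so checking the installed version is a quick first step before looking at the libraries themselves.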
sudo nvidia-docker run -ti --rm -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v $HOME/.Xauthority:/home/developer/.Xauthority -v /home/$USER/tests_bonnet:/home/developer/bonnet_wrkdir/tests --net=host --pid=host --ipc=host bonnet /bin/bash
$ sudo ./cnn_use.py -l ../tests/logs/ -p ../tests/pretrained/city_512 -i ../tests/images/0bd9-1520278753069_c.jpg
INTERFACE:
Image to infer: ['../tests/images/0bd9-1520278753069_c.jpg']
Label: None
Log dir: ../tests/logs/
model path ../tests/pretrained/city_512
model type iou
data yaml: None
net yaml: None
train yaml: None
Verbose?: False
Features?: False
Probabilities?: False
Opening default data file data.yaml from log folder
Opening default net file net.yaml from log folder
Opening default train file train.yaml from log folder
Model folder exists! Using model from ../tests/pretrained/city_512/iou
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Fetching dataset
DEVICE AVAIL: /device:CPU:0
DEVICE AVAIL: /device:GPU:0
Initializing network
Building graph
encoder
downsample1
W: [5, 5, 3, 13] Train: False
. . .
Total number of parameters in network: 1,871,287
Predicting mask
mask shape [1, 256, 512]
Restoring checkpoint
Looking for model in ../tests/pretrained/city_512/iou
Retrieving model from: ../tests/pretrained/city_512/iou/model-best-iou.ckpt
Successfully restored model weights! :D
Saving this graph in ../tests/logs/
2018-04-23 21:23:44.476628: F tensorflow/core/kernels/conv_ops.cc:712] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
Aborted (core dumped)