PRBonn / bonnet

Bonnet: An Open-Source Training and Deployment Framework for Semantic Segmentation in Robotics.
GNU General Public License v3.0

F tensorflow/core/kernels/conv_ops.cc:712] Check failed: stream->parent()->GetConvolveAlgorithms #11

Closed srcolinas closed 6 years ago

srcolinas commented 6 years ago

sudo nvidia-docker run -ti --rm -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v $HOME/.Xauthority:/home/developer/.Xauthority -v /home/$USER/tests_bonnet:/home/developer/bonnet_wrkdir/tests --net=host --pid=host --ipc=host bonnet /bin/bash

$ sudo ./cnn_use.py -l ../tests/logs/ -p ../tests/pretrained/city_512 -i ../tests/images/0bd9-1520278753069_c.jpg

INTERFACE:
Image to infer: ['../tests/images/0bd9-1520278753069_c.jpg']
Label: None
Log dir: ../tests/logs/
model path ../tests/pretrained/city_512
model type iou
data yaml: None
net yaml: None
train yaml: None
Verbose?: False
Features?: False
Probabilities?: False

Opening default data file data.yaml from log folder
Opening default net file net.yaml from log folder
Opening default train file train.yaml from log folder
Model folder exists! Using model from ../tests/pretrained/city_512/iou
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Fetching dataset
DEVICE AVAIL: /device:CPU:0
DEVICE AVAIL: /device:GPU:0
Initializing network
Building graph encoder downsample1 W: [5, 5, 3, 13] Train: False
. . .


Total number of parameters in network: 1,871,287


Predicting mask
mask shape [1, 256, 512]
Restoring checkpoint
Looking for model in ../tests/pretrained/city_512/iou
Retrieving model from: ../tests/pretrained/city_512/iou/model-best-iou.ckpt
Successfully restored model weights! :D
Saving this graph in ../tests/logs/
2018-04-23 21:23:44.476628: F tensorflow/core/kernels/conv_ops.cc:712] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
Aborted (core dumped)

tano297 commented 6 years ago

This is usually caused by a mismatching version of libcudnn, and it shouldn't happen in the docker container. Are you using the latest version of the master branch?
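One way to see which libcudnn the dynamic loader actually picks up is to ask the library for its version. The snippet below is only a minimal sketch, not part of Bonnet: the helper names `decode_cudnn_version` and `loaded_cudnn_version` are made up for illustration, and the decoding assumes the cuDNN 7.x/8.x convention of packing the version as major * 1000 + minor * 100 + patch. It returns None if no libcudnn can be found or loaded.

```python
import ctypes
import ctypes.util


def decode_cudnn_version(raw):
    """Split cuDNN's packed version integer into (major, minor, patch).

    Assumes the cuDNN 7.x/8.x encoding: major * 1000 + minor * 100 + patch.
    """
    major, rest = divmod(raw, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def loaded_cudnn_version():
    """Return the version of the libcudnn the loader finds, or None."""
    name = ctypes.util.find_library("cudnn")
    if name is None:
        return None  # no libcudnn on the loader search path
    try:
        lib = ctypes.CDLL(name)
        get_version = lib.cudnnGetVersion  # part of the public cuDNN C API
    except (OSError, AttributeError):
        return None  # library could not be loaded or lacks the symbol
    get_version.restype = ctypes.c_size_t
    return decode_cudnn_version(get_version())


if __name__ == "__main__":
    print("libcudnn version:", loaded_cudnn_version())
```

Running this inside the container and comparing the result against the cuDNN version your TensorFlow build expects should show whether the two disagree.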

tano297 commented 6 years ago

I am going to close this due to lack of activity. Please reopen it if you are still having problems.

AlexGhenno commented 5 years ago

Hi,

I'm also getting an error similar to this one. I'm using the latest version of the master branch and the provided Docker image. At first, I got a GPU issue like the one in #25, which I could solve by upgrading tensorflow. After that, I got this error while trying to train:

Training model
Training network 1000 epochs (1000 iterations at batch size 10)
Decaying learn rate by 1.500000 every 10 epochs (10 steps)
2019-07-26 11:44:00.238161: F tensorflow/core/kernels/conv_ops.cc:712] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
Aborted

Do you have any idea what could be the cause?

Thanks a lot

AlexGhenno commented 5 years ago

I found a solution. I installed version 1.12 of tensorflow:

sudo -H pip3 install tensorflow-gpu==1.12

Best regards
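To confirm that the pinned install above is the version Python actually imports (and not an older or newer copy elsewhere on the path), a quick check along these lines may help. This is a hedged sketch: `matches_pin` and `check_tensorflow` are hypothetical helper names, and the comparison is a simple prefix match on the release series rather than pip's exact version-matching rules.

```python
def matches_pin(installed, pinned):
    """True if `installed` (e.g. "1.12.0") belongs to the `pinned` series (e.g. "1.12")."""
    installed_parts = installed.split(".")
    pinned_parts = pinned.split(".")
    # Compare only as many components as the pin specifies.
    return installed_parts[:len(pinned_parts)] == pinned_parts


def check_tensorflow(pinned="1.12"):
    """Return True/False for the imported TensorFlow, or None if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return matches_pin(tf.__version__, pinned)


if __name__ == "__main__":
    print("TensorFlow matches pin:", check_tensorflow())
```

If this prints False, the interpreter is importing a different TensorFlow than the one just installed, for example because pip3 and python3 point at different environments.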