Open daobilige-su opened 6 years ago
Hello,
I am not having any problems running the hello world. Can you check that you are using the proper Docker version? You should be using nvidia-docker, not the vanilla one.
You can check whether your GPU is working inside the Docker container by running:
$ nvidia-smi
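The same check can be scripted, for instance at the top of a Python entry point. This is only a minimal sketch: the function name gpu_visible is hypothetical and not part of Bonnet, and all it tests is that nvidia-smi is on the PATH and exits with status 0.

```python
import shutil
import subprocess


def gpu_visible(command="nvidia-smi"):
    """Return True if `command` is on the PATH and exits with status 0."""
    executable = shutil.which(command)
    if executable is None:
        # With vanilla docker, nvidia-smi is typically not present in the container.
        return False
    result = subprocess.run([executable],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return result.returncode == 0


if __name__ == "__main__":
    print("GPU visible:", gpu_visible())
```

A False result inside the container suggests it was started with plain docker rather than nvidia-docker, or that the driver on the host is not working.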
Hi @tano297 ,
Thanks for your reply.
I finally got everything running. It turns out that I needed to delete the /usr/local/cuda/lib64/stubs/libcuda.so.1 file to make TensorRT and TensorFlow work. I also needed to recompile the TensorFlow C++ API, setting CC_OPT_FLAGS="-march=native" before compiling so that the build supports my CPU.
It is really nice software, and I am enjoying it now. Thanks.
Cheers, Su
I'm glad to hear that! There are sometimes some caveats for each architecture, which I try to minimize, but they escape.
The /usr/local/cuda/lib64/stubs/libcuda.so.1 thing should definitely not be happening, so I will have a look into it. Leaving this issue open until I can reproduce it and fix it.
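For anyone who wants to check for the stub before deleting anything: the files under a CUDA stubs directory are meant for linking on machines without a driver, so a libcuda.so.1 there can shadow the real driver library at run time. The sketch below lists every libcuda.so.1 found in a set of directories and marks the ones that sit under a stubs directory. The function name find_libcuda and the choice of directories (LD_LIBRARY_PATH plus the path from this thread) are assumptions for illustration.

```python
import os
from pathlib import Path


def find_libcuda(search_dirs, name="libcuda.so.1"):
    """Return (path, is_stub) for every directory that holds a file called `name`."""
    found = []
    for directory in search_dirs:
        candidate = Path(directory) / name
        # is_symlink() also catches dangling symlinks, which exists() misses.
        if candidate.exists() or candidate.is_symlink():
            found.append((str(candidate), "stubs" in candidate.parts))
    return found


if __name__ == "__main__":
    # Assumption: only LD_LIBRARY_PATH and the stubs path from this thread are searched.
    dirs = [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]
    dirs.append("/usr/local/cuda/lib64/stubs")
    for path, is_stub in find_libcuda(dirs):
        print(path, "(stub)" if is_stub else "")
```

If a stub shows up ahead of the real driver library, that matches the symptom described above.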
Hello.
I have the same error in Docker.
In my case, the standalone examples don't work.
When I execute ./build/bonnet_standalone/session, I get Illegal instruction (core dumped).
I checked nvidia-smi:
Thu Oct 18 01:23:58 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TITAN   Off  | 00000000:01:00.0  On |                  N/A |
| 30%   41C    P8    18W / 250W |    585MiB /  6075MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1558      G   /usr/lib/xorg/Xorg                           233MiB |
|    0      2304      G   /opt/teamviewer/tv_bin/TeamViewer             20MiB |
|    0      2578      G   compiz                                       209MiB |
|    0     23643      G   ...quest-channel-token=6920769117415252391    72MiB |
|    0     32701      G   ...-token=B9940EAD24EB7BFE7CB48B880BC0A2AE    43MiB |
+-----------------------------------------------------------------------------+
Running import tensorflow in python3 works.
helloworld.py under the \bonnet-docker folder also works.
I think the problem is in the C++ build of TensorFlow.
How do I rebuild the TensorFlow C++ API with the CC_OPT_FLAGS="-march=native" flag?
Hi,
First, you need to make sure the problem arises from the TensorFlow C++ API. To do that, just build and run its test program:
$ cd /tools/tensorflow_cc/example
$ mkdir build && cd build
$ cmake ..
$ make
$ ./example
If the test program gives you the same error, then the TensorFlow C++ API is surely the source of the error. Your CPU version is too old to be supported by the default configuration of tensorflow C++ API. To recompile the TensorFlow C++ API, do the following:
$ cd /tools/tensorflow_cc/tensorflow_cc
$ mkdir build
$ cd build
$ export CC_OPT_FLAGS="-march=native"
$ cmake -DTENSORFLOW_STATIC=OFF -DTENSORFLOW_SHARED=ON -DTENSORFLOW_TAG="v1.7.0" ..
$ make -j
$ make install
$ rm -rf ~/.cache && cd .. && rm -rf build
After that, you might also need to reinstall TensorFlow, since installing the TensorFlow C++ API also installs a different version of tensorflow. Uninstall it and install the correct version again:
$ pip3 uninstall numpy tensorflow-gpu tensorflow matplotlib -y && \
pip3 install -U tensorflow-gpu==1.7.0 protobuf==3.5.1 matplotlib==2.2.2
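To confirm afterwards that the reinstall left the intended versions in place, the output of pip freeze can be compared against the pins from the command above. This is a rough sketch; the helper names parse_freeze and mismatches are made up for illustration, and it assumes pip is available for the interpreter running the script.

```python
import subprocess
import sys

# Pins taken from the pip3 install line in this thread.
PINS = {"tensorflow-gpu": "1.7.0", "protobuf": "3.5.1", "matplotlib": "2.2.2"}


def parse_freeze(text):
    """Map lower-cased package name to version for each 'name==version' line."""
    versions = {}
    for line in text.splitlines():
        name, sep, version = line.strip().partition("==")
        if sep:
            versions[name.lower()] = version
    return versions


def mismatches(installed, pins=PINS):
    """Return {name: (wanted, found)} for pins that are missing or differ."""
    return {name: (wanted, installed.get(name))
            for name, wanted in pins.items()
            if installed.get(name) != wanted}


if __name__ == "__main__":
    out = subprocess.run([sys.executable, "-m", "pip", "freeze"],
                         stdout=subprocess.PIPE, universal_newlines=True).stdout
    for name, (wanted, found) in mismatches(parse_freeze(out)).items():
        print("{}: wanted {}, found {}".format(name, wanted, found))
```

No output means all three pins match; a found value of None means the package is not installed at all.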
Hopefully that's it.
Cheers, Su
I confirmed that the problem arises from the TensorFlow C++ API. Then I tried to install TensorFlow again, but there were some errors while building it.
So I installed Docker on another computer, and it works there. You said "Your CPU version is too old to be supported by the default configuration of tensorflow C++ API."; maybe that is right.
Thank you for answering.
I can confirm the issue. I had no problems following the instructions on a more recent machine. However, I could not yet resolve all the dependencies for the steps @daobilige-su mentioned above. (Apparently one also needs to install g++-7, which is in turn incompatible with the CUDA libs: "/usr/local/cuda-9.0/bin/../targets/x86_64-linux/include/crt/host_config.h:119:2: error: #error -- unsupported GNU version! gcc versions later than 6 are not supported!")
The machine with the trouble is an Intel i7-2600K, in case that helps anybody.
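Illegal instruction generally means the binary executed a CPU instruction that the processor does not implement, so it can help to see which instruction-set extensions the CPU reports. Below is a minimal sketch that reads the flags line from /proc/cpuinfo on x86 Linux. The list of flags to look for is an assumption (extensions a -march=native build made on a newer machine might use); this thread does not establish which one the prebuilt library actually needs.

```python
from pathlib import Path

# Assumption: extensions that a binary built on a newer CPU might rely on.
FLAGS_OF_INTEREST = ("sse4_1", "sse4_2", "avx", "avx2", "fma")


def parse_cpu_flags(cpuinfo_text):
    """Return the flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, wanted=FLAGS_OF_INTEREST):
    """Return the entries of `wanted` that the CPU does not report, in order."""
    present = parse_cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in present]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("not reported by this CPU:", missing_flags(cpuinfo.read_text()))
```

Running it on both the working and the failing machine and comparing the output narrows down which extension the prebuilt binary might depend on.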
Hi,
First, thanks for the software. Looks very cool.
I am using the docker image provided. But I have 2 questions about it.
1. After reading the Dockerfile of the image, if I understood correctly, all dependencies are built into the image, but Bonnet itself is not installed in it. Is that correct? I could not find the lines that correspond to the installation of the Bonnet code. If so, do I have to install it myself on top of the image?
2. I ran helloworld.py under the \bonnet-docker folder of the image and got a Segmentation fault (core dumped) error. When I executed the code inside helloworld.py line by line, I found that it is import tensorrt that causes the error. Does it work fine on your machine?
Thanks for any help or suggestion.
Cheers, Su