alanshao023 closed 4 years ago
How did you run the docker container? Did you use nvidia-docker run ... or docker run --gpus=all ...?
I followed the README, as below:
docker run -dt -e NVIDIA_VISIBLE_DEVICES=ALL -w /work \
--security-opt apparmor=unconfined --security-opt seccomp=unconfined \
-v $HOME:/mnt$HOME \
--name mlperf-inference-
Could you try adding the --gpus=all flag?
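For reference, the suggestion amounts to something like the sketch below: the README command with --gpus=all added. The image and container names are elided in this thread, so IMAGE and NAME here are placeholders, and run_container and DRY_RUN are illustrative names only. The --gpus flag needs Docker 19.03 or later with the NVIDIA Container Toolkit installed on the host.

```shell
#!/bin/sh
# Placeholders (not from the thread): replace with your own image and container name.
IMAGE="${IMAGE:-my-mlperf-image:latest}"
NAME="${NAME:-my-mlperf-container}"

# Start the container with GPU access.
# If DRY_RUN is set to a non-empty value, print the command instead of running it.
run_container() {
    ${DRY_RUN:+echo} docker run --gpus=all -dt \
        -e NVIDIA_VISIBLE_DEVICES=ALL -w /work \
        --security-opt apparmor=unconfined --security-opt seccomp=unconfined \
        -v "$HOME:/mnt$HOME" \
        --name "$NAME" "$IMAGE"
}

# Example: DRY_RUN=1; run_container
```

Setting DRY_RUN=1 first is a cheap way to inspect the final command line before actually creating the container.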
Sure. After I did that, in the new container: 1) I ran nvidia-smi and the output was correct; 2) I ran ls /usr/local/cuda/include | grep cuda.h and the output was cuda.h.
Does this mean nvidia-docker is working correctly?
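The two checks above can be wrapped in a small script to run inside the container. This is only a sketch; has_command and check_gpu_container are illustrative names. Note that cuda.h comes from the CUDA toolkit installed in the image, whereas nvidia-smi is provided by the host driver through the NVIDIA container runtime, so the nvidia-smi check is the one that actually indicates GPU passthrough.

```shell
#!/bin/sh
# Succeeds if the named command is on PATH.
has_command() {
    command -v "$1" >/dev/null 2>&1
}

# Prints one status line per check; returns non-zero if any check fails.
check_gpu_container() {
    status=0
    if has_command nvidia-smi; then
        echo "ok: nvidia-smi found"
    else
        echo "missing: nvidia-smi (driver utilities not mounted)"
        status=1
    fi
    if [ -f /usr/local/cuda/include/cuda.h ]; then
        echo "ok: cuda.h found"
    else
        echo "missing: /usr/local/cuda/include/cuda.h"
        status=1
    fi
    return "$status"
}
```

Calling check_gpu_container in a container started without GPU access should report nvidia-smi as missing even when cuda.h is present.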
Yes, I think so. Could you try the make calibrate ... command again to see if it works? Thanks.
I repeated the earlier steps in the new container:
1) Build Source Codes (in the container I created; same for the steps below) 2) Download and Preprocess Datasets 3) Download Benchmark Models
It worked. After these, I was able to run calibration, generate TensorRT engines and run the harness.
Thank you, nvpohanh.
The command I ran in the container: make calibrate RUN_ARGS="--benchmarks=resnet"
The output:
Traceback (most recent call last):
  File "code/main.py", line 327, in <module>
    main()
  File "code/main.py", line 286, in main
    config_files = find_config_files(benchmarks, scenarios)
  File "/work/code/common/__init__.py", line 123, in find_config_files
    system = get_system_id()
  File "/work/code/common/__init__.py", line 102, in get_system_id
    import pycuda.driver
  File "/usr/local/lib/python3.6/dist-packages/pycuda/driver.py", line 5, in <module>
    from pycuda._driver import *  # noqa
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
Makefile:309: recipe for target 'calibrate' failed
make: *** [calibrate] Error 1
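The ImportError means the dynamic loader cannot find libcuda.so.1. That library belongs to the NVIDIA driver, not the CUDA toolkit, and is normally mounted into the container by the NVIDIA container runtime. A quick way to check for it is sketched below, assuming ldconfig is available in the image; has_library and check_libcuda are illustrative names.

```shell
#!/bin/sh
# Succeeds if the loader cache lists a library whose name contains $1.
has_library() {
    ldconfig -p 2>/dev/null | grep -F -q -- "$1"
}

# Reports whether the driver library needed by pycuda is visible.
check_libcuda() {
    if has_library libcuda.so.1; then
        echo "ok: libcuda.so.1 is visible to the loader"
    else
        echo "missing: libcuda.so.1 (container probably started without GPU access)"
        return 1
    fi
}
```

If check_libcuda reports the library as missing, the container was most likely started without the NVIDIA runtime, and rebuilding the source inside it will not help.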
In this container, running nvidia-smi gives "command not found".
What I have finished so far: 1) Build Docker Image 2) Build Source Codes (in the container I created; same for the steps below) 3) Download and Preprocess Datasets 4) Download Benchmark Models
I'm using CentOS 7.8, Driver Version: 440.64.00, CUDA Version: 10.2, four NVIDIA T4 cards.
I'm new to Docker; any suggestions are appreciated.