ctuning / ck-tensorrt

Collective Knowledge repository for NVIDIA's TensorRT
BSD 3-Clause "New" or "Revised" License

Benchmarking script can pick CUDA dependency arbitrarily #14

Open psyhtest opened 6 years ago

psyhtest commented 6 years ago

TensorRT 3.0 only works with CUDA 9.0. However, I currently have both CUDA 9.0 and CUDA 9.1 installed:

$ ck show env --tags=cuda
Env UID:         Target OS: Bits: Name:                Version: Tags:

6f75374461b20f48   linux-64    64 cuDNN library        7.0.5    64bits,cuda,cudnn,dnn,host-os-linux-64,lib,target-os-linux-64,v7,v7.0,v7.0.5
ee992601ced4718a   linux-64    64 Nvidia CUDA Compiler 9.1.85   64bits,compiler,cuda,host-os-linux-64,lang-c-cuda,lang-cpp-cuda,target-os-linux-64,v9,v9.1,v9.1.85
aa9bd57172aefafe   linux-64    64 Nvidia CUDA Compiler 9.0.176  64bits,compiler,cuda,host-os-linux-64,lang-c-cuda,lang-cpp-cuda,target-os-linux-64,v9,v9.0,v9.0.176

The benchmarking script appears to pick either version arbitrarily:

$ grep cuda-compiler * -R -A2
bvlc-alexnet-tensorrt-3.0.1/pipeline.json:    "cuda-compiler": {
bvlc-alexnet-tensorrt-3.0.1/pipeline.json-      "sort": 20, 
bvlc-alexnet-tensorrt-3.0.1/pipeline.json-      "detected_ver": "9.0.176", 
--
bvlc-googlenet-tensorrt-3.0.1/pipeline.json:    "cuda-compiler": {
bvlc-googlenet-tensorrt-3.0.1/pipeline.json-      "sort": 20, 
bvlc-googlenet-tensorrt-3.0.1/pipeline.json-      "detected_ver": "9.1.85", 
--
deepscale-squeezenet-1.1-tensorrt-3.0.1/pipeline.json:    "cuda-compiler": {
deepscale-squeezenet-1.1-tensorrt-3.0.1/pipeline.json-      "sort": 20, 
deepscale-squeezenet-1.1-tensorrt-3.0.1/pipeline.json-      "detected_ver": "9.0.176", 
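
To see at a glance which experiment ended up with which CUDA version, the grep output above can be condensed to one line per experiment directory. This is only a rough sketch: the function name `summarize_cuda_versions` is made up for illustration, and it assumes that `"detected_ver"` always appears within two lines of the `"cuda-compiler"` key, as in the output above.

```shell
# Print "<experiment dir>: <detected CUDA version>" for every pipeline.json
# found below the given directory (default: current directory).
# Assumption: "detected_ver" follows "cuda-compiler" within two lines.
summarize_cuda_versions() {
  grep -R -A2 --include=pipeline.json '"cuda-compiler": {' "${1:-.}" \
    | sed -n 's|^\(.*\)/pipeline\.json-.*"detected_ver": "\([^"]*\)".*|\1: \2|p'
}
```

Running `summarize_cuda_versions .` in the directory holding the experiment results should list each experiment next to the CUDA version that was picked for it.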

When CK picks CUDA 9.1, compilation fails:

      "compile": {
        "compilation_success": "no",
        "compilation_success_bool": false,
        "compilation_time": 0.8694801330566406,
        "compilation_time_with_module": 2.111440896987915,
        "fail_reason": "return code 1 !=0 ",
        "joined_compiler_flags": "-O3"
      }

A workaround is to temporarily "hide" CUDA 9.1, e.g. as follows:

$ ck show env --tags=cuda,v9.1
Env UID:         Target OS: Bits: Name:                Version: Tags:

ee992601ced4718a   linux-64    64 Nvidia CUDA Compiler 9.1.85   64bits,compiler,cuda,host-os-linux-64,lang-c-cuda,lang-cpp-cuda,target-os-linux-64,v9,v9.1,v9.1.85
$ mv `ck find env:ee992601ced4718a`{,~}
$ ck show env --tags=cuda,v9.1
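
Since the hidden entry no longer shows up in `ck show env`, it is probably safest to remember its path before renaming it, so that it can be restored later. A minimal sketch of that idea follows. The helper names `hide_dir` and `restore_dir` are hypothetical, and the sketch assumes that renaming the directory back is enough for CK to pick the environment up again.

```shell
# Rename a directory to "<dir>~" so it is no longer found under its original name.
hide_dir() { mv -- "$1" "$1~"; }

# Undo hide_dir by renaming "<dir>~" back to "<dir>".
restore_dir() { mv -- "$1~" "$1"; }

# Intended use (not executed here; assumes ck is installed):
#   env_dir="$(ck find env:ee992601ced4718a)"   # capture the path first
#   hide_dir "$env_dir"
#   ... run the TensorRT benchmarks ...
#   restore_dir "$env_dir"
```

Capturing `env_dir` up front avoids having to call `ck find` on an entry that has already been renamed.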

psyhtest commented 6 years ago

Of course, CK cannot generally know which CUDA version should be used. In this case, however, tensorrt-time's dependency jetson-inference was compiled with CUDA 9.0 specifically to avoid this issue. Still, CK attempted to use CUDA 9.1 and failed as a result.