It seems that there is something wrong with the ONNX parser. Can you try it with the Dockerfile in TPAT? Or you can use trtexec --plugins to test this plugin.
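For example, something like the following, where the ONNX file and the plugin .so path are hypothetical placeholders for whatever TPAT generated in your setup:
trtexec --onnx=test_onehot.onnx --plugins=python/trt_plugin/lib/tpat_onehot.so
If the crash already happens while loading the plugin library, the problem is in the plugin build rather than in the ONNX parser.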
Based on past experience, a segmentation fault is usually caused by mismatched TensorRT versions between building the plugin and using it. So please check TRT_LIB_PATH in python/trt_plugin/Makefile and make sure your plugin is compiled against TensorRT-8.2.3.
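A quick way to check which TensorRT the compiled plugin actually links against is to inspect its dynamic dependencies (the .so path below is a hypothetical placeholder for wherever TPAT wrote your plugin):
ldd python/trt_plugin/lib/tpat_onehot.so | grep nvinfer
# the libnvinfer.so.X it resolves to shows the TensorRT major version the plugin will pull in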
@buptqq Thanks for your reply, and:
- The docker image used is built from the TPAT Dockerfile; only CUDA and TensorRT have been changed;
- Using trtexec --plugins, I got the same fault.
Would you give me some other helpful advice?
@buptqq I did run the example step by step as described in the repo, so I did set TRT_LIB_PATH pointing to TensorRT-8.2.3. Could you provide a docker image with the above environment that can run the example successfully?
Wait a minute, I will let my colleague provide a docker image with TensorRT-8.2.3. @wenqf11
@hengxinCheung You can use nvcr.io/nvidia/tensorflow:22.03-tf1-py3 in the Dockerfile (which ships TensorRT 8.2.3; see https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorrt/tags) and build a docker image for TPAT. When you run the example in the new docker, change the following config in python/trt_plugin/Makefile:
CUDA_PATH = /usr/local/cuda/
TRT_LIB_PATH = /usr/lib/x86_64-linux-gnu
I rebuilt the docker image and ran the example following your advice, but got the following error (my device is a GeForce RTX 3090):
4: tvm::build(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)
3: tvm::codegen::Build(tvm::IRModule, tvm::Target)
2: tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::IRModule, tvm::Target)>::AssignTypedLambda<tvm::runtime::Module (*)(tvm::IRModule, tvm::Target)>(tvm::runtime::Module (*)(tvm::IRModule, tvm::Target), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*) const
1: tvm::codegen::BuildCUDA(tvm::IRModule, tvm::Target)
0: std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), TVMFuncCreateFromCFunc::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#2}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)
File "/workspace/TPAT/3rdparty/blazerml-tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 81, in cfun
rv = local_pyfunc(*pyargs)
File "/workspace/TPAT/3rdparty/blazerml-tvm/python/tvm/autotvm/measure/measure_methods.py", line 789, in tvm_callback_cuda_compile
ptx = nvcc.compile_cuda(code, target=target, arch=AutotvmGlobalScope.current.cuda_target_arch)
File "/workspace/TPAT/3rdparty/blazerml-tvm/python/tvm/contrib/nvcc.py", line 108, in compile_cuda
raise RuntimeError(msg)
RuntimeError:
#ifdef _WIN32
using uint = unsigned int;
using uchar = unsigned char;
using ushort = unsigned short;
using int64_t = long long;
using uint64_t = unsigned long long;
#else
#define uint unsigned int
#define uchar unsigned char
#define ushort unsigned short
#define int64_t long long
#define uint64_t unsigned long long
#endif
extern "C" __global__ void __launch_bounds__(1024) tvmgen_default_fused_one_hot_kernel0(float* __restrict__ T_one_hot, int* __restrict__ placeholder, float* __restrict__ placeholder1, float* __restrict__ placeholder2) {
T_one_hot[(((((int)blockIdx.x) * 1024) + ((int)threadIdx.x)))] = ((placeholder[(((((int)blockIdx.x) * 4) + (((int)threadIdx.x) >> 8)))] == (((int)threadIdx.x) & 255)) ? placeholder1[(0)] : placeholder2[(0)]);
}
Compilation error:
nvcc fatal : Value 'sm_86' is not defined for option 'gpu-architecture'
And I found that the version of TensorRT in the docker container may be 7, not 8:
# pip list | grep tensorrt
tensorrt 7.0.0.11
# ls /usr/lib/x86_64-linux-gnu/ | grep nvinfer
libnvinfer.so
libnvinfer.so.7
libnvinfer.so.7.0.0
libnvinfer_plugin.so
libnvinfer_plugin.so.7
libnvinfer_plugin.so.7.0.0
@hengxinCheung You can check the TensorRT in nvcr.io/nvidia/tensorflow:22.03-tf1-py3; it must be TensorRT 8.2.3.
I double-checked the image, and I still think that the version of TensorRT is 7. Maybe the docker image you suggested is nvcr.io/nvidia/tensorrt:xx, not nvcr.io/nvidia/tensorflow:xx. For the above nvcc fatal error, I tried adding the following code in python/cuda_kernel.py, but still could not build the TensorRT engine successfully:
from tvm.autotvm.measure.measure_methods import set_cuda_target_arch
# the value 'sm_75' appears in python/trt_plugin/Makefile (line 75)
set_cuda_target_arch('sm_75')
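For reference, sm_86 (the compute capability of the RTX 3090) is only supported from CUDA 11.1 on, so one way to confirm whether the container's CUDA toolkit is the culprit is to list the architectures its nvcc knows about (the --list-gpu-arch flag itself requires a CUDA 11.x nvcc):
nvcc --version
nvcc --list-gpu-arch    # on a toolkit new enough for the RTX 3090, sm_86 appears in this list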
I will close this issue and prepare to implement the OneHot plugin by myself. Thanks for all the replies, and best wishes. Also, I found some possible mistakes in this example (see the sketch after this snippet for one consistent combination):
# inconsistent batch_size before and after
line 230: input_model_file, output_model_file, node_names=node_names, dynamic_bs=dynamic, min_bs=1, max_bs=256, opt_bs=128
line 243: builder.max_batch_size = 1024
line 251: profile.set_shape(input.name, [1] + shape_without_batch, [256] + shape_without_batch, [256] + shape_without_batch )
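A minimal sketch of one consistent combination of those three settings, assuming the TensorRT 8.x Python bindings; the input name and per-sample shape are hypothetical placeholders:

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
builder.max_batch_size = 256  # match max_bs=256 from line 230 instead of 1024

profile = builder.create_optimization_profile()
shape_without_batch = [256]   # hypothetical per-sample shape
profile.set_shape(
    "input",                      # hypothetical input name
    [1] + shape_without_batch,    # min  -> min_bs=1
    [128] + shape_without_batch,  # opt  -> opt_bs=128
    [256] + shape_without_batch,  # max  -> max_bs=256
)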
@hengxinCheung Please make sure you are using 22.03-tf1-py3, not 20.03-tf1-py3
docker pull nvcr.io/nvidia/tensorflow:22.03-tf1-py3
git clone https://github.com/Tencent/TPAT
cd TPAT/
vim Dockerfile (change 20.03 to 22.03)
docker build -f Dockerfile -t tensorflow-tpat:22.03-tf1-py3 .
nvidia-docker run -it --rm -v your_tpat_dir:tpat_dir_in_container --network=host tensorflow-tpat:22.03-tf1-py3 bash
@wenqf11 Thanks for your helpful advice. I did write the tag of the image wrong, and I ran the example successfully with the right image. It looks like the previous errors were all due to version mismatches (driver && cuda && tensorrt). But I get a big result difference when I build the TensorRT engine for my model (a BERT model) with plugins generated by TPAT. I am trying to solve this problem.
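A common first step for debugging such a difference is to compare the engine output with a framework reference on identical inputs and look at the error magnitude; a minimal sketch with dummy data (names and tolerances are placeholders):

import numpy as np

def report_diff(out_ref, out_trt, rtol=1e-3, atol=1e-3):
    # out_ref: reference output (e.g. from TensorFlow or onnxruntime)
    # out_trt: output of the TensorRT engine built with the TPAT plugin
    diff = np.abs(out_ref - out_trt)
    print("max abs diff:", diff.max(), "mean abs diff:", diff.mean())
    print("allclose:", np.allclose(out_ref, out_trt, rtol=rtol, atol=atol))

# dummy tensors so the sketch runs standalone
out_ref = np.random.rand(4, 8).astype(np.float32)
out_trt = out_ref + np.float32(1e-4)
report_diff(out_ref, out_trt)

Comparing node by node (e.g. by truncating the model just after the TPAT plugin) helps tell a plugin bug apart from ordinary FP16/FP32 accumulation differences.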
@hengxinCheung Hello, were you able to run the test_onehot_dynamic_direct.py example just by changing the image? I used the solution above, but when running the example I still hit: Value 'sm_89' is not defined for option 'gpu-architecture'.
Description
I tried to run the example test_onehot_dynamic_direct.py, but got a segmentation fault, which I found occurs in parser.parse(model.read()) (line 268). I would appreciate it if you could help me solve this problem.
Environment
Log