Tencent / TPAT

TensorRT Plugin Autogen Tool
Apache License 2.0

test_tpat error #16

Open hpz4311 opened 2 years ago

hpz4311 commented 2 years ago

I have built this project with the required gcc 7.3.0 and LLVM 9.0.1. My onnxruntime is 1.9.0 and my onnx is 1.10.0.

When I run test_tpat.py, I get the following error:

Traceback (most recent call last):
  File "test_tpat.py", line 3908, in test_abs()
  .............
  File "../python/onnx_to_plugin.py", line 98, in onnx2plugin
    input_model_path, nodes, plugin_name_dict
  File "../python/onnx_to_plugin.py", line 43, in generate_plugin_library
    cuda_kernel.run()
  File "../python/cuda_kernel.py", line 69, in run
    mod, params, self._target, include_simple_tasks = True, opt_level = op_level
TypeError: autoscheduler_get_tunning_tasks() got an unexpected keyword argument 'opt_level'

Can anyone help? Thank you!

hpz4311 commented 2 years ago

It seems that autoscheduler_get_tunning_tasks does not accept an "opt_level" parameter. I browsed all versions of TVM on GitHub and did not find any version in which "autoscheduler_get_tunning_tasks" takes an "opt_level" parameter. So which TVM version is the official one for TPAT, and what should I do?

buptqq commented 2 years ago

We have made some changes to the TVM source code, so you should use BlazerML-tvm (https://github.com/Tencent/TPAT/tree/main/3rdparty) instead of upstream TVM. We recommend using the Dockerfile directly; you can find it at https://github.com/Tencent/TPAT. The Dockerfile is based on an NVIDIA image: FROM nvcr.io/nvidia/tensorflow:20.06-tf1-py3. You can choose a different base image from https://catalog.ngc.nvidia.com/containers. Tip: the image must have TensorFlow and TensorRT installed.
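A quick way to tell whether Python is picking up a patched TVM or a stock one is to inspect the function signature instead of waiting for the TypeError. The sketch below is only illustrative: `stock_get_tasks` and `patched_get_tasks` are hypothetical stand-ins (the thread does not say where the real function lives), and the final lines simply report which TVM installation gets imported, if any.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    if param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs catch-all also accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for a stock and a patched task-extraction function.
def stock_get_tasks(mod, params, target, include_simple_tasks=False):
    return []


def patched_get_tasks(mod, params, target, include_simple_tasks=False, opt_level=3):
    return []


print(accepts_keyword(stock_get_tasks, "opt_level"))    # False
print(accepts_keyword(patched_get_tasks, "opt_level"))  # True

# Report which TVM build Python actually imports (if one is installed).
try:
    import tvm
    print("TVM imported from:", tvm.__file__)
except ImportError:
    print("TVM is not importable in this environment")
```

If the printed path does not point into the TPAT 3rdparty build, the interpreter is most likely still using another TVM installation.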

hpz4311 commented 2 years ago

Thanks for your suggestion. I repeated the experiment as you suggested, but I still get some errors, as follows.

WARNING:tensorflow:From test_tpat.py:33: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.
WARNING:tensorflow:From test_tpat.py:36: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
begin to process
2022-08-11 16:33:11.870908627 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:509 CreateExecutionProviderInstance] Failed to create TensorrtExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html#requirements to ensure all dependencies are met.
2022-08-11 16:33:11.870959029 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:535 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.
Couldn't find reusable plugin for node abs_0
Start auto-tuning!
2022-08-11 16:33:11.955133673 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:535 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.
URLError(gaierror(-2, 'Name or service not known'))
Download attempt 0/3 failed, retrying.
URLError(gaierror(-2, 'Name or service not known'))
Download attempt 1/3 failed, retrying.
WARNING:root:Failed to download tophub package for cuda: <urlopen error [Errno -2] Name or service not known>
Compile...
WARNING:auto_scheduler:/tmp/tuning.log does not exist!
WARNING:download:URLError(gaierror(-2, 'Name or service not known'))
Download attempt 0/3 failed, retrying.
WARNING:download:URLError(gaierror(-2, 'Name or service not known'))
Download attempt 1/3 failed, retrying.
WARNING:root:Failed to download tophub package for cuda: <urlopen error [Errno -2] Name or service not known>

Running...
2022-08-11 16:33:13.053279847 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:535 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.
2022-08-11 16:33:13.348876189 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:535 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.
rm -rf ./lib/tpat_abs_0.so ./obj/
if [ ! -d ./obj ]; then mkdir -p ./obj; fi
/opt/lib/cuda-10.2//bin/nvcc -w -std=c++11 -M -MT tpat_abs_0.o -I. -I/opt/lib/cuda-10.2//samples/common/inc -I/opt/lib/cuda-10.2//include -I/opt/lib/cuda-11.1/include -I/home/bitbrain/pzhu/third_package/tensorrt/TensorRT-7.0.0.11/include -I/usr/include -o tpat_abs_0.d src/tpat_abs_0.cu
/opt/lib/cuda-10.2//bin/nvcc -w -std=c++11 -I. -I/opt/lib/cuda-10.2//samples/common/inc -I/opt/lib/cuda-10.2//include -I/opt/lib/cuda-11.1/include -I/home/bitbrain/pzhu/third_package/tensorrt/TensorRT-7.0.0.11/include -I/usr/include -Xcompiler -fPIC -arch=sm_75 -o tpat_abs_0.o -c src/tpat_abs_0.cu
/opt/lib/cuda-10.2//bin/nvcc -w -std=c++11 -I. -I/opt/lib/cuda-10.2//samples/common/inc -I/opt/lib/cuda-10.2//include -I/opt/lib/cuda-11.1/include -I/home/bitbrain/pzhu/third_package/tensorrt/TensorRT-7.0.0.11/include -I/usr/include -Xcompiler -fPIC -arch=sm_75 -G -lineinfo -o tpat_abs_0.o -c src/tpat_abs_0.cu
g++ -w -std=c++11 -shared -o tpat_abs_0.so tpat_abs_0.o -L/opt/lib/cuda-10.2//lib64 -L/opt/lib/cuda-10.2//lib64 -L/home/bitbrain/pzhu/third_package/tensorrt/TensorRT-7.0.0.11/lib -lnvinfer -lcudart -lcuda -Wl,-rpath=/opt/lib/cuda-10.2//lib64 -Wl,-rpath=/opt/lib/cuda-10.2//lib64 -Wl,-rpath=/home/bitbrain/pzhu/third_package/tensorrt/TensorRT-7.0.0.11/lib
if [ ! -d ./lib ]; then mkdir -p ./lib; fi
mv *.o ./obj/
mv *.d ./obj/
mv *.so ./lib/
Onnx_name_mapping_trt_plugin: {'abs_0': 'tpat_abs_0'}
[TensorRT] ERROR: 3: getPluginCreator could not find plugin: tpat_abs_0 version: 1
In node 0 (importFallbackPluginImporter): UNSUPPORTED_NODE: Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
test_tpat.py:278: DeprecationWarning: Use build_serialized_network instead.
  engine = builder.build_engine(network, builder_config)
[TensorRT] ERROR: 4: [network.cpp::validate::2411] Error Code 4: Internal Error (Network must have at least one output)
[ERROR] engine is None
PyCUDA ERROR: The context stack was not empty upon module cleanup.
A context was still active when the context stack was being cleaned up. At this point in our execution, CUDA may already have been deinitialized, so there is no way we can finish cleanly. The program will be aborted now. Use Context.pop() to avoid this problem.

@buptqq
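
For anyone hitting the same "getPluginCreator could not find plugin" message, one way to narrow it down is to load the generated library by hand and look the creator up in TensorRT's plugin registry. This is a rough sketch, not TPAT's own code: it assumes the library ends up at ./lib/tpat_abs_0.so (where the build log moves it) and that it registers its creator when loaded, which the thread does not confirm. `find_creator` and `check_plugin` are hypothetical helper names; the demo at the bottom uses a fake creator so it runs without TensorRT.

```python
import ctypes
import os
from types import SimpleNamespace


def find_creator(creators, name, version="1", namespace=""):
    """Return the first creator matching name, version and namespace, else None."""
    for creator in creators:
        key = (creator.name, creator.plugin_version, creator.plugin_namespace)
        if key == (name, version, namespace):
            return creator
    return None


def check_plugin(so_path, name, version="1"):
    """Load so_path, then look `name` up in TensorRT's plugin registry."""
    import tensorrt as trt  # lazy import: find_creator works without TensorRT

    if not os.path.exists(so_path):
        raise FileNotFoundError(so_path)
    # Assumption: loading the library triggers its static plugin registration.
    ctypes.CDLL(so_path)
    registry = trt.get_plugin_registry()
    return find_creator(registry.plugin_creator_list, name, version)


# Demo with a fake creator object, so no GPU or TensorRT is needed here.
fake_creators = [
    SimpleNamespace(name="tpat_abs_0", plugin_version="1", plugin_namespace="")
]
print(find_creator(fake_creators, "tpat_abs_0") is not None)      # True
print(find_creator(fake_creators, "tpat_abs_0", version="2"))     # None
```

On the real machine, `check_plugin("./lib/tpat_abs_0.so", "tpat_abs_0")` returning None would suggest the library loaded but never registered a matching creator, while an OSError from `ctypes.CDLL` would point at a missing or mismatched CUDA/TensorRT shared library.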