I'm trying to write a unit test for flash attention using TensorRT-LLM version 0.14.0.dev2024100100.
I noticed that `host_runtime_perf_knobs` is a new feature in recent versions. Here is how I use it, along with the reported error:

    with tensorrt_llm.net_guard(net):
        attention_params=AttentionParams(
            sequence_length=sequence_length_tensor,
            context_lengths=context_lengths_tensor,
            host_request_types=host_request_types_tensor,
            max_context_length=context_length,
            host_context_lengths=host_context_lengths_tensor,
            host_runtime_perf_knobs=runtime_perf_knobs)

The error is:
[10/09/2024-06:22:33] [TRT] [E] IExecutionContext::enqueueV3: Error Code 3: API Usage Error (Parameter check failed, condition: mContext.profileObliviousBindings.at(profileObliviousIndex) != nullptr. Address is not set for input tensor host_runtime_perf_knobs. Call setInputTensorAddress or setTensorAddress before enqueue/execute.)
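
The message indicates that the engine exposes `host_runtime_perf_knobs` as an input tensor, but no buffer address was bound to it before `enqueueV3`. A minimal sketch for listing such unbound inputs follows. It assumes the test can reach the underlying TensorRT `ICudaEngine` and `IExecutionContext` objects, and that `get_tensor_address` returns a null (falsy) value for a tensor whose address was never set. `unbound_input_tensors` is a hypothetical helper name, not part of TensorRT or TensorRT-LLM.

```python
def unbound_input_tensors(engine, context):
    """Return the names of engine input tensors with no address set on the context.

    `engine` is expected to behave like tensorrt.ICudaEngine and `context`
    like tensorrt.IExecutionContext (name-based tensor API, TensorRT 8.5+).
    """
    missing = []
    for i in range(engine.num_io_tensors):
        name = engine.get_tensor_name(i)
        # TensorIOMode is an enum; compare by member name so this sketch
        # does not need to import tensorrt itself.
        is_input = engine.get_tensor_mode(name).name == "INPUT"
        # Assumption: an unset address is reported as 0 (or another falsy value).
        if is_input and not context.get_tensor_address(name):
            missing.append(name)
    return missing
```

If `host_runtime_perf_knobs` shows up in the returned list right before execution, one plausible cause is that the test declares the tensor in the network but never passes a matching host buffer among the inputs supplied at run time.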
Any ideas why?