During looped TensorRT model inference, “pycuda._driver.LogicError: cuMemHostAlloc failed: OS call failed or operation not supported on this OS” is raised partway through the loop at host_mem = cuda.pagelocked_empty(size, dtype)
Environment
TensorRT Version: 8.6.1
NVIDIA GPU: T4
NVIDIA Driver Version: 515.65.01
CUDA Version: 11.7
CUDNN Version: 8.2.1
Operating System: CentOS
I don't know how to think about this issue without a reproducer. Also, given that the error mentions TensorRT (which PyCUDA doesn't interact with), there's a good chance it's unrelated to PyCUDA.
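For what it's worth, cuMemHostAlloc allocates pinned (page-locked) host memory, which is a limited resource, so one common way to hit this error partway through a loop is allocating a fresh pinned buffer on every iteration while keeping the old ones alive. Below is a minimal sketch of the usual pattern: allocate the buffers once and reuse them across iterations. The buffer size, dtype, and iteration count are made-up placeholders, and the actual TensorRT inference call is elided.

```python
import numpy as np
import pycuda.autoinit  # noqa: F401 -- initializes CUDA and creates a context
import pycuda.driver as cuda

# Placeholder size/dtype for illustration; substitute the real binding shape.
SIZE = 1 << 20
DTYPE = np.float32

# Allocate the pinned host buffer and the device buffer ONCE, before the loop.
# Calling cuda.pagelocked_empty() on every iteration while holding references
# to the results steadily consumes pinned memory until cuMemHostAlloc fails.
host_mem = cuda.pagelocked_empty(SIZE, DTYPE)
device_mem = cuda.mem_alloc(host_mem.nbytes)
stream = cuda.Stream()

for _ in range(10_000):
    # Reuse the same buffers each iteration instead of reallocating them.
    host_mem[:] = np.random.rand(SIZE).astype(DTYPE)  # stand-in for real input
    cuda.memcpy_htod_async(device_mem, host_mem, stream)
    # ... launch TensorRT inference on device_mem here ...
    cuda.memcpy_dtoh_async(host_mem, device_mem, stream)
    stream.synchronize()
```

If per-iteration buffers are genuinely required, make sure old ones are actually released (drop all references so PyCUDA can free the pinned allocation) rather than letting them accumulate, e.g. in a list that outlives the loop.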