Closed wrx1990 closed 2 months ago
I think there are two possible reasons. If you are on a GPU, it is going OOM; the model needs ~6 GB of memory for inference. If you are on a CPU, the model is not meant to run on CPU and will crash (due to its Triton usage).
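A quick sanity check along these lines may help confirm which case you're hitting. This is a hypothetical sketch (the helper name and the ~6 GB threshold from the comment above are assumptions); in practice the free-memory figure could come from `torch.cuda.mem_get_info()`:

```python
def enough_gpu_memory(free_bytes: int, required_gb: float = 6.0) -> bool:
    """Return True if the free GPU memory covers the ~6 GB the model
    reportedly needs for inference; otherwise an OOM is likely."""
    return free_bytes / 1024**3 >= required_gb

# Example: 8 GB free is enough, 4 GB free is not.
print(enough_gpu_memory(8 * 1024**3))  # True
print(enough_gpu_memory(4 * 1024**3))  # False
```

If no CUDA GPU is available at all, the Triton kernels will not run, so a CPU-only environment would fall into the second failure mode regardless of memory.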
The problem was that my CUDA version is lower than the required version.
When I run the demo code, as soon as
predictions = model.infer(rgb)
is executed, my notebook suddenly disconnects. How can I solve this?