### What is the issue?
Loading a model that won't fit in VRAM (16 GB) on my RX 6800 XT GPU produces an error. This is on Linux, running in the Ubuntu ROCm Docker container.
Here is the…
### Describe the bug
I downloaded the "bge-reranker-v2-minicpm-layerwise" model weights to the server and registered this model (the registered model name is "bge-reranker-v2-minicpm-layerwise-self")…
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.1.2+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…