hao-ai-lab / MuxServe


Can we support two models each occupying a separate GPU independently? #3

Open Thunderbeee opened 4 hours ago

Thunderbeee commented 4 hours ago

I attempted to make two models occupy two different GPUs independently, but it failed with `an illegal memory access was encountered` (CUDA kernel errors might be asynchronously reported at some other API call, so the stack trace below might be incorrect). I would like to know whether this is caused by a problem with my configuration (shown below) or whether the current approach simply doesn't support this setup. Thanks very much.
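As a general CUDA debugging step (not specific to MuxServe), asynchronous kernel errors like this one usually produce a misleading stack trace; setting `CUDA_LAUNCH_BLOCKING=1` forces synchronous launches so the traceback points at the kernel that actually faulted. A minimal sketch, where the launch script name is a placeholder for whatever command is actually used:

```python
import os
import sys

# CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous, so the Python
# stack trace identifies the faulting kernel instead of a later API call.
env = dict(os.environ, CUDA_LAUNCH_BLOCKING="1")

# Hypothetical launch command; substitute the real MuxServe entry point.
# import subprocess
# subprocess.run([sys.executable, "muxserve_launch.py"], env=env)

print(env["CUDA_LAUNCH_BLOCKING"])
```

With blocking launches enabled, the re-run is slower but the reported failure site is trustworthy, which helps distinguish a configuration problem from a genuine bug in cross-GPU placement.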

[Screenshot 2024-11-18 at 2:03:12 AM: configuration]
Thunderbeee commented 4 hours ago
[Screenshot 2024-11-18 at 2:05:58 AM]