hao-ai-lab / MuxServe


Can we support two models each occupying a separate GPU independently? #3

Open Thunderbeee opened 4 hours ago

Thunderbeee commented 4 hours ago

I attempted to make two models occupy two different GPUs independently, but it failed with `an illegal memory access was encountered` (CUDA kernel errors might be asynchronously reported at some other API call, so the stack trace below might be incorrect). I would like to know whether this is caused by a problem with my configuration (shown below) or whether the current approach simply doesn't support this setup. Thanks very much.
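As a general CUDA debugging step (not specific to MuxServe), asynchronous kernel errors like this one usually produce a misleading stack trace; setting `CUDA_LAUNCH_BLOCKING=1` forces synchronous launches so the traceback points at the kernel that actually faulted. A minimal sketch, where the launch script name is a placeholder for whatever command is actually used:

```python
import os
import sys

# CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous, so the Python
# stack trace identifies the faulting kernel instead of a later API call.
env = dict(os.environ, CUDA_LAUNCH_BLOCKING="1")

# Hypothetical launch command; substitute the real MuxServe entry point.
# import subprocess
# subprocess.run([sys.executable, "muxserve_launch.py"], env=env)

print(env["CUDA_LAUNCH_BLOCKING"])
```

With blocking launches enabled, the re-run is slower but the reported failure site is trustworthy, which helps distinguish a configuration problem from a genuine bug in cross-GPU placement.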

[Screenshot 2024-11-18 at 2:03:12 AM: configuration]
Thunderbeee commented 4 hours ago
[Screenshot 2024-11-18 at 2:05:58 AM]