gotyer opened 3 months ago
With multiple GPUs, we see "FULL GPU OFFLOAD POSSIBLE" for a lot of models, but when trying to load them, it crashes with a memory error.
Please either implement multi-GPU offload or correct your memory calculation so as not to mislead the user.
For now I will have to use ollama, since it supports multi-GPU offload, but I would rather keep using LM studio for convenience.
GPUs: 2x RTX 3090 (24 GB each)
Model: meta-llama/Meta-Llama-3.1-70B-Instruct (IQ3_M, 34.27 GB)
Full error:

```json
{
  "memory": {
    "ram_capacity": "63.90 GB",
    "ram_unused": "51.85 GB"
  },
  "gpu": {
    "gpu_names": [
      "NVIDIA GeForce RTX 3090",
      "NVIDIA GeForce RTX 3090"
    ],
    "vram_recommended_capacity": "48.00 GB",
    "vram_unused": "45.55 GB"
  },
  "os": {
    "platform": "win32",
    "version": "10.0.22631"
  },
  "app": {
    "version": "0.2.31",
    "downloadsDir": "---\lm-studio\models"
  },
  "model": {}
}
```