Open llmlocal opened 1 month ago
I appreciate the team's hard work and understand the design decision to separate the 4 models onto separate GPUs. For small lab experiments, is it possible to leverage a 2 x 4090 configuration?
You could try QLoRA + 8B models.
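As a rough sanity check (all numbers here are hypothetical back-of-envelope assumptions, not measurements), a quick VRAM estimate suggests why an 8B model under QLoRA can fit comfortably within a single 4090's 24 GB, leaving room to co-locate models:

```python
def qlora_vram_estimate_gb(params_b: float, lora_frac: float = 0.01) -> float:
    """Back-of-envelope VRAM estimate for QLoRA fine-tuning.

    Assumed (rough, hypothetical) accounting:
      - base weights quantized to 4-bit (0.5 bytes/param)
      - LoRA adapters ~1% of params in bf16 (2 bytes/param)
      - AdamW moment states for adapters only (2 x 4 bytes/param, fp32)
      - ~20% extra for activations / CUDA context at modest batch sizes
    """
    base = params_b * 1e9 * 0.5                 # 4-bit quantized base weights
    adapters = params_b * 1e9 * lora_frac * 2   # bf16 LoRA adapter weights
    optimizer = params_b * 1e9 * lora_frac * 8  # fp32 Adam m and v for adapters
    total = (base + adapters + optimizer) * 1.2
    return total / 1e9

# Under these assumptions an 8B model needs only ~5-6 GB,
# well within a 4090's 24 GB budget.
print(f"{qlora_vram_estimate_gb(8):.1f} GB")
```

The dominant term is the 4-bit base weights; because QLoRA trains only the small adapter matrices, the optimizer state (usually the largest cost in full fine-tuning) nearly vanishes. Actual usage depends heavily on sequence length, batch size, and KV-cache for any inference-time models, so treat this as a starting point, not a guarantee.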