In gpt4all, when instantiating your model, you can pass a `device=` param, as laid out in the python bindings readme: https://github.com/nomic-ai/gpt4all/tree/6f38fde80b2a604fa4678779547921e9be48b092/gpt4all-bindings/python. Specifically, they say:

```python
from gpt4all import GPT4All

model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf", device='gpu')  # device='amd', device='intel'
output = model.generate("The capital of France is ", max_tokens=3)
print(output)
```
I don't see anything in llm-gpt4all that passes this along. Being able to would be helpful.

As a short test case, I directly edited `llm_gpt4all.py` to force this in (line 166, just adding `device='gpu'`), and it seemed to work: the same prompt I had been running finished in roughly 1/3 of the time, and my GPU usage cranked up to 97%. I am running on a Linux system with an Nvidia RTX 3070, if that matters.
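If anyone wants to reproduce the comparison outside of llm entirely, a standalone timing check is easy (sketch only; swap in whatever model file you have downloaded, and note `device='gpu'` needs a GPU-capable gpt4all build):

```python
import time

from gpt4all import GPT4All

PROMPT = "The capital of France is "

# Time the same prompt on CPU and then on GPU.
for device in ("cpu", "gpu"):
    model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf", device=device)
    start = time.time()
    output = model.generate(PROMPT, max_tokens=64)
    print(f"{device}: {time.time() - start:.1f}s -> {output!r}")
```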
I'm not sure what the best way to pass this param in would be. At first I thought of adding it to the options on the CLI (like `llm --options device gpu ...`), but it looks like options are focused on things that can be tweaked per-prompt and passed to gpt4all's `generate()` method, not on instantiating the object in the first place. I'm not sure there is anywhere better, though. I haven't looked closely at how llm passes things to llm_gpt4all either, so options may already be the best way to get it from there to here.
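If options do turn out to be the right vehicle, one sketch of what that could look like, assuming llm's standard nested `Options` mechanism (the class and attribute names here are illustrative, not the plugin's actual code):

```python
from typing import Optional

import llm
from gpt4all import GPT4All


class Gpt4AllModel(llm.Model):
    model_id = "orca-mini-3b"  # illustrative

    class Options(llm.Options):
        # New instantiation-time option, alongside whatever
        # generate-time options (temperature, etc.) already exist.
        device: Optional[str] = None  # 'cpu', 'gpu', 'amd', 'intel'

    def execute(self, prompt, stream, response, conversation):
        # Pull device out before the remaining options go to generate().
        device = prompt.options.device or "cpu"
        gpt_model = GPT4All(self.model_path, device=device)  # model_path is illustrative
        yield gpt_model.generate(prompt.prompt)
```

One wrinkle with this approach: options are per-prompt, while `device=` is really a per-instantiation setting, so the plugin would either need to construct the GPT4All object inside `execute()` (as sketched) or cache instances keyed on the requested device.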