PromtEngineer / localGPT

Chat with your documents on your local device using GPT models. No data leaves your device, and it is 100% private.
Apache License 2.0

Not more than 8 vCPUs? #545

Open Stef-33560 opened 11 months ago

Stef-33560 commented 11 months ago

Hi,

I'm trying to play with localGPT to understand how LLMs work, inside an LXC container on my Proxmox server (a homelab based on an i5-1340P, 16 vCPUs / 32 GB RAM).

I tried several engines to compare performance, but none of them uses more than 8 vCPUs. Even when I use a multilingual model, the answer is always in English.

Since some of you, as I've read, run monster machines ;) it's surely not limited to 8. Is there a switch somewhere to use more CPUs?

Thanks !

PromtEngineer commented 11 months ago

Change this line: https://github.com/PromtEngineer/localGPT/blob/15e96488b67eb4145d743f5ef4f3cc9e7102bbb7/constants.py#L20
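
For reference, that constant caps the worker pool used while ingesting documents. A minimal sketch of the tweak, assuming the linked line defines the INGEST_THREADS constant (the comments are mine):

    # constants.py -- sketch, assuming the linked line defines INGEST_THREADS
    import os

    # os.cpu_count() reports logical cores; "or 8" is only a fallback for
    # when the count can't be determined. Pin a fixed number instead if you
    # want to leave cores free for other containers, e.g. INGEST_THREADS = 12.
    INGEST_THREADS = os.cpu_count() or 8

As far as I can tell, this constant only drives ingestion (ingest.py); the thread count used for inference is set separately when the model is loaded.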

Stef-33560 commented 11 months ago

Thanks :)

But...

Ingestion climbs to 75% CPU, sometimes a bit more. But prompt processing never goes above 50% of my 16 vCPUs (Proxmox, no CUDA, just CPU).

os.cpu_count() gives 16, not 8. Maybe it's down to the models used? (i.e. the default ones)

Any idea ?

davidesba commented 11 months ago

Add this to load_models.py, where the kwargs for the model are built (make sure os is imported at the top of the file):

    if device_type.lower() == "cpu":
        # use every logical core (llama-cpp-python defaults to about half)
        kwargs["n_threads"] = os.cpu_count()