simonw / llm-gpt4all

Plugin for LLM adding support for the GPT4All collection of models
Apache License 2.0

Upgrade to models3.json to get Llama 3 #29

Open simonw opened 7 months ago

simonw commented 7 months ago

We currently use models2.json: https://github.com/simonw/llm-gpt4all/blob/67079c00fa64cba4f163c4579c2c4aab2c91f45a/llm_gpt4all.py#L44-L49

Looks like they introduced models3.json two months ago: https://github.com/nomic-ai/gpt4all/commit/b8f5c74f40def7622a7e4b5aa86fadb473f39046

And that's where Meta-Llama-3-8B-Instruct.Q4_0.gguf is defined: https://github.com/nomic-ai/gpt4all/commit/0b78b79b1c79998e86ca30e5de86f1980ee6ed9f

simonw commented 7 months ago

After switching to models3.json the llm models command lists these:

gpt4all: all-MiniLM-L6-v2-f16 - SBert, 43.76MB download, needs 1GB RAM (installed)
gpt4all: orca-mini-3b-gguf2-q4_0 - Mini Orca (Small), 1.84GB download, needs 4GB RAM (installed)
gpt4all: mistral-7b-instruct-v0 - Mistral Instruct, 3.83GB download, needs 8GB RAM (installed)
gpt4all: mistral-7b-openorca - Mistral OpenOrca, 3.83GB download, needs 8GB RAM (installed)
gpt4all: all-MiniLM-L6-v2 - SBert, 43.82MB download, needs 1GB RAM
gpt4all: nomic-embed-text-v1 - Nomic Embed Text v1, 261.58MB download, needs 1GB RAM
gpt4all: nomic-embed-text-v1 - Nomic Embed Text v1.5, 261.58MB download, needs 1GB RAM
gpt4all: replit-code-v1_5-3b-newbpe-q4_0 - Replit, 1.82GB download, needs 4GB RAM
gpt4all: mpt-7b-chat - MPT Chat, 3.54GB download, needs 8GB RAM
gpt4all: orca-2-7b - Orca 2 (Medium), 3.56GB download, needs 8GB RAM
gpt4all: rift-coder-v0-7b-q4_0 - Rift coder, 3.56GB download, needs 8GB RAM
gpt4all: mpt-7b-chat-newbpe-q4_0 - MPT Chat, 3.64GB download, needs 8GB RAM
gpt4all: em_german_mistral_v01 - EM German Mistral, 3.83GB download, needs 8GB RAM
gpt4all: ghost-7b-v0 - Ghost 7B v0.9.1, 3.83GB download, needs 8GB RAM
gpt4all: Nous-Hermes-2-Mistral-7B-DPO - Nous Hermes 2 Mistral DPO, 3.83GB download, needs 8GB RAM
gpt4all: gpt4all-falcon-newbpe-q4_0 - GPT4All Falcon, 3.92GB download, needs 8GB RAM
gpt4all: Meta-Llama-3-8B-Instruct - Llama 3 Instruct, 4.34GB download, needs 8GB RAM
gpt4all: gpt4all-13b-snoozy-q4_0 - Snoozy, 6.86GB download, needs 16GB RAM
gpt4all: wizardlm-13b-v1 - Wizard v1.2, 6.86GB download, needs 16GB RAM
gpt4all: orca-2-13b - Orca 2 (Full), 6.86GB download, needs 16GB RAM
gpt4all: nous-hermes-llama2-13b - Hermes, 6.86GB download, needs 16GB RAM
gpt4all: starcoder-newbpe-q4_0 - Starcoder, 8.37GB download, needs 4GB RAM

Including Meta-Llama-3-8B-Instruct.
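
For reference, a line in this listing can be rebuilt from a single metadata entry. The sketch below is an illustration under stated assumptions, not the plugin's actual code: the field names (`name`, `filename`, `filesize`, `ramrequired`) are assumed to match the models JSON, the `filesize` value is a rounded placeholder, and `format_size` and `describe` are hypothetical helpers.

```python
def format_size(num_bytes: int) -> str:
    """Render a byte count using 1024-based steps (an assumption about the listing)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB"):
        if size < 1024:
            return f"{size:.2f}{unit}"
        size /= 1024
    return f"{size:.2f}GB"


def describe(entry: dict) -> str:
    """Build one listing line from a metadata entry (hypothetical helper)."""
    # Assumption: the model ID is the filename up to its first dot.
    model_id = entry["filename"].split(".")[0]
    size = format_size(int(entry["filesize"]))
    return (
        f"gpt4all: {model_id} - {entry['name']}, "
        f"{size} download, needs {entry['ramrequired']}GB RAM"
    )


# Illustrative entry; the filesize is a rounded placeholder, not the real value.
llama3 = {
    "name": "Llama 3 Instruct",
    "filename": "Meta-Llama-3-8B-Instruct.Q4_0.gguf",
    "filesize": "4660000000",
    "ramrequired": "8",
}
print(describe(llama3))
```

With 1024-based units, a file of roughly 4.66 billion bytes prints as 4.34GB, which may be why a download progress bar and this listing can show different figures for the same file.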

simonw commented 7 months ago

I'll upgrade dependencies to https://pypi.org/project/gpt4all/2.5.1/ (the latest release) too: https://github.com/simonw/llm-gpt4all/blob/67079c00fa64cba4f163c4579c2c4aab2c91f45a/setup.py#L32-L36

simonw commented 7 months ago

llm -m Meta-Llama-3-8B-Instruct 'hello'
100%|████████████████████████████████████████████████████████████████████████████████████████████████| 4.66G/4.66G [14:29<00:00, 5.36MiB/s]
Error: Unable to instantiate model: code=45, Model format not supported (no matching implementation found)

simonw commented 7 months ago

Upgrading the dependency fixed that.

llm -m Meta-Llama-3-8B-Instruct 'hello'

But the output was:

&9!&,(>',"&9:9(8"96889%',8&'''#.,#$($:#$$96#$",.8>>9,#>&69(8>##(##(<,"8!6>,$#>.6$9.("(99"#$#!%"6<.9(8::$.6(:('88:&.%&(6#>#>>%'<.&.:,!%#$<(9!":$,$".8%.%"$>.:8'6'#"#!$<:"666$&#.,,'6(8$<(6>><8$'(8(%",,,"

So something isn't right.

simonw commented 7 months ago

All of my other models are broken after the upgrade too.

simonw commented 7 months ago

OK, now they are working. I deleted files from ~/.cache/gpt4all and re-downloaded them, but I don't see how that would have affected Llama 3, since that file should not have been downloaded fresh.
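
Before deleting anything, it can help to see what is actually in the cache directory. This is a minimal, read-only sketch: `list_cached_models` is a hypothetical helper, and the only path it assumes is the ~/.cache/gpt4all location mentioned above.

```python
from pathlib import Path


def list_cached_models(cache_dir: Path) -> list[tuple[str, int]]:
    """Return (filename, size in bytes) for each regular file in cache_dir, sorted by name."""
    if not cache_dir.is_dir():
        return []
    return sorted(
        (path.name, path.stat().st_size)
        for path in cache_dir.iterdir()
        if path.is_file()
    )


if __name__ == "__main__":
    # Nothing is deleted here; the loop only prints sizes and names.
    for name, size in list_cached_models(Path.home() / ".cache" / "gpt4all"):
        print(f"{size:>14,d}  {name}")
```

A file whose size is far smaller than the advertised download is a likely candidate for a partial or stale download.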

jmarrec commented 1 month ago

It seems that both of these are called "Llama-3":

gpt4all: Llama-3 - Llama 3.2 1B Instruct, 737.21MB download, needs 2GB RAM (installed)
gpt4all: Llama-3 - Llama 3.2 3B Instruct, 1.79GB download, needs 4GB RAM (installed)

I wanted to try out the 3B specifically. On the first llm -m "Llama-3" it downloaded the 3B and answered, but on the second try it downloaded the 1B model. I can't seem to tell it I want just the 3B?
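
One plausible explanation, offered as an assumption rather than a confirmed diagnosis: if the model ID is derived by keeping the filename up to its first dot, then any filename with a dotted version number collapses to a shared prefix. The sketch below uses a hypothetical `model_id` rule and assumed filenames to show how two distinct files could end up with the same ID.

```python
from collections import defaultdict


def model_id(filename: str) -> str:
    """Hypothetical ID rule: keep everything before the first dot."""
    return filename.split(".")[0]


def find_collisions(filenames: list[str]) -> dict[str, list[str]]:
    """Group filenames by derived ID and keep only IDs shared by several files."""
    groups = defaultdict(list)
    for filename in filenames:
        groups[model_id(filename)].append(filename)
    return {mid: names for mid, names in groups.items() if len(names) > 1}


# Assumed filenames, chosen for illustration only.
filenames = [
    "Llama-3.2-1B-Instruct-Q4_0.gguf",
    "Llama-3.2-3B-Instruct-Q4_0.gguf",
    "Meta-Llama-3-8B-Instruct.Q4_0.gguf",
]
print(find_collisions(filenames))
```

Under that rule both Llama 3.2 files map to "Llama-3", so an ID lookup cannot tell them apart. The same pattern would account for the two nomic-embed-text-v1 entries in the earlier listing.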