mrseanryan / gpt-diff

Use LLM to describe a difference graph between versions of a compositional document
MIT License

Try host local Mistral-7B-instruct #2

Open mrseanryan opened 11 months ago

mrseanryan commented 11 months ago

Quantized versions are here:

https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF

Q4_K_M

https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/blob/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf

```python
from ctransformers import AutoModelForCausalLM

# Set gpu_layers to the number of layers to offload to the GPU.
# Set it to 0 if no GPU acceleration is available on your system.
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-Instruct-v0.1-GGUF",
    model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",
    model_type="mistral",
    gpu_layers=50,
)

print(llm("AI is going to"))
```

See https://github.com/mrseanryan/gpt-workflow/tree/master/local-llm-q

mrseanryan commented 11 months ago

Works well enough (see local-llm-q folder) ...

mrseanryan commented 11 months ago

Possibly setting temperature = 1 (rather than 0.5) adds more detail to the DOT output.

Higher temperature does help with the DOT -> natural language step: more text is generated.
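One plausible explanation for the temperature effect above: sampling temperature divides the model's logits before the softmax, so a higher temperature flattens the token distribution and gives lower-ranked continuations more probability mass, which tends to produce longer, more varied output. A minimal sketch of that scaling (plain Python for illustration; this is not code from this repo):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into sampling probabilities, scaled by temperature.

    Higher temperature flattens the distribution, so alternative tokens
    are sampled more often -- one plausible reason a higher temperature
    yields more detailed DOT / natural-language output.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Example logits for three candidate tokens.
logits = [2.0, 1.0, 0.5]
cool = softmax_with_temperature(logits, 0.5)
warm = softmax_with_temperature(logits, 1.0)

# The top token dominates less at temperature 1.0 than at 0.5,
# leaving more probability for alternative continuations.
print(cool[0] > warm[0])  # → True
```

Whether temperature 1.0 vs 0.5 actually helps here is an empirical question for this workflow; the sketch only shows the mechanism.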