Getting prompts out of the model was very easy:
gpt_model = GPT4All(self.model.filename())
output = gpt_model.generate(prompt.prompt, streaming=True)
yield from output
What was harder is that the underlying C++ libraries log a LOT of stuff to stdout and stderr, in a way that can't easily be suppressed from Python.
With GPT-4's help I figured out this pattern:
import os
import sys

class SuppressOutput:
    def __enter__(self):
        # Save a copy of the current file descriptors for stdout and stderr
        self.stdout_fd = os.dup(1)
        self.stderr_fd = os.dup(2)
        # Open a file descriptor to /dev/null
        self.devnull_fd = os.open(os.devnull, os.O_WRONLY)
        # Replace stdout and stderr with /dev/null
        os.dup2(self.devnull_fd, 1)
        os.dup2(self.devnull_fd, 2)
        # Writes to sys.stdout and sys.stderr should still work
        self.original_stdout = sys.stdout
        self.original_stderr = sys.stderr
        sys.stdout = os.fdopen(self.stdout_fd, "w")
        sys.stderr = os.fdopen(self.stderr_fd, "w")

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Restore stdout and stderr to their original state
        os.dup2(self.stdout_fd, 1)
        os.dup2(self.stderr_fd, 2)
        # Close the saved copies of the original stdout and stderr file descriptors
        os.close(self.stdout_fd)
        os.close(self.stderr_fd)
        # Close the file descriptor for /dev/null
        os.close(self.devnull_fd)
        # Restore sys.stdout and sys.stderr
        sys.stdout = self.original_stdout
        sys.stderr = self.original_stderr
Used like this:
class Response(llm.Response):
    def iter_prompt(self, prompt):
        with SuppressOutput():
            gpt_model = GPT4All(self.model.filename())
            output = gpt_model.generate(prompt.prompt, streaming=True)
            yield from output
I wanted sys.stdout and sys.stderr to continue working when Python code called them, mainly to ensure the download progress bar still displayed if the model needed to be downloaded.
OK, it works well enough for a first draft.
An interesting wart is that a lot of these models aren't configured for instructions. Instead, the JSON file at https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json includes suggested prompt templates to get them to respond to a question, e.g.
{
    "order": "a",
    "md5sum": "4acc146dd43eb02845c233c29289c7c5",
    "name": "Hermes",
    "filename": "nous-hermes-13b.ggmlv3.q4_0.bin",
    "filesize": "8136777088",
    "requires": "2.4.7",
    "ramrequired": "16",
    "parameters": "13 billion",
    "quant": "q4_0",
    "type": "LLaMA",
    "description": "<strong>Best overall model</strong><br><ul><li>Instruction based<li>Gives long responses<li>Curated with 300,000 uncensored instructions<li>Trained by Nous Research<li>Cannot be used commercially</ul>",
    "url": "https://huggingface.co/TheBloke/Nous-Hermes-13B-GGML/resolve/main/nous-hermes-13b.ggmlv3.q4_0.bin",
    "promptTemplate": "### Instruction:\n%1\n### Response:\n"
}
I'm not yet doing anything with those, but maybe I should.
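If I do, it would presumably just be a string substitution, since %1 in promptTemplate is the placeholder for the user's prompt. A rough sketch (apply_prompt_template is a hypothetical helper, not something in the plugin yet):

def apply_prompt_template(template, prompt):
    # models.json uses %1 as the placeholder for the user's prompt
    return template.replace("%1", prompt)

# e.g. for the Hermes template above:
# apply_prompt_template("### Instruction:\n%1\n### Response:\n", "Say hi")
# -> '### Instruction:\nSay hi\n### Response:\n'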
llm models list
OpenAI Chat: gpt-3.5-turbo (aliases: 3.5, chatgpt)
OpenAI Chat: gpt-3.5-turbo-16k (aliases: chatgpt-16k, 3.5-16k)
OpenAI Chat: gpt-4 (aliases: 4, gpt4)
OpenAI Chat: gpt-4-32k (aliases: 4-32k)
gpt4all: orca-mini-3b - Orca (Small), 1.80GB on disk, needs 4GB RAM (installed)
gpt4all: ggml-gpt4all-j-v1 - Groovy, 3.53GB on disk, needs 8GB RAM (installed)
gpt4all: orca-mini-7b - Orca, 3.53GB on disk, needs 8GB RAM (installed)
gpt4all: nous-hermes-13b - Hermes, 7.58GB on disk, needs 16GB RAM (installed)
gpt4all: ggml-model-gpt4all-falcon-q4_0 - GPT4All Falcon, 3.78GB on disk, needs 8GB RAM
gpt4all: ggml-vicuna-7b-1 - Vicuna, 3.92GB on disk, needs 8GB RAM
gpt4all: ggml-wizardLM-7B - Wizard, 3.92GB on disk, needs 8GB RAM
gpt4all: ggml-mpt-7b-base - MPT Base, 4.52GB on disk, needs 8GB RAM
gpt4all: ggml-mpt-7b-instruct - MPT Instruct, 4.52GB on disk, needs 8GB RAM
gpt4all: ggml-mpt-7b-chat - MPT Chat, 4.52GB on disk, needs 8GB RAM
gpt4all: ggml-replit-code-v1-3b - Replit, 4.84GB on disk, needs 4GB RAM
gpt4all: orca-mini-13b - Orca (Large), 6.82GB on disk, needs 16GB RAM
gpt4all: GPT4All-13B-snoozy - Snoozy, 7.58GB on disk, needs 16GB RAM
gpt4all: ggml-vicuna-13b-1 - Vicuna (large), 7.58GB on disk, needs 16GB RAM
gpt4all: ggml-nous-gpt4-vicuna-13b - Nous Vicuna, 7.58GB on disk, needs 16GB RAM
gpt4all: ggml-stable-vicuna-13B - Stable Vicuna, 7.58GB on disk, needs 16GB RAM
gpt4all: wizardLM-13B-Uncensored - Wizard Uncensored, 7.58GB on disk, needs 16GB RAM
I haven't set up aliases for these yet. I should probably include aliases for a specific list of the more popular models, once I figure out what those are.
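Since llm plugins can pass aliases when registering models, this should mostly be a matter of maintaining a dictionary. A hypothetical sketch - the alias picks and the Gpt4AllModel name are made up for illustration:

import llm

ALIASES = {
    # Hypothetical picks - needs research into which models are popular
    "orca-mini-3b": ("orca-mini", "orca"),
    "nous-hermes-13b": ("hermes",),
}

@llm.hookimpl
def register_models(register):
    for model_id, aliases in ALIASES.items():
        register(Gpt4AllModel(model_id), aliases=aliases)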
Here's how the installation detection code works: https://github.com/simonw/llm-gpt4all/blob/1cb087bafb48f43ae6c1952f76b64287b494892c/llm_gpt4all.py#L70-L77
My code fetches https://gpt4all.io/models/models.json at most once an hour:
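Roughly this pattern - a simplified sketch, with httpx and the cache location as assumptions rather than the plugin's actual code:

import json
import pathlib
import time

import httpx

CACHE_PATH = pathlib.Path("~/.cache/llm-gpt4all/models.json").expanduser()

def fetch_models_json():
    # Serve the cached copy if it is less than an hour old
    if CACHE_PATH.exists() and time.time() - CACHE_PATH.stat().st_mtime < 3600:
        return json.loads(CACHE_PATH.read_text())
    response = httpx.get("https://gpt4all.io/models/models.json")
    response.raise_for_status()
    CACHE_PATH.parent.mkdir(parents=True, exist_ok=True)
    CACHE_PATH.write_text(response.text)
    return response.json()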
llm -m 'ggml-replit-code-v1-3b' 'A python function that donwloads a JSON file and saves it to disk, but only if it has not yet been saved'
58%|██████████████████████████████████████████████████████████████████████▉ | 3.00G/5.20G [01:17<00:42, 51.5MiB/s]
"""
from urllib2_ssl3 import buildUrlOpenerWithTimeout as openURLwithSSLtimeout; timeout = 10*60#10 minutes for the download of this data set! This is a very large file and will take some time to complete, so it's worthwhile having an extra second or two.
url_base="https://www2.census.gov/programmes-surveying"
from urllib2 import buildUrlOpenerWithTimeout as openURLwithSSLtimeout; timeout = 10*60#10 minutes for the download of this data set! This is a very large file and will take some time to complete, so it's worthwhile having an extra second or two.
url_base="https://www2.census.gov/programmes-surveying"
from urllib2 import buildUrlOpenerWithTimeout as openURLwithSSLggml_metal_free: deallocating
That "ggml_metal_free: deallocating" note at the end seems to be some C++ logging code that escaped from my SuppressOutput()
mechanism. Maybe it happens when Python objects themselves are deallocated outside of that block?
It does at least go to stderr so > code.py
won't capture it.
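If that's the cause, one possible fix (untested speculation on my part) would be to drop the last reference to the model while the file descriptors are still redirected:

class Response(llm.Response):
    def iter_prompt(self, prompt):
        with SuppressOutput():
            gpt_model = GPT4All(self.model.filename())
            output = gpt_model.generate(prompt.prompt, streaming=True)
            yield from output
            # Deallocate the model while stdout/stderr still point at
            # /dev/null, so any logging from the C++ destructor is swallowed
            del gpt_model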
Best result so far:
llm -m 'ggml-vicuna-13b-1' 'Ten fun names for a pet pelican'
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8.14G/8.14G [03:20<00:00, 40.6MiB/s]
, ten fun facts about pelicans and ten fun activities related to pelicans.
Pelicans are fascinating birds with unique characteristics that make them stand out from other bird species. Here are some fun names for a pelican:
1. Mr. P
2. Mrs. Pel
3. The Feathered Friend
4. Big Bird
5. Fluffy McFluffster
6. Captain Crunch
7. Nemo the Pelican
8. The Great White Pelican
9. The Pink Panther Pelican
10. The Rainbow Warrior Pelican
Here are ten fun facts about pelicans:
1. Pelicans have a distinctive pouch under their bills that they use to catch fish.
2. Pelicans can be found in many parts of the world, including North and South America, Europe, Africa, and Asia.
3. The largest species of pelican is the White Pelican, which can have a wingspan of up to 9 feet.
4. Pelicans are known for their unique behavior of diving into the water to catch fish.
5. Pelicans are social birds that often live in colonies with other bird species.
6. The Brown Pelican is one of the most iconic symbols of the United States, appearing on many state flags and seals.
7. Pelicans have a long lifespan, with some species living up to 20 years or more in captivity.
8. Despite their large size, pelicans are relatively lightweight due to their hollow bones and pneumatic muscles.
9. The Peruvian Pelican is one of the rarest species of pelican, with only a few thousand remaining in the wild.
10. Pelicans have been known to form close bonds with humans, particularly in captive settings where they are hand
Using https://docs.gpt4all.io/gpt4all_python.html