simonw / llm-gpt4all

Plugin for LLM adding support for the GPT4All collection of models
Apache License 2.0

Initial prototype #1

Closed: simonw closed this issue 12 months ago

simonw commented 12 months ago

Using https://docs.gpt4all.io/gpt4all_python.html
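The basic pattern from those docs looks like this (a minimal sketch to show the API shape; the model filename is just an example):

from gpt4all import GPT4All

# First use downloads the model file if it is not already on disk
gpt_model = GPT4All("orca-mini-3b.ggmlv3.q4_0.bin")

# streaming=True returns a generator of tokens rather than a single string
for token in gpt_model.generate("Ten fun names for a pet pelican", streaming=True):
    print(token, end="")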

simonw commented 12 months ago

Getting prompts out of the model was very easy:

gpt_model = GPT4All(self.model.filename())
output = gpt_model.generate(prompt.prompt, streaming=True)
yield from output

What was harder is that the underlying C++ libraries log a LOT of stuff to stdout and stderr, in a way that can't easily be suppressed from Python.

With GPT-4's help I figured out this pattern:

import os
import sys


class SuppressOutput:
    def __enter__(self):
        # Save a copy of the current file descriptors for stdout and stderr
        self.stdout_fd = os.dup(1)
        self.stderr_fd = os.dup(2)

        # Open a file to /dev/null
        self.devnull_fd = os.open(os.devnull, os.O_WRONLY)

        # Replace stdout and stderr with /dev/null
        os.dup2(self.devnull_fd, 1)
        os.dup2(self.devnull_fd, 2)

        # Writes to sys.stdout and sys.stderr should still work
        self.original_stdout = sys.stdout
        self.original_stderr = sys.stderr
        sys.stdout = os.fdopen(self.stdout_fd, "w")
        sys.stderr = os.fdopen(self.stderr_fd, "w")

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Point file descriptors 1 and 2 back at the saved copies of the
        # original stdout and stderr
        os.dup2(self.stdout_fd, 1)
        os.dup2(self.stderr_fd, 2)

        # Close the file descriptor for /dev/null
        os.close(self.devnull_fd)

        # Close the temporary wrappers: this flushes anything they have
        # buffered and also closes the duplicated file descriptors. Then
        # restore the original sys.stdout and sys.stderr.
        sys.stdout.close()
        sys.stderr.close()
        sys.stdout = self.original_stdout
        sys.stderr = self.original_stderr

Used like this:

    class Response(llm.Response):
        def iter_prompt(self, prompt):
            with SuppressOutput():
                gpt_model = GPT4All(self.model.filename())
                output = gpt_model.generate(prompt.prompt, streaming=True)
                yield from output

I wanted sys.stdout and sys.stderr to keep working when Python code wrote to them, mainly so that the download progress bar would still display if the model needed to be downloaded.
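A quick way to see the distinction (an illustrative sketch, not code from the plugin): Python-level writes still reach the terminal, while writes that go straight to the file descriptors are silenced.

import os

with SuppressOutput():
    # Goes through the rebound sys.stdout, so it still appears
    # (flush=True because the rebound stream is buffered)
    print("visible from Python", flush=True)
    # Goes straight to file descriptor 1, which now points at /dev/null
    os.write(1, b"hidden C-level write\n")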

simonw commented 12 months ago

OK, it works well enough for a first draft.

An interesting wart is that a lot of these models aren't set up to take instructions directly. Instead, the JSON file at https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json includes a suggested prompt template for getting each model to respond to a question, e.g.

  {
    "order": "a",
    "md5sum": "4acc146dd43eb02845c233c29289c7c5",
    "name": "Hermes",
    "filename": "nous-hermes-13b.ggmlv3.q4_0.bin",
    "filesize": "8136777088",
    "requires": "2.4.7",
    "ramrequired": "16",
    "parameters": "13 billion",
    "quant": "q4_0",
    "type": "LLaMA",
    "description": "<strong>Best overall model</strong><br><ul><li>Instruction based<li>Gives long responses<li>Curated with 300,000 uncensored instructions<li>Trained by Nous Research<li>Cannot be used commercially</ul>",
    "url": "https://huggingface.co/TheBloke/Nous-Hermes-13B-GGML/resolve/main/nous-hermes-13b.ggmlv3.q4_0.bin",
    "promptTemplate": "### Instruction:\n%1\n### Response:\n"
  }

I'm not yet doing anything with those, but maybe I should.
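If I do, the mechanics look simple: the template uses %1 as the placeholder for the user's prompt, so applying it is a single substitution. A hypothetical sketch (apply_template is a made-up helper):

def apply_template(prompt_template, user_prompt):
    # models.json uses %1 as the placeholder for the user's text
    return prompt_template.replace("%1", user_prompt)

full_prompt = apply_template(
    "### Instruction:\n%1\n### Response:\n",
    "Ten fun names for a pet pelican",
)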

simonw commented 12 months ago

llm models list
OpenAI Chat: gpt-3.5-turbo (aliases: 3.5, chatgpt)
OpenAI Chat: gpt-3.5-turbo-16k (aliases: chatgpt-16k, 3.5-16k)
OpenAI Chat: gpt-4 (aliases: 4, gpt4)
OpenAI Chat: gpt-4-32k (aliases: 4-32k)
gpt4all: orca-mini-3b - Orca (Small), 1.80GB on disk, needs 4GB RAM (installed)
gpt4all: ggml-gpt4all-j-v1 - Groovy, 3.53GB on disk, needs 8GB RAM (installed)
gpt4all: orca-mini-7b - Orca, 3.53GB on disk, needs 8GB RAM (installed)
gpt4all: nous-hermes-13b - Hermes, 7.58GB on disk, needs 16GB RAM (installed)
gpt4all: ggml-model-gpt4all-falcon-q4_0 - GPT4All Falcon, 3.78GB on disk, needs 8GB RAM
gpt4all: ggml-vicuna-7b-1 - Vicuna, 3.92GB on disk, needs 8GB RAM
gpt4all: ggml-wizardLM-7B - Wizard, 3.92GB on disk, needs 8GB RAM
gpt4all: ggml-mpt-7b-base - MPT Base, 4.52GB on disk, needs 8GB RAM
gpt4all: ggml-mpt-7b-instruct - MPT Instruct, 4.52GB on disk, needs 8GB RAM
gpt4all: ggml-mpt-7b-chat - MPT Chat, 4.52GB on disk, needs 8GB RAM
gpt4all: ggml-replit-code-v1-3b - Replit, 4.84GB on disk, needs 4GB RAM
gpt4all: orca-mini-13b - Orca (Large), 6.82GB on disk, needs 16GB RAM
gpt4all: GPT4All-13B-snoozy - Snoozy, 7.58GB on disk, needs 16GB RAM
gpt4all: ggml-vicuna-13b-1 - Vicuna (large), 7.58GB on disk, needs 16GB RAM
gpt4all: ggml-nous-gpt4-vicuna-13b - Nous Vicuna, 7.58GB on disk, needs 16GB RAM
gpt4all: ggml-stable-vicuna-13B - Stable Vicuna, 7.58GB on disk, needs 16GB RAM
gpt4all: wizardLM-13B-Uncensored - Wizard Uncensored, 7.58GB on disk, needs 16GB RAM

simonw commented 12 months ago

I haven't set up aliases for these yet. I should probably include aliases for a specific list of the more popular models, once I figure out what those are.
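Wiring one up should just be a matter of passing aliases to llm's register_models plugin hook. A sketch (the Gpt4AllModel class name and the "hermes" alias are made up for illustration):

import llm

@llm.hookimpl
def register_models(register):
    # Hypothetical model class and alias, for illustration only
    register(Gpt4AllModel("nous-hermes-13b"), aliases=("hermes",))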

Here's how the installation detection code works: https://github.com/simonw/llm-gpt4all/blob/1cb087bafb48f43ae6c1952f76b64287b494892c/llm_gpt4all.py#L70-L77
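The gist of it (a sketch assuming gpt4all's default download directory; the linked code is authoritative):

from pathlib import Path

# gpt4all's Python bindings download models to this directory by default
DEFAULT_MODEL_DIRECTORY = Path.home() / ".cache" / "gpt4all"

def is_installed(filename):
    # A model counts as installed if its file already exists on disk
    return (DEFAULT_MODEL_DIRECTORY / filename).exists()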

simonw commented 12 months ago

My code fetches https://gpt4all.io/models/models.json at most once an hour:

https://github.com/simonw/llm-gpt4all/blob/1cb087bafb48f43ae6c1952f76b64287b494892c/llm_gpt4all.py#L27-L31
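It's a standard stale-cache pattern: keep a copy on disk and only refetch once the cached file is more than an hour old. A sketch of the idea (the cache location here is illustrative, not the one the plugin uses):

import json
import time
from pathlib import Path
from urllib.request import urlopen

CACHE_PATH = Path.home() / ".cache" / "llm-gpt4all-models.json"

def get_model_index():
    # Reuse the cached copy if it is less than an hour old
    if CACHE_PATH.exists() and time.time() - CACHE_PATH.stat().st_mtime < 3600:
        return json.loads(CACHE_PATH.read_text())
    with urlopen("https://gpt4all.io/models/models.json") as response:
        data = response.read()
    CACHE_PATH.parent.mkdir(parents=True, exist_ok=True)
    CACHE_PATH.write_bytes(data)
    return json.loads(data)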

simonw commented 12 months ago

llm -m 'ggml-replit-code-v1-3b' 'A python function that donwloads a JSON file and saves it to disk, but only if it has not yet been saved'
58%|██████████████████████████████████████████████████████████████████████▉                                                    | 3.00G/5.20G [01:17<00:42, 51.5MiB/s]
    """

    from urllib2_ssl3  import buildUrlOpenerWithTimeout as openURLwithSSLtimeout; timeout = 10*60#10 minutes for the download of this data set! This is a very large file and will take some time to complete, so it's worthwhile having an extra second or two.
    url_base="https://www2.census.gov/programmes-surveying"

    from urllib2  import buildUrlOpenerWithTimeout as openURLwithSSLtimeout; timeout = 10*60#10 minutes for the download of this data set! This is a very large file and will take some time to complete, so it's worthwhile having an extra second or two.
    url_base="https://www2.census.gov/programmes-surveying"

    from urllib2  import buildUrlOpenerWithTimeout as openURLwithSSLggml_metal_free: deallocating

simonw commented 12 months ago

That "ggml_metal_free: deallocating" note at the end seems to be some C++ logging code that escaped from my SuppressOutput() mechanism. Maybe it happens when Python objects themselves are deallocated outside of that block?

It does at least go to stderr, so redirecting output with > code.py won't capture it.

simonw commented 12 months ago

Best result so far:

llm -m 'ggml-vicuna-13b-1' 'Ten fun names for a pet pelican'
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8.14G/8.14G [03:20<00:00, 40.6MiB/s]
, ten fun facts about pelicans and ten fun activities related to pelicans.

Pelicans are fascinating birds with unique characteristics that make them stand out from other bird species. Here are some fun names for a pelican:

1. Mr. P
2. Mrs. Pel
3. The Feathered Friend
4. Big Bird
5. Fluffy McFluffster
6. Captain Crunch
7. Nemo the Pelican
8. The Great White Pelican
9. The Pink Panther Pelican
10. The Rainbow Warrior Pelican

Here are ten fun facts about pelicans:

1. Pelicans have a distinctive pouch under their bills that they use to catch fish.
2. Pelicans can be found in many parts of the world, including North and South America, Europe, Africa, and Asia.
3. The largest species of pelican is the White Pelican, which can have a wingspan of up to 9 feet.
4. Pelicans are known for their unique behavior of diving into the water to catch fish.
5. Pelicans are social birds that often live in colonies with other bird species.
6. The Brown Pelican is one of the most iconic symbols of the United States, appearing on many state flags and seals.
7. Pelicans have a long lifespan, with some species living up to 20 years or more in captivity.
8. Despite their large size, pelicans are relatively lightweight due to their hollow bones and pneumatic muscles.
9. The Peruvian Pelican is one of the rarest species of pelican, with only a few thousand remaining in the wild.
10. Pelicans have been known to form close bonds with humans, particularly in captive settings where they are hand