nomic-ai / gpt4all

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
https://nomic.ai/gpt4all
MIT License
69.01k stars, 7.57k forks

Python Binding: Unnecessary Dependency on active internet connection with allow_download=True #2713

Closed: margau closed this issue 1 month ago

margau commented 1 month ago

Bug Report

With allow_download=True, gpt4all needs an internet connection even if the model is already available.

Example Code

Steps to Reproduce

  1. Start gpt4all from a Python script (e.g. the example code) with allow_download=True (the default).
  2. Let it download the model.
  3. Restart the script later while offline.
  4. gpt4all crashes.

Expected Behavior

It should work offline when the model has already been downloaded.

Your Environment

Proposed Solution

The model check (https://github.com/nomic-ai/gpt4all/blob/54ed30937f717286083c7071e288393a3c49a769/gpt4all-bindings/python/gpt4all/gpt4all.py#L309) should only be done when the local model is unavailable (https://github.com/nomic-ai/gpt4all/blob/54ed30937f717286083c7071e288393a3c49a769/gpt4all-bindings/python/gpt4all/gpt4all.py#L338), so that cached models can be used without internet connectivity.
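The ordering proposed here could look roughly like the sketch below. It is not the gpt4all source: `resolve_model` and the `fetch_remote` callback are hypothetical names used only for illustration, and the sketch covers only locating the model file.

```python
from pathlib import Path
from typing import Callable


def resolve_model(
    model_name: str,
    model_dir: Path,
    allow_download: bool,
    fetch_remote: Callable[[str], Path],
) -> Path:
    """Return the path to a model file, touching the network only if needed.

    `fetch_remote` is a hypothetical stand-in for whatever performs the
    remote check and download; it is not a gpt4all function.
    """
    local_path = model_dir / model_name
    if local_path.is_file():
        # A cached copy exists, so skip the remote check entirely.
        return local_path
    if not allow_download:
        raise FileNotFoundError(f"Model file not found: {local_path}")
    # No local copy and downloads are allowed: only now go online.
    return fetch_remote(model_name)
```

With this ordering, a script that finds the file on disk never calls `fetch_remote`, so it would keep working without a connection.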

cosmic-snow commented 1 month ago

This is not a bug.

The constructor parameter allow_download is not only for retrieving models, but also for the metadata in models3.json to populate session templates. This was previously mentioned in the old documentation and you can now find that explanation in the wiki.
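A possible user-side workaround, sketched under assumptions: pass allow_download=False whenever the model file is already on disk, so that the constructor should not need a connection. The helper name `offline_safe_kwargs` and the file name in the usage comment are hypothetical; `model_name`, `model_path` and `allow_download` are GPT4All constructor parameters. The trade-off follows from the explanation above: with downloads disabled, the models3.json metadata is not retrieved, so session templates are not populated from it.

```python
from pathlib import Path


def offline_safe_kwargs(model_name: str, model_dir: Path) -> dict:
    """Build GPT4All constructor arguments that avoid network access when possible.

    Hypothetical helper, not part of gpt4all.
    """
    have_local_copy = (model_dir / model_name).is_file()
    return {
        "model_name": model_name,
        "model_path": str(model_dir),
        # Disable downloads when a local copy exists; allow them otherwise.
        # With downloads disabled, models3.json metadata is not fetched either.
        "allow_download": not have_local_copy,
    }


# Usage sketch (requires the gpt4all package; the file name is a placeholder):
# from gpt4all import GPT4All
# model = GPT4All(**offline_safe_kwargs("some-model.gguf", Path.home() / "models"))
```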

Does that resolve the issue for you?

margau commented 1 month ago

Solved. I looked at https://docs.gpt4all.io/gpt4all_python/ref.html#gpt4all.gpt4all.GPT4All.__init__ and did not see this behavior in the "allow_download" description there.

cosmic-snow commented 1 month ago

No worries, I had a hunch it just wasn't clear.