edgar971 / open-chat

A self-hosted, offline, ChatGPT-like chatbot with support for multiple LLMs. 100% private, with no data leaving your device.
MIT License

Running custom text generation models #6

Closed · corndog2000 closed this issue 1 year ago

corndog2000 commented 1 year ago

Hi, is there a way to run a custom model that is available on Hugging Face? I tried adding the download link in the Docker setup, and it gave an error when running:

```
/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
  warnings.warn(
llama.cpp: loading model from /models/gpt-j-6b
error loading model: unknown (magic, version) combination: 04034b50, 08080000; is this really a GGML file?
llama_load_model_from_file: failed to load model
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/server/__main__.py", line 46, in <module>
    app = create_app(settings=settings)
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/server/app.py", line 317, in create_app
    llama = llama_cpp.Llama(
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama.py", line 328, in __init__
    assert self.model is not None
AssertionError
```
(this warning and traceback repeat two more times in the log as the container restarts)
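A note on the error itself: llama.cpp reads the first four bytes of the file as a little-endian uint32 magic number, and `04034b50` corresponds to the bytes `PK\x03\x04`, the ZIP local-file-header signature. PyTorch checkpoints such as `pytorch_model.bin` are ZIP archives, which is why llama.cpp rejects the file as "not really a GGML file". A minimal sketch for checking this yourself, assuming the file path from the log above:

```python
# Minimal sketch: read the model file's magic number the same way llama.cpp
# does (first four bytes, as a little-endian uint32) and map it to known
# formats. The magic table covers the GGML-era formats plus the ZIP
# signature that PyTorch checkpoints carry.
import struct

KNOWN_MAGICS = {
    0x67676D6C: "ggml (unversioned GGML)",
    0x67676D66: "ggmf (GGML v1)",
    0x67676A74: "ggjt (versioned GGML, what llama.cpp expected at the time)",
    0x04034B50: "ZIP archive -- e.g. a PyTorch pytorch_model.bin, not GGML",
}

with open("/models/gpt-j-6b", "rb") as f:  # path taken from the log above
    (magic,) = struct.unpack("<I", f.read(4))

print(f"magic {magic:08x}: {KNOWN_MAGICS.get(magic, 'unknown format')}")
```

Run against the downloaded `pytorch_model.bin`, this prints the ZIP entry, matching the `04034b50` in the error message.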
The container log also shows the Next.js frontend starting up (this block likewise repeats on each restart):

```
/models/gpt-j-6b model found.
Initializing server with:
Batch size: 2096
Number of CPU threads: 8
Context window: 4096

> ai-chatbot-starter@0.1.0 start
> next start

ready - started server on 0.0.0.0:3000, url: http://localhost:3000
```
corndog2000 commented 1 year ago

Here is my container config: [screenshot of container config]

I used this as the Model Download URL: https://huggingface.co/EleutherAI/gpt-j-6b/resolve/main/pytorch_model.bin

Here is the huggingface.co URL: https://huggingface.co/EleutherAI/gpt-j-6b

edgar971 commented 1 year ago

Hi, we are using llama.cpp, which currently supports GGML versions of models: https://github.com/abetlen/llama-cpp-python. Is there a GGML version of that model?
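As an aside, one way to check whether a GGML conversion has been published is the Hub's public model-search API. The sketch below is illustrative only: the query string is an assumption, and even a GGML file may not load unless its architecture is one llama.cpp supports (LLaMA-family models at the time).

```python
# Illustrative sketch: search the Hugging Face Hub for GGML conversions of a
# model. The query is an example; any result still has to be an architecture
# that llama.cpp can actually load.
import json
import urllib.request

url = "https://huggingface.co/api/models?search=gpt-j+ggml&limit=5"
with urllib.request.urlopen(url) as resp:
    for model in json.load(resp):
        print(model["id"])
```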