linhcentrio opened this issue 4 months ago
You can use Gemma via Ollama or LM Studio (LM Studio provides a local server that can stand in for the OpenAI API, so you can use it with the "openailike" mode in the settings-vllm.yaml file).
If you follow the setup steps for either Ollama or the "openailike" setup for LM Studio (using its local inference server), you can use Gemma.
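For LM Studio, that "openailike" setup amounts to pointing PrivateGPT's OpenAI-compatible client at LM Studio's local server. A rough sketch of the relevant part of settings-vllm.yaml (key names follow the stock file, so verify them against your PrivateGPT version; port 1234 is LM Studio's default local server port, and the model name is just whatever you loaded in LM Studio):

llm:
  mode: openailike

openai:
  api_base: http://localhost:1234/v1  # LM Studio's local inference server
  api_key: EMPTY                      # LM Studio does not check the key
  model: gemma-2b-it                  # whichever model is loaded in LM Studio

You would then launch with the profile that matches the file suffix (PGPT_PROFILES=vllm, given the settings-<profile>.yaml naming convention).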
@ingridstevens can you please help me here? Trying to use Gemma via Ollama gives me these errors:
Traceback (most recent call last):
File "Path\to\project\venv\Lib\site-packages\urllib3\connection.py", line 174, in _new_conn
conn = connection.create_connection(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\urllib3\util\connection.py", line 95, in create_connection
raise err
File "Path\to\project\venv\Lib\site-packages\urllib3\util\connection.py", line 85, in create_connection
sock.connect(sa)
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "Path\to\project\venv\Lib\site-packages\urllib3\connectionpool.py", line 715, in urlopen
httplib_response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\urllib3\connectionpool.py", line 416, in _make_request
conn.request(method, url, **httplib_request_kw)
File "Path\to\project\venv\Lib\site-packages\urllib3\connection.py", line 244, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "C:\Users\sunrise\AppData\Local\Programs\Python\Python311\Lib\http\client.py", line 1286, in request
self._send_request(method, url, body, headers, encode_chunked)
File "C:\Users\sunrise\AppData\Local\Programs\Python\Python311\Lib\http\client.py", line 1332, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "C:\Users\sunrise\AppData\Local\Programs\Python\Python311\Lib\http\client.py", line 1281, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "C:\Users\sunrise\AppData\Local\Programs\Python\Python311\Lib\http\client.py", line 1041, in _send_output
self.send(msg)
File "C:\Users\sunrise\AppData\Local\Programs\Python\Python311\Lib\http\client.py", line 979, in send
self.connect()
File "Path\to\project\venv\Lib\site-packages\urllib3\connection.py", line 205, in connect
conn = self._new_conn()
^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\urllib3\connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x000001EE1A8065D0>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "Path\to\project\venv\Lib\site-packages\requests\adapters.py", line 486, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\urllib3\connectionpool.py", line 799, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\urllib3\util\retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/generate/ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001EE1A8065D0>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "Path\to\project\venv\Lib\site-packages\gradio\queueing.py", line 495, in call_prediction
output = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\gradio\route_utils.py", line 231, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\gradio\blocks.py", line 1594, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\gradio\blocks.py", line 1188, in call_function
prediction = await utils.async_iteration(iterator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\gradio\utils.py", line 513, in async_iteration
return await iterator.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\gradio\utils.py", line 639, in asyncgen_wrapper
response = await iterator.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\gradio\chat_interface.py", line 487, in _stream_fn
first_response = await async_iteration(generator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\gradio\utils.py", line 513, in async_iteration
return await iterator.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\gradio\utils.py", line 506, in __anext__
return await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\gradio\utils.py", line 489, in run_sync_iterator_async
return next(iterator)
^^^^^^^^^^^^^^
File "Path\to\project\private_gpt\ui\ui.py", line 159, in _chat
llm_stream = self._chat_service.stream_chat(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\private_gpt\server\chat\chat_service.py", line 145, in stream_chat
streaming_response = chat_engine.stream_chat(
^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\llama_index\callbacks\utils.py", line 39, in wrapper
return func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\llama_index\chat_engine\simple.py", line 111, in stream_chat
chat_stream=self._llm.stream_chat(all_messages)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\llama_index\llms\base.py", line 187, in wrapped_llm_chat
f_return_val = f(_self, messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\llama_index\llms\ollama.py", line 124, in stream_chat
completion_response = self.stream_complete(prompt, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\llama_index\llms\base.py", line 313, in wrapped_llm_predict
f_return_val = f(_self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\llama_index\llms\ollama.py", line 146, in stream_complete
response = requests.post(
^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\requests\api.py", line 115, in post
return request("post", url, data=data, json=json, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\requests\api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\requests\sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\requests\sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "Path\to\project\venv\Lib\site-packages\requests\adapters.py", line 519, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/generate/ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001EE1A8065D0>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))
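The root error is the ConnectionRefusedError on localhost:11434: nothing is accepting connections on the Ollama port, so PrivateGPT never reaches the model at all. Before changing any settings, it's worth confirming the Ollama server is actually up. A minimal check, as a sketch that assumes the default Ollama port and its /api/tags listing endpoint:

import requests

OLLAMA_URL = "http://localhost:11434"

try:
    # /api/tags lists the models the local Ollama server has pulled
    resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5)
    resp.raise_for_status()
    names = [m["name"] for m in resp.json().get("models", [])]
    print("Ollama is reachable. Local models:", names)
except requests.exceptions.ConnectionError:
    print(f"Nothing is listening on {OLLAMA_URL} - start Ollama "
          "(`ollama serve` or the desktop app) and pull the model first.")

If this prints the connection error, start Ollama and pull the model before launching PrivateGPT again.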
This is my settings.yaml:
# The default configuration file.
# More information about configuration can be found in the documentation: https://docs.privategpt.dev/
# Syntax in `private_pgt/settings/settings.py`
server:
  env_name: ${APP_ENV:prod}
  port: ${PORT:8001}
  cors:
    enabled: true
    allow_origins: ["*"]
    allow_methods: ["*"]
    allow_headers: ["*"]
  auth:
    enabled: false
    # python -c 'import base64; print("Basic " + base64.b64encode("secret:key".encode()).decode())'
    # 'secret' is the username and 'key' is the password for basic auth by default
    # If the auth is enabled, this value must be set in the "Authorization" header of the request.
    secret: "Basic hello"
    key: "moto"

data:
  local_data_folder: local_data/private_gpt

ui:
  enabled: true
  path: /
  default_chat_system_prompt: >
    You are a helpful, respectful and honest assistant.
    Always answer as helpfully as possible and follow ALL given instructions.
    Do not speculate or make up information.
    Do not reference any given instructions or context.
  default_query_system_prompt: >
    You can only answer questions about the provided context.
    If you know the answer but it is not based in the provided context, don't provide
    the answer, just state the answer is not in the context provided.
  delete_file_button_enabled: true
  delete_all_files_button_enabled: true

llm:
  mode: ollama
  # Should be matching the selected model
  max_new_tokens: 2048
  context_window: 8192
  tokenizer: mistralai/Mistral-7B-Instruct-v0.2

embedding:
  # Should be matching the value above in most cases
  mode: local
  ingest_mode: simple

vectorstore:
  database: qdrant

qdrant:
  path: local_data/private_gpt/qdrant

pgvector:
  host: localhost
  port: 5432
  database: postgres
  user: postgres
  password: postgres
  embed_dim: 1024 # 384 is for BAAI/bge-small-en-v1.5 1024 for BAAI/bge-m3
  schema_name: private_gpt
  table_name: embeddings

local:
  prompt_style: "mistral"
  # llm_hf_repo_id: TheBloke/Mistral-7B-Instruct-v0.2-GGUF
  # llm_hf_model_file: mistral-7b-instruct-v0.2.Q4_K_M.gguf
  llm_hf_repo_id: lmstudio-ai/gemma-2b-it-GGUF
  llm_hf_model_file: gemma-2b-it-q4_k_m.gguf
  embedding_hf_model_name: BAAI/bge-m3
  # embedding_hf_model_name: BAAI/bge-small-en-v1.5

sagemaker:
  llm_endpoint_name: huggingface-pytorch-tgi-inference-2023-09-25-19-53-32-140
  embedding_endpoint_name: huggingface-pytorch-inference-2023-11-03-07-41-36-479

openai:
  api_key: ${OPENAI_API_KEY:}
  model: gpt-3.5-turbo

ollama:
  model: gemma:2b
Setting PGPT_PROFILES to ollama also gives errors:
(venv) PS Path\to\project> PGPT_PROFILES=ollama poetry run python -m private_gpt
PGPT_PROFILES=ollama : The term 'PGPT_PROFILES=ollama' is not recognized as the name of a cmdlet, function, script
file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is
correct and try again.
At line:1 char:1
+ PGPT_PROFILES=ollama poetry run python -m private_gpt
+ ~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : ObjectNotFound: (PGPT_PROFILES=ollama:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException
(venv) PS Path\to\project> set PGPT_PROFILES=ollama poetry run python -m private_gpt
Set-Variable : A positional parameter cannot be found that accepts argument 'run'.
At line:1 char:1
+ set PGPT_PROFILES=ollama poetry run python -m private_gpt
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidArgument: (:) [Set-Variable], ParameterBindingException
+ FullyQualifiedErrorId : PositionalParameterNotFound,Microsoft.PowerShell.Commands.SetVariableCommand
@Vivek-C-Shah For using Ollama, I never used or modified the default settings.yaml file. Instead I created a separate file for Ollama. This is what worked for me on my Mac; I don't know whether it will work for you on Windows.
You need to create a settings-ollama.yaml file with the following:
llm:
  mode: ollama

ollama:
  model: gemma:2b-instruct  # Required Model to use.
  # Note: Ollama Models are listed here: https://ollama.ai/library
  # Be sure to pull the model to your Ollama server
  api_base: http://localhost:11434  # Ollama defaults to http://localhost:11434
Then make sure Ollama is running with:
ollama run gemma:2b-instruct
and install the local dependency group with:
poetry install --with local
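If the gemma:2b-instruct tag hasn't been downloaded yet, `ollama run` will pull it on first use, but pulling it explicitly makes failures easier to spot (same model tag as above; adjust if you use a different one):

ollama pull gemma:2b-instruct
ollama list   # the tag should appear here before you start PrivateGPT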
Also, try setting the PGPT_PROFILES variable on its own line:
export PGPT_PROFILES=ollama
then check that it's set with:
echo $PGPT_PROFILES
and then run:
make run
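One Windows-specific note: the PowerShell errors earlier in this thread happen because bash-style `export` and inline `VAR=value command` prefixes don't work in PowerShell. The equivalent there is to set the variable with the `$env:` syntax on its own line and then launch the app:

$env:PGPT_PROFILES = "ollama"
poetry run python -m private_gpt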
Okay, thanks dude for this information! 🫂
Please, how can I get the Gemma model supported?