Open mr250624 opened 5 months ago
hi. Sorry for the stupid Q, but I can see you access the llama model via http://localhost:11434, and you have parameters for model_path, model, batch_size etc., but I don't see any code that uses those model parameters. Do you run up the LLM separately?
Not a stupid question at all.
I did make note of it in the article I wrote, but I probably needed to call it out more clearly and flag it in the repo. I'll update the README.md. I'm running Ollama locally. This is what's actually 'running' the LLM that I use in the code.
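In other words, Ollama serves the model over a small HTTP API on localhost:11434, and the Python side just sends requests to it. A minimal stdlib sketch of that interaction (the model name is a placeholder, and it assumes Ollama is already serving):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default endpoint


def build_chat_payload(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # request one JSON reply instead of a stream of chunks
    }


def chat(model: str, prompt: str) -> str:
    """POST a single chat turn to a locally running Ollama and return the reply text."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/chat",
        data=json.dumps(build_chat_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]
```

The LangChain wrappers in the repo do essentially this under the hood, which is why nothing in the code loads model weights directly.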
thanks Andy! OK, I have the Ollama model running locally now, and I can interact with it via something like ollama run llama3 "how great is github?", which responds instantly. The RAG part seems to be working well: I put a PDF into the to_process folder and that's been processed. All good so far :) I've run backend.py and npm start, and the web front end pops up on localhost:3000. When I type a message I can see an HTTP POST, but I never get any response... any idea what I've set up wrong? Thanks again. Here is the logging with debug enabled, plus a few extra prints I added that really don't help much :(
python3 backend.py
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): huggingface.co:443
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "HEAD /sentence-transformers/all-MiniLM-L6-v2/resolve/main/modules.json HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "HEAD /sentence-transformers/all-MiniLM-L6-v2/resolve/main/config_sentence_transformers.json HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "HEAD /sentence-transformers/all-MiniLM-L6-v2/resolve/main/README.md HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "HEAD /sentence-transformers/all-MiniLM-L6-v2/resolve/main/modules.json HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "HEAD /sentence-transformers/all-MiniLM-L6-v2/resolve/main/sentence_bert_config.json HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "HEAD /sentence-transformers/all-MiniLM-L6-v2/resolve/main/config.json HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "HEAD /sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer_config.json HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "GET /api/models/sentence-transformers/all-MiniLM-L6-v2/revision/main HTTP/1.1" 200 6069
INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: cuda
INFO:chromadb.telemetry.product.posthog:Anonymized telemetry enabled. See https://docs.trychroma.com/telemetry for more information.
DEBUG:chromadb.config:Starting component System
DEBUG:chromadb.config:Starting component Posthog
DEBUG:chromadb.config:Starting component OpenTelemetryClient
DEBUG:chromadb.config:Starting component SqliteDB
DEBUG:chromadb.config:Starting component QuotaEnforcer
DEBUG:chromadb.config:Starting component LocalSegmentManager
DEBUG:chromadb.config:Starting component SegmentAPI
INFO:chromadb.api.segment:Collection langchain is not created.
should have a prompt
input_variables=['chat_history', 'context', 'input'] input_types={'chat_history': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]} messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], template="You are an assistant for question-answering tasks. You are named 'NydasBot'. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\n\n{context}")), MessagesPlaceholder(variable_name='chat_history'), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}'))]
DEBUG:asyncio:Using selector: EpollSelector
INFO:websockets.server:server listening on 127.0.0.1:8765
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): us-api.i.posthog.com:443
DEBUG:urllib3.connectionpool:https://us-api.i.posthog.com:443 "POST /batch/ HTTP/1.1" 200 None
(Then I run npm start...)
DEBUG:websockets.server:= connection is CONNECTING
DEBUG:websockets.server:< GET / HTTP/1.1
DEBUG:websockets.server:< Host: localhost:8765
DEBUG:websockets.server:< Connection: Upgrade
DEBUG:websockets.server:< Pragma: no-cache
DEBUG:websockets.server:< Cache-Control: no-cache
DEBUG:websockets.server:< User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36
DEBUG:websockets.server:< Upgrade: websocket
DEBUG:websockets.server:< Origin: http://localhost:3000
DEBUG:websockets.server:< Sec-WebSocket-Version: 13
DEBUG:websockets.server:< Accept-Encoding: gzip, deflate, br, zstd
DEBUG:websockets.server:< Accept-Language: en-GB,en-US;q=0.9,en;q=0.8
DEBUG:websockets.server:< Sec-WebSocket-Key: 7TvGKO/I2/v9BTAqtQ85mw==
DEBUG:websockets.server:< Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits
DEBUG:websockets.server:> HTTP/1.1 101 Switching Protocols
DEBUG:websockets.server:> Upgrade: websocket
DEBUG:websockets.server:> Connection: Upgrade
DEBUG:websockets.server:> Sec-WebSocket-Accept: gFMwDD/7/HNP0WPfCD+VMigylSc=
DEBUG:websockets.server:> Sec-WebSocket-Extensions: permessage-deflate; server_max_window_bits=12; client_max_window_bits=12
DEBUG:websockets.server:> Date: Fri, 24 May 2024 10:36:09 GMT
DEBUG:websockets.server:> Server: Python/3.10 websockets/12.0
INFO:websockets.server:connection open
DEBUG:websockets.server:= connection is OPEN
DEBUG:websockets.server:< CLOSE 1001 (going away) [2 bytes]
DEBUG:websockets.server:= connection is CLOSING
DEBUG:websockets.server:> CLOSE 1001 (going away) [2 bytes]
DEBUG:websockets.server:x half-closing TCP connection
DEBUG:websockets.server:= connection is CLOSED
INFO:websockets.server:connection closed
DEBUG:websockets.server:= connection is CONNECTING
DEBUG:websockets.server:< GET / HTTP/1.1
DEBUG:websockets.server:< Host: localhost:8765
DEBUG:websockets.server:< Connection: Upgrade
DEBUG:websockets.server:< Pragma: no-cache
DEBUG:websockets.server:< Cache-Control: no-cache
DEBUG:websockets.server:< User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36
DEBUG:websockets.server:< Upgrade: websocket
DEBUG:websockets.server:< Origin: http://localhost:3000
DEBUG:websockets.server:< Sec-WebSocket-Version: 13
DEBUG:websockets.server:< Accept-Encoding: gzip, deflate, br, zstd
DEBUG:websockets.server:< Accept-Language: en-GB,en-US;q=0.9,en;q=0.8
DEBUG:websockets.server:< Sec-WebSocket-Key: yiUAXC5bvHBUJGYtgu3p5w==
DEBUG:websockets.server:< Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits
DEBUG:websockets.server:> HTTP/1.1 101 Switching Protocols
DEBUG:websockets.server:> Upgrade: websocket
DEBUG:websockets.server:> Connection: Upgrade
DEBUG:websockets.server:> Sec-WebSocket-Accept: aBMxqIjGeHHGrOKayQqZhar4++s=
DEBUG:websockets.server:> Sec-WebSocket-Extensions: permessage-deflate; server_max_window_bits=12; client_max_window_bits=12
DEBUG:websockets.server:> Date: Fri, 24 May 2024 10:36:09 GMT
DEBUG:websockets.server:> Server: Python/3.10 websockets/12.0
INFO:websockets.server:connection open
DEBUG:websockets.server:= connection is OPEN
DEBUG:websockets.server:% sending keepalive ping
DEBUG:websockets.server:> PING 01 81 ea 07 [binary, 4 bytes]
DEBUG:websockets.server:< PONG 01 81 ea 07 [binary, 4 bytes]
DEBUG:websockets.server:% received keepalive pong
DEBUG:websockets.server:% sending keepalive ping
DEBUG:websockets.server:> PING bd 84 ed c0 [binary, 4 bytes]
DEBUG:websockets.server:< PONG bd 84 ed c0 [binary, 4 bytes]
DEBUG:websockets.server:% received keepalive pong
(Now I type a message and hit Return.....)
DEBUG:websockets.server:< TEXT '{"input":"how great is github"}' [31 bytes]
DEBUG:langchain_core.tracers.base:Parent run 8bcb4908-e253-4d3c-9c9a-bb4161b05a46 not found for run bfa236ec-404a-46e5-b63d-8409ebd8cc0e. Treating as a root run.
DEBUG:chromadb.config:Starting component PersistentLocalHnswSegment
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:11434
DEBUG:urllib3.connectionpool:Resetting dropped connection: us-api.i.posthog.com
DEBUG:urllib3.connectionpool:http://localhost:11434 "POST /api/chat HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:https://us-api.i.posthog.com:443 "POST /batch/ HTTP/1.1" 200 None
Are you able to connect to the backend using Postman? It would be good to see if we can identify whether it is a frontend or backend issue. If you can connect to the websocket with Postman and get a response, it's an issue with the React page I pulled together.
BTW: I'm Australia based, and it's Friday night. I might not respond right away.
hi Andy. Same issue using Postman. The websocket seems to connect fine, but on a send, when the code gets to conversational_rag_chain.invoke, I get this debug output and it never returns from that call.
return conversational_rag_chain.invoke(
    {"input": input_text},
    config={"configurable": {"session_id": session_id}},
)["answer"]
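One thing worth ruling out (an assumption, since the surrounding handler code isn't shown here): if that synchronous invoke runs directly inside the async websocket handler, it blocks the event loop while the chain works, and the connection can appear hung. Offloading it to a thread keeps the socket serviced. A sketch, with a stand-in function in place of the real chain:

```python
import asyncio


def blocking_invoke(input_text: str) -> str:
    # Stand-in for conversational_rag_chain.invoke(...)["answer"],
    # which is a synchronous, potentially long-running call.
    return f"answer to: {input_text}"


async def handle_message(input_text: str) -> str:
    # Run the blocking call in a worker thread so the event loop
    # can keep servicing keepalive pings and other connections.
    return await asyncio.to_thread(blocking_invoke, input_text)


print(asyncio.run(handle_message("how great is github")))
```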
DEBUG:langchain_core.tracers.base:Parent run 8bcb4908-e253-4d3c-9c9a-bb4161b05a46 not found for run bfa236ec-404a-46e5-b63d-8409ebd8cc0e. Treating as a root run.
DEBUG:chromadb.config:Starting component PersistentLocalHnswSegment
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:11434
DEBUG:urllib3.connectionpool:Resetting dropped connection: us-api.i.posthog.com
DEBUG:urllib3.connectionpool:http://localhost:11434 "POST /api/chat HTTP/1.1" 200 None
(pause of less than 0.5 secs)
DEBUG:urllib3.connectionpool:https://us-api.i.posthog.com:443 "POST /batch/ HTTP/1.1" 200 None
Do I need a LangChain API key? I created a LangSmith account and set my API key using
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="..."
and in backend.py I added
import getpass
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()
I see more in the debug output now, but I'm not sure it's helpful:
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.smith.langchain.com:443
DEBUG:langchain_core.tracers.base:Parent run 99d42217-74b9-41c6-8416-eab9c0075d61 not found for run fa304ebd-2607-4a41-8b9b-61212d5521d4. Treating as a root run.
DEBUG:chromadb.config:Starting component PersistentLocalHnswSegment
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:11434
DEBUG:urllib3.connectionpool:https://api.smith.langchain.com:443 "GET /info HTTP/1.1" 200 209
DEBUG:urllib3.connectionpool:https://api.smith.langchain.com:443 "POST /runs/batch HTTP/1.1" 403 22
WARNING:langsmith.client:Failed to batch ingest runs: LangSmithError('Failed to POST https://api.smith.langchain.com/runs/batch in LangSmith API. HTTPError(\'403 Client Error: Forbidden for url: https://api.smith.langchain.com/runs/batch\', \'{"detail":"Forbidden"}\')')
DEBUG:urllib3.connectionpool:https://us-api.i.posthog.com:443 "POST /batch/ HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:http://localhost:11434 "POST /api/chat HTTP/1.1" 200 None
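For what it's worth, that 403 on /runs/batch usually just means the key isn't valid for LangSmith ingestion; tracing is optional and shouldn't affect whether the chain returns. Disabling it removes that error path entirely (a config sketch):

```shell
# LangSmith tracing is opt-in; turning it off stops the /runs/batch calls
export LANGCHAIN_TRACING_V2="false"
unset LANGCHAIN_API_KEY
```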
I'll keep playing