Tech-Watt / AI-ChatBot-


Made another version for you to try out that uses the Ollama LLM only #1

Open jamieduk opened 4 months ago

jamieduk commented 4 months ago

Here is a version I made and wanted to share with you as a thank-you for your code. Updated version: https://github.com/jamieduk/AI-ChatBot-Ollama-With-Storage

# https://github.com/jamieduk/AI-ChatBot-Ollama-With-Storage
import os
import time
import re
from dotenv import load_dotenv
from langchain_community.llms import Ollama
from langchain_core.output_parsers import StrOutputParser

# Load environment variables
load_dotenv()

def format_duration(duration):
    hours=duration // 3600
    minutes=(duration % 3600) // 60
    seconds=duration % 60

    if hours > 0:
        return f"{int(hours)} hour{'s' if hours > 1 else ''}, {int(minutes)} minute{'s' if minutes > 1 else ''}, {seconds:.2f} seconds"
    elif minutes > 0:
        return f"{int(minutes)} minute{'s' if minutes > 1 else ''}, {seconds:.2f} seconds"
    else:
        return f"{seconds:.2f} seconds"

def store_embeddings(question, answer, filename="output.txt"):
    # Append the question/answer pair as plain text (despite the name, no
    # embeddings are computed here). Newlines in the answer are collapsed so
    # each pair stays exactly two lines, which find_answer_in_data relies on.
    answer=str(answer).replace("\n", " ")
    with open(filename, "a", encoding="utf-8") as f:
        f.write(f"Q: {question}\nA: {answer}\n")

def find_answer_in_data(question, filename="output.txt"):
    # Check whether the question has been asked before by scanning the file,
    # which stores strictly alternating "Q: ..." / "A: ..." line pairs.
    if os.path.exists(filename):
        with open(filename, "r", encoding="utf-8") as f:
            lines=f.readlines()
            for i in range(0, len(lines) - 1, 2):
                if lines[i].strip() == f"Q: {question}":
                    return lines[i + 1].strip().replace("A: ", "", 1)
    return None

# Initialize Ollama local language model
print("Initializing Ollama local language model...")
try:
    ollama_llm=Ollama(model='dolphin-llama3:latest')  # Adjust the model name as needed
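    # Tip: run `ollama list` in a terminal to see which models are installed locally.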
    print("Ollama local language model initialized.")
except Exception as e:
    print(f"Failed to initialize Ollama local language model: {e}")
    exit(1)

# Get user input
context=input("Enter the context (leave blank for default): ")
question=input("Enter your question: ")

# Check if the question has been asked before
print("Checking if the question exists in data file...")
answer=find_answer_in_data(question)

if answer:
    print("Answer found in data file:", answer)
else:
    # Combine context and question into a single input string
    input_text=f"Context: {context}\nQuestion: {question}"
    print("No answer found in data file.")
    print("Combined input text:", input_text)

    # Invoke the chain with the user's input and measure the duration
    print("Invoking the chain with the user's input...")
    start_time=time.time()  # Record start time
    try:
        # Debugging print statement before invoking
        print("Before invoking Ollama model")

        # Simplified test for invoking the model
        chain_output=ollama_llm.invoke(input=input_text)

        # Debugging print statement after invoking
        print("After invoking Ollama model")

        end_time=time.time()  # Record end time
        duration=end_time - start_time  # Calculate duration

        if not chain_output:
            print("Chain invocation returned no output.")
        else:
            print("Chain invoked successfully.")
            print("Raw output from chain:", chain_output)

    except Exception as e:
        print(f"Failed to invoke the chain: {e}")
        exit(1)

    # Format and print the duration
    print("Query took", format_duration(duration))

    # If there is valid output, store it in the output.txt file and embeddings.txt file
    if chain_output:
        print("Storing the result in output.txt")
        store_embeddings(question, chain_output, "output.txt")
        print("Storing the result in embeddings.txt")
        store_embeddings(question, chain_output, "embeddings.txt")

    # Strip non-alphanumeric characters from output for final response
    alphanumeric_output=re.sub(r'\W+', ' ', chain_output)
    print("Final Response:", alphanumeric_output)

This script initializes an AI-powered chatbot using the Ollama local language model and lets users interact with it by providing a question and, optionally, a context. The two inputs are combined into a single prompt string that is sent to the model. The script also supports persistent storage: every question/answer pair is appended to "output.txt" (and mirrored to "embeddings.txt"), and repeated questions are answered directly from that file without querying the model again.

Compared to the original version, this updated script offers several improvements. Firstly, it handles the absence of the stored-answer file gracefully, avoiding errors when trying to read a non-existent file. Additionally, it caches every answer and checks the cache before invoking the model, so repeated questions are served instantly from disk. Furthermore, it enhances user interaction by prompting for context and question inputs, and it wraps model initialization and invocation in error handling. Lastly, it times each query, so runs can be compared across models. Overall, these enhancements improve the robustness, usability, and functionality of the chatbot, making it a more effective tool for providing information and assistance to users.
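For reference, here is a minimal sketch of how the file-backed cache behaves, reusing the two helper functions defined in the script above (the question text is just an example, run against an empty output.txt):

store_embeddings("What is 2+2?", "4")           # appends "Q: What is 2+2?" / "A: 4" to output.txt
print(find_answer_in_data("What is 2+2?"))      # -> "4" (served from the file, no model call)
print(find_answer_in_data("Unseen question"))   # -> None (the model would be queried)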

Tech-Watt commented 4 months ago

Okay

On Thu, May 16, 2024 at 12:53 PM Jay wrote:

Here is a version I made and wanted to share with you as a thank-you for your code:

#!/usr/bin/env python
# coding: utf-8

import os
from dotenv import load_dotenv
from langchain_community.llms import Ollama
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_community.document_loaders import TextLoader

# Load environment variables
load_dotenv()

# Initialize Ollama local language model
print("Initializing Ollama local language model...")
ollama_llm=Ollama(model='dolphin-llama3:latest')  # llama3
print("Ollama local language model initialized.")

# Define prompt template
default_context="No specific context provided."
template=("""
You are an AI-powered chatbot designed to provide information and assistance
for customers based on the context provided to you only.

Context:{context}
Question:{question}
""")

# Get user input and format the prompt template
context=input("Enter the context (leave blank for default): ")
if not context:
    context=default_context

question=input("Enter your question: ")
prompt=PromptTemplate.from_template(template=template)
prompt.format(context=context, question=question)

# Load data from text file (skip gracefully if it does not exist)
text_file="data.txt"
if os.path.isfile(text_file):
    loader=TextLoader(text_file, encoding='utf-8')
    document=loader.load()
else:
    document=""

# Set up runnable process
result=RunnableParallel(context=RunnablePassthrough(), question=RunnablePassthrough())
chain=result | prompt | ollama_llm | StrOutputParser()

# Invoke the chain with the user's input
print("Invoking the chain with the user's input...")
output=chain.invoke(question)
print("Chain invoked.")
print("Output:", output)

# If there is valid output, store it in the output.txt file
if output:
    with open("output.txt", "w", encoding="utf-8") as f:
        f.write(output)

# If there are no valid text chunks found in the document, skip processing embeddings
if document:
    # Split text into chunks
    spliter=RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=50)
    chunks=spliter.split_documents(document)

    # Generate embeddings for text chunks
    embeddings=OpenAIEmbeddings()

    # Create vector storage from text chunks
    vector_storage=FAISS.from_documents(chunks, embeddings)

    # Embed each chunk's text and append the vector to the data.txt file
    with open("data.txt", "a", encoding="utf-8") as f:
        for chunk in chunks:
            chunk_vector=embeddings.embed_documents([chunk.page_content])[0]
            embeddings_str=" ".join(map(str, chunk_vector))
            f.write(embeddings_str + "\n")

    print("Embeddings stored in data.txt")


jamieduk commented 4 months ago

It now has a timer as well, so you can benchmark different models etc. Feel free to make a YT vid on the new code, have fun! And feel free to check out my GitHub; I have AI code there too.
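As a rough idea, a minimal benchmarking sketch using the same Ollama wrapper as the script above (the model names are placeholders and must already be pulled into the local Ollama install):

import time
from langchain_community.llms import Ollama

for model_name in ("dolphin-llama3:latest", "llama3:latest"):
    llm = Ollama(model=model_name)
    start = time.time()
    llm.invoke(input="Why is the sky blue?")  # same prompt for every model
    print(f"{model_name}: {time.time() - start:.2f} seconds")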

Tech-Watt commented 4 months ago

Cool, I'll check it out.


jamieduk commented 4 months ago

No probs, I updated it again as well.