[Y] I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
I expected the toy embedding example to "work"
Current Behavior
Please provide a detailed written description of what llama.cpp did, instead.
ggml_allocr_alloc: not enough space in the buffer (needed 442368, largest block available 290848)
GGML_ASSERT: C:\Users\jason\AppData\Local\Temp\pip-install-4x0xr_93\llama-cpp-python_fec9a526add744f5b2436cab2e2c4c28\vendor\llama.cpp\ggml-alloc.c:173: !"not enough space in the buffer"
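For scale, the two figures in the GGML_ASSERT above imply the allocator came up roughly 148 KiB short:

```python
# Figures copied verbatim from the assert message above.
needed = 442368        # bytes requested by ggml_allocr_alloc
largest_free = 290848  # largest contiguous free block in the buffer
shortfall = needed - largest_free
print(f"shortfall: {shortfall} bytes (~{shortfall / 1024:.0f} KiB)")  # 151520 bytes (~148 KiB)
```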
Environment and Context
Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.
import os

from langchain.chains import ConversationChain, LLMChain, RetrievalQA
from langchain.memory import ConversationBufferMemory
from langchain.embeddings import LlamaCppEmbeddings
from langchain.vectorstores import DeepLake
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.llms import LlamaCpp
from langchain.prompts import PromptTemplate
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
# instantiate the LLM and embeddings models
llm = LlamaCpp(
    model_path="llama-2-13b-chat.Q5_K_M.gguf",
    temperature=0,
    max_tokens=1000,
    top_p=1,
    verbose=True,  # note: lowercase "verbose"; "Verbose=True" is not a valid parameter
)
embeddings = LlamaCppEmbeddings(model_path="llama-2-13b-chat.Q5_K_M.gguf")
# create our documents
texts = [
    "Napoleon Bonaparte was born in 15 August 1769",
    "Louis XIV was born in 5 September 1638",
]
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.create_documents(texts)
# create Deep Lake dataset
# TODO: use your organization id here. (by default, org id is your username)
my_activeloop_org_id = "<SOME_ID>"
my_activeloop_dataset_name = "langchain_llama_00"
dataset_path = f"hub://{my_activeloop_org_id}/{my_activeloop_dataset_name}"
db = DeepLake(dataset_path=dataset_path, embedding_function=embeddings)
# add documents to our Deep Lake dataset
db.add_documents(docs)
retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(),
)
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
tools = [
    Tool(
        name="Retrieval QA System",
        func=retrieval_qa.run,
        description="Useful for answering questions.",
    ),
]
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)
response = agent.run("When was Napoleone born?")
print(response)
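As a side check on the chunking step in the script above: with chunk_size=1000 and two inputs of ~45 characters, the splitter should emit exactly one document per string, so oversized chunks are unlikely to be the cause. A dependency-free sketch of that expectation (naive_split is a simplified stand-in for RecursiveCharacterTextSplitter, which prefers natural boundaries):

```python
def naive_split(text, chunk_size, chunk_overlap):
    # Simplified stand-in for RecursiveCharacterTextSplitter:
    # fixed-width character windows advancing by (size - overlap).
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

texts = ["Napoleon Bonaparte was born in 15 August 1769",
         "Louis XIV was born in 5 September 1638"]
docs = [chunk for t in texts for chunk in naive_split(t, 1000, 0)]
print(len(docs))  # 2 -- each sentence fits comfortably in one chunk
```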
Physical (or virtual) hardware you are using, e.g. for Linux:
Windows 11, AMD Ryzen 9 6900HS CPU / Radeon RX 6800S GPU, 32 GB RAM
Operating System, e.g. for Linux:
Windows 11; $ python3 --version → Python 3.11.5 (Anaconda + SHARK venv)
Failure Information (for bugs)
Please help provide information about the failure if this is a bug. If it is not a bug, please remove the rest of this template.
This MAY be a bug, or it may be a newbie failure (apologies if this turns out to be the latter).
Steps to Reproduce
I took the toy vector-db embedding example from https://learn.activeloop.ai/courses/take/langchain/multimedia/46317643-langchain-101-from-zero-to-hero, modified the "napoleon" example for simple vector embeddings, and plugged in llama-cpp-python.
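To narrow it down: db.add_documents is the step that crashes, and under the hood it passes the chunk texts to embeddings.embed_documents, which is what runs llama.cpp's embedding path. A dependency-free sketch of that call chain, with a stub in place of the real classes (StubEmbeddings and add_documents below are hypothetical stand-ins, not the actual langchain/DeepLake code):

```python
class StubEmbeddings:
    # Stand-in for LlamaCppEmbeddings: records what the vector store
    # would hand to llama.cpp instead of actually running the model.
    def __init__(self):
        self.calls = []

    def embed_documents(self, texts):
        self.calls.append(list(texts))
        return [[0.0] * 8 for _ in texts]  # fake 8-dim vectors

def add_documents(texts, embedder):
    # Mirrors the vector-store step: one embed_documents call over all
    # chunk texts -- the point where the GGML_ASSERT above fires.
    return embedder.embed_documents(texts)

emb = StubEmbeddings()
vectors = add_documents(["Napoleon ...", "Louis XIV ..."], emb)
print(len(vectors), len(vectors[0]))  # 2 8
```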
Failure Logs
Please include any relevant log snippets or files. If it works under one configuration but not under another, please provide logs for both configurations and their corresponding outputs so it is easy to see where behavior changes.
Here is a sample run of "main.exe":
Please close your issue when it has been answered.