run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: Building a chatbot with custom data #14031

Open Avv22 opened 3 weeks ago

Avv22 commented 3 weeks ago

Question

I want to fine-tune LLaMA on my custom dataset so that it answers all questions related to my data. Can I do that with your framework please? Is it free to use?

dosubot[bot] commented 3 weeks ago

Hey @Avv22! I'm here to help you with any questions or issues you have. Let's tackle this together!

Yes, you can fine-tune LLaMA on your custom dataset using the LlamaIndex framework. The llama-index-finetuning package includes various finetuning engines such as CohereRerankerFinetuneEngine, EmbeddingAdapterFinetuneEngine, GradientFinetuneEngine, OpenAIFinetuneEngine, and SentenceTransformersFinetuneEngine [1].
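
As a rough sketch (not from this thread), a fine-tuning run with the OpenAIFinetuneEngine might look like the snippet below. The training file path and base model name are placeholders, and exact arguments can vary between llama-index versions:

from llama_index.finetuning import OpenAIFinetuneEngine

# Point the engine at a JSONL file of chat-formatted training examples
# ("finetuning_events.jsonl" is a placeholder path).
finetune_engine = OpenAIFinetuneEngine(
    "gpt-3.5-turbo",
    "finetuning_events.jsonl",
)

# Launch the fine-tuning job, then retrieve the resulting model as an LLM object.
finetune_engine.finetune()
ft_llm = finetune_engine.get_finetuned_model(temperature=0.1)

The other engines (embedding adapters, Cohere rerankers, sentence transformers) follow the same pattern of constructing an engine, calling finetune(), and retrieving the tuned artifact.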

Additionally, the framework is free to use as it is licensed under the MIT license [1].

To continue talking to Dosu, mention @dosu.

Avv22 commented 3 weeks ago

Thanks. If I want to use RAG instead of fine-tuning, can you help with that please?

dosubot[bot] commented 3 weeks ago

To use Retrieval-Augmented Generation (RAG) instead of fine-tuning LLaMA on your custom dataset, follow these steps:

  1. Setup Documents and Index: Create documents and build a VectorStoreIndex from them.

    from llama_index.legacy import Document, VectorStoreIndex
    from llama_index.legacy.retrievers import VectorIndexRetriever
    
    # Create documents
    documents = [
        Document(text="A group of penguins, known as a 'waddle' on land, shuffled across the Antarctic ice, their tuxedo-like plumage standing out against the snow."),
        Document(text="Emperor penguins, the tallest of all penguin species, can dive deeper than any other bird, reaching depths of over 500 meters."),
        # Add more documents as needed
    ]
    
    index = VectorStoreIndex.from_documents(documents)
    
    # Setup a simple retriever
    retriever = VectorIndexRetriever(
        index=index,
        similarity_top_k=10,
    )
  2. Download and Setup Self-RAG Model: Download the Self-RAG model and set up the SelfRAGQueryEngine.

    # Option: download llama_pack
    from llama_index.legacy.llama_pack import download_llama_pack
    
    download_llama_pack(
        "SelfRAGPack",
        "./self_rag_pack",
        skip_load=True,
    )
    from self_rag_pack.base import SelfRAGQueryEngine
    
    # Download the self-RAG model
    download_dir = "/home/mmaatouk/tmp"  # Replace with your directory
    !pip3 install -q huggingface-hub
    !huggingface-cli download m4r1/selfrag_llama2_7b-GGUF selfrag_llama2_7b.q4_k_m.gguf --local-dir {download_dir} --local-dir-use-symlinks False
    
    from pathlib import Path
    
    model_path = Path(download_dir) / "selfrag_llama2_7b.q4_k_m.gguf"
    query_engine = SelfRAGQueryEngine(str(model_path), retriever, verbose=True)
  3. Query the Model: Use the SelfRAGQueryEngine to query the model with or without retrieval.

    # No retrieval example
    response = query_engine.query("Which genre is the book Pride and Prejudice?")
    
    # Retrieval example
    response = query_engine.query("How tall are the smallest penguins?")

For a more performant RAG application, consider techniques like decoupling chunks used for retrieval vs. synthesis, structured retrieval for larger document sets, dynamically retrieving chunks depending on your task, and optimizing context embeddings [1][2][3].
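
As a minimal sketch of one of those techniques, decoupling the chunks used for retrieval from the context used for synthesis, you can index sentence-sized nodes but hand the LLM a wider window of surrounding text at query time. This assumes a recent llama-index release where these classes live under llama_index.core, reuses the documents list from step 1, and uses illustrative parameter values:

from llama_index.core import VectorStoreIndex
from llama_index.core.node_parser import SentenceWindowNodeParser
from llama_index.core.postprocessor import MetadataReplacementPostProcessor

# Index small sentence-sized nodes, but store a window of neighboring
# sentences in each node's metadata.
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)
nodes = node_parser.get_nodes_from_documents(documents)
index = VectorStoreIndex(nodes)

# At query time, replace each retrieved sentence with its surrounding window,
# so synthesis sees more context than was used for retrieval.
query_engine = index.as_query_engine(
    similarity_top_k=5,
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
)
response = query_engine.query("How deep can emperor penguins dive?")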

Additionally, you can use the RagCLI tool from LlamaIndex to handle local document ingestion and querying via a command-line interface:

from llama_index.cli.rag.base import RagCLI

# Initialize RagCLI
rag_cli = RagCLI()

# Ingest documents
await rag_cli.handle_cli(files="path/to/your/documents", verbose=True)

# Query the ingested documents
await rag_cli.handle_cli(question="How tall is the smallest penguin?", verbose=True)

This setup allows you to efficiently manage document ingestion and querying, making it easier to implement RAG in your applications [4][5].

To continue talking to Dosu, mention @dosu.

Avv22 commented 3 weeks ago

Which one is better and gives better accuracy?

dosubot[bot] commented 3 weeks ago

Using Retrieval-Augmented Generation (RAG) with LlamaIndex is generally more suitable and accurate for answering questions related to your custom dataset compared to fine-tuning LLaMA alone. The RAG approach integrates and retrieves relevant information from diverse data sources, enhancing the accuracy and relevance of the answers provided [1][2].

To continue talking to Dosu, mention @dosu.

Avv22 commented 2 weeks ago

So, if I have data in PDF format, how do I extract it for use with the RAG setup above through the llama_index framework? Any suggestions on data preparation and extraction format?

dosubot[bot] commented 2 weeks ago

To extract data from PDFs and prepare it for a Retrieval-Augmented Generation (RAG) setup with LlamaIndex, you can use the PDFReader class. Below is an example of how to load and parse a PDF file and then use the AutoMergingRetrieverPack to index and query the resulting documents:

from pathlib import Path
from llama_index.readers.file.docs import PDFReader
from llama_index.packs.auto_merging_retriever import AutoMergingRetrieverPack

# Initialize the PDFReader
loader = PDFReader()

# Path to your PDF file
document_path = Path("path/to/your/file.pdf")

# Load data from the PDF file
documents = loader.load_data(file=document_path)

# Create the AutoMergingRetrieverPack with the loaded documents
auto_merging_retriever_pack = AutoMergingRetrieverPack(documents)

# Run the pack with a query
response = auto_merging_retriever_pack.run("Your query here")
print(response)

In this example, PDFReader parses the PDF into Document objects, and AutoMergingRetrieverPack builds a hierarchical index over those documents and answers queries through its run() method.

This extracted data can then be used in your RAG setup. Note that the PDFReader class relies on the pypdf library to read PDF files, so make sure to install it with pip install pypdf if you haven't already [1][2][3][4].
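
If your data is a whole folder of PDFs rather than a single file, one common pattern (a sketch, not specific to this thread; the directory path is a placeholder) is to let SimpleDirectoryReader handle loading and then build a VectorStoreIndex over the result, as in the RAG steps above:

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load every supported file (including PDFs) from a directory;
# "./data" is a placeholder for your own folder of documents.
documents = SimpleDirectoryReader("./data").load_data()

# Build an index over the extracted text and query it.
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("Summarize the main topics covered in these documents.")
print(response)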

To continue talking to Dosu, mention @dosu.