run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: Building a chatbot with custom data #14031

Open Avv22 opened 3 weeks ago

Avv22 commented 3 weeks ago

Question

I want to fine-tune LLaMA on my custom dataset so that it answers all questions related to my data. Can I do that with your framework please? Is it free to use?

dosubot[bot] commented 3 weeks ago

Hey @Avv22! I'm here to help you with any questions or issues you have. Let's tackle this together!

Yes, you can fine-tune LLaMA on your custom dataset using the LlamaIndex framework. The llama-index-finetuning package includes various finetuning engines such as CohereRerankerFinetuneEngine, EmbeddingAdapterFinetuneEngine, GradientFinetuneEngine, OpenAIFinetuneEngine, and SentenceTransformersFinetuneEngine [1].
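
As a rough sketch (not from this thread), a fine-tuning run with the OpenAIFinetuneEngine might look like the snippet below. The training file path and base model name are placeholders, and exact arguments can vary between llama-index versions:

from llama_index.finetuning import OpenAIFinetuneEngine

# Point the engine at a JSONL file of chat-formatted training examples
# ("finetuning_events.jsonl" is a placeholder path).
finetune_engine = OpenAIFinetuneEngine(
    "gpt-3.5-turbo",
    "finetuning_events.jsonl",
)

# Launch the fine-tuning job, then retrieve the resulting model as an LLM object.
finetune_engine.finetune()
ft_llm = finetune_engine.get_finetuned_model(temperature=0.1)

The other engines (embedding adapters, Cohere rerankers, sentence transformers) follow the same pattern of constructing an engine, calling finetune(), and retrieving the tuned artifact.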

Additionally, the framework is free to use as it is licensed under the MIT license [1].

To continue talking to Dosu, mention @dosu.

Avv22 commented 3 weeks ago

Thanks. If I want to use RAG instead of fine-tuning, can you help with that please?

dosubot[bot] commented 3 weeks ago

To use Retrieval-Augmented Generation (RAG) instead of fine-tuning LLaMA on your custom dataset, follow these steps:

  1. Setup Documents and Index: Create documents and build a VectorStoreIndex from them.

    from llama_index.legacy import Document, VectorStoreIndex
    from llama_index.legacy.retrievers import VectorIndexRetriever
    
    # Create documents
    documents = [
        Document(text="A group of penguins, known as a 'waddle' on land, shuffled across the Antarctic ice, their tuxedo-like plumage standing out against the snow."),
        Document(text="Emperor penguins, the tallest of all penguin species, can dive deeper than any other bird, reaching depths of over 500 meters."),
        # Add more documents as needed
    ]
    
    index = VectorStoreIndex.from_documents(documents)
    
    # Setup a simple retriever
    retriever = VectorIndexRetriever(
        index=index,
        similarity_top_k=10,
    )
  2. Download and Setup Self-RAG Model: Download the Self-RAG model and set up the SelfRAGQueryEngine.

    # Option: download llama_pack
    from llama_index.legacy.llama_pack import download_llama_pack
    
    download_llama_pack(
        "SelfRAGPack",
        "./self_rag_pack",
        skip_load=True,
    )
    from self_rag_pack.base import SelfRAGQueryEngine
    
    # Download the self-RAG model
    download_dir = "/home/mmaatouk/tmp"  # Replace with your directory
    !pip3 install -q huggingface-hub
    !huggingface-cli download m4r1/selfrag_llama2_7b-GGUF selfrag_llama2_7b.q4_k_m.gguf --local-dir {download_dir} --local-dir-use-symlinks False
    
    from pathlib import Path
    
    model_path = Path(download_dir) / "selfrag_llama2_7b.q4_k_m.gguf"
    query_engine = SelfRAGQueryEngine(str(model_path), retriever, verbose=True)
  3. Query the Model: Use the SelfRAGQueryEngine to query the model with or without retrieval.

    # No retrieval example
    response = query_engine.query("Which genre is the book Pride and Prejudice?")
    
    # Retrieval example
    response = query_engine.query("How tall are the smallest penguins?")

For a more performant RAG application, consider techniques like decoupling chunks used for retrieval vs. synthesis, structured retrieval for larger document sets, dynamically retrieving chunks depending on your task, and optimizing context embeddings [1][2][3].
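
As a minimal sketch of one of those techniques, decoupling the chunks used for retrieval from the context used for synthesis, you can index sentence-sized nodes but hand the LLM a wider window of surrounding text at query time. This assumes a recent llama-index release where these classes live under llama_index.core, reuses the documents list from step 1, and uses illustrative parameter values:

from llama_index.core import VectorStoreIndex
from llama_index.core.node_parser import SentenceWindowNodeParser
from llama_index.core.postprocessor import MetadataReplacementPostProcessor

# Index small sentence-sized nodes, but store a window of neighboring
# sentences in each node's metadata.
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)
nodes = node_parser.get_nodes_from_documents(documents)
index = VectorStoreIndex(nodes)

# At query time, replace each retrieved sentence with its surrounding window,
# so synthesis sees more context than was used for retrieval.
query_engine = index.as_query_engine(
    similarity_top_k=5,
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
)
response = query_engine.query("How deep can emperor penguins dive?")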

Additionally, you can use the RagCLI tool from LlamaIndex to handle local document ingestion and querying via a command-line interface:

from llama_index.cli.rag.base import RagCLI

# Initialize RagCLI
rag_cli = RagCLI()

# Ingest documents
await rag_cli.handle_cli(files="path/to/your/documents", verbose=True)

# Query the ingested documents
await rag_cli.handle_cli(question="How tall is the smallest penguin?", verbose=True)

This setup allows you to efficiently manage document ingestion and querying, making it easier to implement RAG in your applications [4][5].

To continue talking to Dosu, mention @dosu.

Avv22 commented 3 weeks ago

Which one is better and gives better accuracy?

dosubot[bot] commented 3 weeks ago

Using Retrieval-Augmented Generation (RAG) with LlamaIndex is generally more suitable and accurate for answering questions related to your custom dataset compared to fine-tuning LLaMA alone. The RAG approach integrates and retrieves relevant information from diverse data sources, enhancing the accuracy and relevance of the answers provided [1][2].

To continue talking to Dosu, mention @dosu.

Avv22 commented 2 weeks ago

So, if I have data in PDF format, how do I extract it for use with the RAG setup above through the llama_index framework? Any suggestions on data preparation and extraction format?

dosubot[bot] commented 2 weeks ago

To extract data from PDFs and prepare it for a Retrieval-Augmented Generation (RAG) setup with LlamaIndex, you can use the PDFReader class. Below is an example of how to load and parse a PDF file and then use the AutoMergingRetrieverPack to index and query the resulting documents:

from pathlib import Path
from llama_index.readers.file.docs import PDFReader
from llama_index.packs.auto_merging_retriever import AutoMergingRetrieverPack

# Initialize the PDFReader
loader = PDFReader()

# Path to your PDF file
document_path = Path("path/to/your/file.pdf")

# Load data from the PDF file
documents = loader.load_data(file=document_path)

# Create the AutoMergingRetrieverPack with the loaded documents
auto_merging_retriever_pack = AutoMergingRetrieverPack(documents)

# Run the pack with a query
response = auto_merging_retriever_pack.run("Your query here")
print(response)

In this example, PDFReader parses the PDF into Document objects, and AutoMergingRetrieverPack builds a hierarchical index over those documents and answers queries through its run() method.

This extracted data can then be used in your RAG setup. Note that the PDFReader class relies on the pypdf library to read PDF files, so make sure to install it with pip install pypdf if you haven't already [1][2][3][4].
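
If your data is a whole folder of PDFs rather than a single file, one common pattern (a sketch, not specific to this thread; the directory path is a placeholder) is to let SimpleDirectoryReader handle loading and then build a VectorStoreIndex over the result, as in the RAG steps above:

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load every supported file (including PDFs) from a directory;
# "./data" is a placeholder for your own folder of documents.
documents = SimpleDirectoryReader("./data").load_data()

# Build an index over the extracted text and query it.
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("Summarize the main topics covered in these documents.")
print(response)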

To continue talking to Dosu, mention @dosu.