Avv22 opened 3 weeks ago
Hey @Avv22! I'm here to help you with any questions or issues you have. Let's tackle this together!
Yes, you can fine-tune LLaMA on your custom dataset using the LlamaIndex framework. The `llama-index-finetuning` package includes several fine-tuning engines, such as `CohereRerankerFinetuneEngine`, `EmbeddingAdapterFinetuneEngine`, `GradientFinetuneEngine`, `OpenAIFinetuneEngine`, and `SentenceTransformersFinetuneEngine` [1].
Additionally, the framework is free to use, as it is licensed under the MIT license [1].
To continue talking to Dosu, mention @dosu.
Thanks. How about if I want to use RAG instead of fine-tuning, can you help with that please?
To use Retrieval-Augmented Generation (RAG) instead of fine-tuning LLaMA on your custom dataset, follow these steps:
Set up documents and index: create documents and build a `VectorStoreIndex` from them.
```python
from llama_index.legacy import Document, VectorStoreIndex
from llama_index.legacy.retrievers import VectorIndexRetriever

# Create documents
documents = [
    Document(text="A group of penguins, known as a 'waddle' on land, shuffled across the Antarctic ice, their tuxedo-like plumage standing out against the snow."),
    Document(text="Emperor penguins, the tallest of all penguin species, can dive deeper than any other bird, reaching depths of over 500 meters."),
    # Add more documents as needed
]

index = VectorStoreIndex.from_documents(documents)

# Set up a simple retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10,
)
```
Download and set up the Self-RAG model: download the model weights and set up the `SelfRAGQueryEngine`.
```python
# Option: download the llama_pack
from llama_index.legacy.llama_pack import download_llama_pack

download_llama_pack(
    "SelfRAGPack",
    "./self_rag_pack",
    skip_load=True,
)
from self_rag_pack.base import SelfRAGQueryEngine

# Download the Self-RAG model weights
download_dir = "/home/mmaatouk/tmp"  # Replace with your directory

# The two `!` lines are notebook shell commands; run them in a terminal
# (without the `!`) if you are not in a Jupyter/IPython environment.
!pip3 install -q huggingface-hub
!huggingface-cli download m4r1/selfrag_llama2_7b-GGUF selfrag_llama2_7b.q4_k_m.gguf --local-dir {download_dir} --local-dir-use-symlinks False

from pathlib import Path

model_path = Path(download_dir) / "selfrag_llama2_7b.q4_k_m.gguf"
query_engine = SelfRAGQueryEngine(str(model_path), retriever, verbose=True)
```
Query the model: use the `SelfRAGQueryEngine` to query the model with or without retrieval.
```python
# No-retrieval example
response = query_engine.query("Which genre is the book Pride and Prejudice?")

# Retrieval example
response = query_engine.query("How tall are the smallest penguins?")
```
For a more performant RAG application, consider techniques like decoupling chunks used for retrieval vs. synthesis, structured retrieval for larger document sets, dynamically retrieving chunks depending on your task, and optimizing context embeddings [1][2][3].
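One of those techniques, decoupling the chunks used for retrieval from the chunks used for synthesis, can be sketched in plain Python. This is an illustrative toy only: small "child" chunks are matched against the query, but the larger "parent" passage they came from is what would be handed to the LLM. The function names and the word-overlap scoring are hypothetical stand-ins, not LlamaIndex APIs.

```python
def tokens(text):
    """Lowercase, punctuation-stripped word set for crude matching."""
    return {w.strip(".,?;:").lower() for w in text.split()}

def split_into_chunks(text, size):
    """Split text into word-based chunks of roughly `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def build_index(parents, child_size=8):
    """Map each small child chunk back to its parent passage."""
    index = []
    for parent in parents:
        for child in split_into_chunks(parent, child_size):
            index.append((child, parent))
    return index

def retrieve_parent(index, query):
    """Score child chunks by word overlap; return the best chunk's parent."""
    q = tokens(query)
    best = max(index, key=lambda pair: len(q & tokens(pair[0])))
    return best[1]

parents = [
    "Emperor penguins are the tallest penguin species and can dive to "
    "depths of over 500 meters, deeper than any other bird.",
    "A group of penguins on land is known as a waddle; their plumage "
    "stands out against the Antarctic snow.",
]
index = build_index(parents)
print(retrieve_parent(index, "How deep can emperor penguins dive?"))
```

The small child chunks give precise matches, while returning the whole parent passage gives the LLM enough surrounding context to synthesize a grounded answer; in LlamaIndex, retrievers such as the auto-merging retriever implement this idea with real embeddings.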
Additionally, you can use the `RagCLI` tool from LlamaIndex to handle local document ingestion and querying via a command-line interface:
```python
from llama_index.cli.rag.base import RagCLI

# Initialize RagCLI
rag_cli = RagCLI()

# Ingest documents
await rag_cli.handle_cli(files="path/to/your/documents", verbose=True)

# Query the ingested documents
await rag_cli.handle_cli(question="How tall is the smallest penguin?", verbose=True)
```
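Note that `handle_cli` is awaited, so outside a notebook these calls need an event loop. A minimal sketch of that pattern, using a hypothetical stand-in coroutine (running the real `RagCLI` requires an installed LlamaIndex environment):

```python
import asyncio

# `fake_handle_cli` is a hypothetical stand-in for rag_cli.handle_cli,
# used only to show how the awaited calls above run from a plain script.
async def fake_handle_cli(question=None, files=None, verbose=False):
    return f"handled question={question!r} files={files!r}"

async def main():
    # In a real script these would be rag_cli.handle_cli(...) calls.
    print(await fake_handle_cli(files="path/to/your/documents"))
    print(await fake_handle_cli(question="How tall is the smallest penguin?"))

asyncio.run(main())
```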
This setup allows you to efficiently manage document ingestion and querying, making it easier to implement RAG in your applications [4][5].
To continue talking to Dosu, mention @dosu.
Which one is better and gives better accuracy?
Using Retrieval-Augmented Generation (RAG) with LlamaIndex is generally more suitable and accurate for answering questions related to your custom dataset compared to fine-tuning LLaMA alone. The RAG approach integrates and retrieves relevant information from diverse data sources, enhancing the accuracy and relevance of the answers provided [1][2].
To continue talking to Dosu, mention @dosu.
Thanks. If my data is in PDF format, how do I extract it for use with the RAG setup above through the llama_index framework? Any suggestions on data preparation and extraction format?
To extract data from PDFs and prepare it for a Retrieval-Augmented Generation (RAG) setup with the LlamaIndex framework, you can use the `PDFReader` class provided by LlamaIndex. Below is an example of how to use this class to load and parse PDF files, and then use the `AutoMergingRetrieverPack` to handle the documents:
```python
from pathlib import Path

from llama_index.readers.file.docs import PDFReader
from llama_index.packs.auto_merging_retriever import AutoMergingRetrieverPack

# Initialize the PDFReader
loader = PDFReader()

# Path to your PDF file
document_path = Path("path/to/your/file.pdf")

# Load data from the PDF file
documents = loader.load_data(file=document_path)

# Create the AutoMergingRetrieverPack with the loaded documents
auto_merging_retriever_pack = AutoMergingRetrieverPack(documents)

# Run the pack with a query
response = auto_merging_retriever_pack.run("Your query here")
print(response)
```
In this example:

- `PDFReader` is used to load and parse the PDF file.
- `documents` will be a list of `Document` objects, each containing the extracted text and metadata.
- `AutoMergingRetrieverPack` is used to handle the documents and run queries against them.

This extracted data can then be used in your RAG setup. Note that the `PDFReader` class relies on the `pypdf` library to read PDF files, so make sure to install it with `pip install pypdf` if you haven't already [1][2][3][4].
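As a data-preparation step, the raw text extracted from PDFs often contains hard line breaks, hyphenated words, and irregular whitespace, and usually benefits from cleanup and overlapping chunking before indexing. A minimal sketch in plain Python; the helper names are hypothetical, not part of LlamaIndex:

```python
import re

def clean_pdf_text(text):
    """Collapse the broken line breaks and repeated whitespace
    that PDF extraction typically produces."""
    text = text.replace("-\n", "")    # re-join words hyphenated at line ends
    text = re.sub(r"\s+", " ", text)  # collapse newlines and runs of spaces
    return text.strip()

def chunk_text(text, chunk_size=50, overlap=10):
    """Split cleaned text into overlapping word-based chunks, so a
    sentence cut at one boundary still appears whole in a neighbor."""
    words = text.split()
    step = chunk_size - overlap
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, max(len(words) - overlap, 1), step)
    ]

raw = "Emperor pen-\nguins are the tallest\n\npenguin   species."
cleaned = clean_pdf_text(raw)
print(cleaned)  # "Emperor penguins are the tallest penguin species."
print(chunk_text(cleaned, chunk_size=4, overlap=2))
```

In practice LlamaIndex's node parsers handle chunking for you once the text is loaded, but normalizing extraction artifacts up front tends to improve retrieval quality regardless of which splitter you use.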
To continue talking to Dosu, mention @dosu.
Question Validation
Question
I want to fine-tune LLaMA on my custom dataset so that it answers all questions related to my data. Can I do that with your framework, please? Is it free to use?