run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: How to deal with JSON with image location #15911

Open Lu-Chengyu opened 1 month ago

Lu-Chengyu commented 1 month ago

Question

```python
from llama_index.core import VectorStoreIndex
# Depending on your version, JSONReader may instead live in the
# llama-index-readers-json package (llama_index.readers.json)
from llama_index.core.readers.json import JSONReader


def create_piazza_index(json_file_path, index_folder, levels_back=None, collapse_length=None, is_jsonl=False, clean_json=True):
    # Initialize the JSONReader with the desired parameters
    json_reader = JSONReader(
        levels_back=levels_back,
        collapse_length=collapse_length,
        is_jsonl=is_jsonl,
        clean_json=clean_json,
    )

    # Load data from the JSON file
    documents = json_reader.load_data(json_file_path)

    # If documents were loaded, create an index and persist it
    if documents:
        index = VectorStoreIndex.from_documents(documents)
        index.storage_context.persist(persist_dir=index_folder)
```

I need to create a Q&A bot, and I have some previous Q&A data that includes both images and text. Previously, I only extracted the text and stored it in JSON format, using JSONReader to generate the documents. Now I want to include the images, so I have downloaded them locally and added the image paths to the corresponding Q&A positions in the JSON. How should I process the images and the JSON so that each image is associated with its relevant context, and then generate a comprehensive index?
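
For reference, after adding the image paths, each Q&A entry in my JSON has roughly this shape (field names and values are only illustrative):

```python
# Illustrative shape of one Q&A entry; the real field names may differ
{
    "question": "How do I submit homework 2?",
    "answer": "Use the submission link on the course page; see the screenshot.",
    "image_paths": ["images/post_123_screenshot.png"],
}
```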

dosubot[bot] commented 1 month ago

Hey @Lu-Chengyu! I'm here to help you with any coding issues or questions you have. If you need assistance, feel free to ask!

To associate the images with their relevant context from the JSON and generate a comprehensive index, follow these steps:

  1. Download and Prepare Images: Ensure that the images are downloaded and you have a list of image file names.
  2. Extract Structured Outputs: Use Gemini to extract structured outputs from the images.
  3. Convert Structured Outputs to Nodes: Convert the structured outputs to TextNode objects.
  4. Index Nodes in Vector Stores: Index these nodes in a vector store.
  5. Define Auto-Retriever: Set up an auto-retriever for querying the indexed data.

Here is the code to achieve this. Note that `get_image_files`, `aprocess_image_files`, and `get_nodes_from_objs` are helper functions from the referenced example rather than part of the `llama_index` API; `aprocess_image_files` is the piece that calls Gemini to extract a structured output per image.

```python
# Step 1: Download and Prepare Images
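# `get_image_files` is one of the example's helpers. A minimal sketch,
# assuming the images sit in a single flat directory (an illustrative
# assumption, not the example's exact code):
from pathlib import Path


def get_image_files(directory, sample=None):
    # Collect image files, optionally keeping only the first `sample` paths
    files = sorted(
        p for p in Path(directory).iterdir()
        if p.suffix.lower() in {".png", ".jpg", ".jpeg"}
    )
    return files[:sample] if sample is not None else files
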
image_files = get_image_files("path/to/your/images", sample=100)

# Step 2: Extract Structured Outputs
outputs = await aprocess_image_files(image_files)

# Step 3: Convert Structured Outputs to Nodes
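# Sketch of what `get_nodes_from_objs` might look like, assuming each
# structured output is a Pydantic object with a text `summary` field plus
# metadata fields (an assumption, not the example's exact implementation):
from llama_index.core.schema import TextNode


def get_nodes_from_objs(objs, image_files):
    nodes = []
    for image_file, obj in zip(image_files, objs):
        nodes.append(
            TextNode(
                # Embed and query over the text; keep the image path as metadata
                text=obj.summary,
                metadata={"image_file": str(image_file), **obj.dict(exclude={"summary"})},
                excluded_embed_metadata_keys=["image_file"],
                excluded_llm_metadata_keys=["image_file"],
            )
        )
    return nodes
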
nodes = get_nodes_from_objs(outputs, image_files)

# Step 4: Index Nodes in Vector Stores
import qdrant_client
from llama_index.vector_stores.qdrant import QdrantVectorStore
from llama_index.core import StorageContext
from llama_index.core import VectorStoreIndex
from llama_index.embeddings.gemini import GeminiEmbedding
from llama_index.llms.gemini import Gemini
from llama_index.core import Settings

# Create a local Qdrant vector store
client = qdrant_client.QdrantClient(path="qdrant_gemini")

vector_store = QdrantVectorStore(client=client, collection_name="collection")

# Global settings
# GOOGLE_API_KEY is assumed to be defined elsewhere (e.g. read from the environment)
Settings.embed_model = GeminiEmbedding(
    model_name="models/embedding-001", api_key=GOOGLE_API_KEY
)
Settings.llm = Gemini(api_key=GOOGLE_API_KEY)

storage_context = StorageContext.from_defaults(vector_store=vector_store)

index = VectorStoreIndex(
    nodes=nodes,
    storage_context=storage_context,
)
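
# Optionally persist the docstore and index metadata alongside the local
# Qdrant data, mirroring the original create_piazza_index behaviour
# (the directory name is just an example):
index.storage_context.persist(persist_dir="piazza_index")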

# Step 5: Define Auto-Retriever
from llama_index.core.vector_stores import MetadataInfo, VectorStoreInfo

# The metadata fields below follow the receipts example; adapt them to the
# structured metadata you actually extract from your Q&A images.
vector_store_info = VectorStoreInfo(
    content_info="Receipts",
    metadata_info=[
        MetadataInfo(
            name="company",
            description="The name of the store",
            type="string",
        ),
        MetadataInfo(
            name="address",
            description="The address of the store",
            type="string",
        ),
        MetadataInfo(
            name="date",
            description="The date of the purchase (in DD/MM/YYYY format)",
            type="string",
        ),
        MetadataInfo(
            name="total",
            description="The final amount",
            type="float",
        ),
        MetadataInfo(
            name="currency",
            description="The currency of the country the purchase was made (abbreviation)",
            type="string",
        ),
    ],
)

from llama_index.core.retrievers import VectorIndexAutoRetriever

retriever = VectorIndexAutoRetriever(
    index,
    vector_store_info=vector_store_info,
    similarity_top_k=2,
    empty_query_top_k=10,  # if only metadata filters are specified, this is the limit
    verbose=True,
)
```

This code will process the images, extract structured outputs, convert them to nodes, index them, and set up an auto-retriever for querying the indexed data [1].
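
Once the retriever is set up, you can query it directly. A minimal sketch (the query string and the `image_file` metadata key are only illustrative):

```python
# Retrieve the most relevant nodes for a natural-language question
results = retriever.retrieve("Which posts mention the homework 2 deadline?")
for node_with_score in results:
    print(node_with_score.score, node_with_score.node.get_content())
    # The associated image path, if it was stored as metadata
    print(node_with_score.node.metadata.get("image_file"))
```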

To continue talking to Dosu, mention @dosu.