nasirus / llama_index


ValueError: chroma_collection is required when composing graph from GPTChromaIndex #5

Closed nasirus closed 1 year ago

nasirus commented 1 year ago

Hey team.

I'm trying out building graphs from multiple GPTChromaIndex instances.

Each individual chroma index is a different collection within the same chroma database. They are sourced in this case from GitHub repos.

Each index is produced like this:

index = GPTChromaIndex.from_documents(docs_branch, service_context=service_context, chroma_collection=chroma_collection)
index.save_to_disk(f'./index_{collection_name}.json')

When querying in my code, it works fine if I just use each collection as an IndexToolConfig and pass them into a new LlamaToolkit object.

index_names = [i["name"] for i in index_summaries]

for i in index_summaries:
    collection = i["collection"]
    print(collection)

    chroma_client = chromadb.Client(Settings(
        persist_directory='../data/chromadata_sample',
        chroma_db_impl="duckdb+parquet",
    ))

    chroma_collection = chroma_client.get_or_create_collection(collection)
    # cur_index = GPTSimpleVectorIndex.load_from_disk(f'index_edifice.json', service_context=service_context)
    cur_index = GPTChromaIndex.load_from_disk(f'./index_{collection}.json', chroma_collection=chroma_collection, service_context=service_context)
    indices.append(cur_index)

    print(len(cur_index.docstore.ref_doc_info))

    tool_config = IndexToolConfig(
        index=cur_index,
        name=i["name"],
        description=i["description"],
        index_query_kwargs={"similarity_top_k": 5},
        tool_kwargs={"return_direct": True}
    )

    index_configs.append(tool_config)
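For reference, the working (non-graph) path then hands these tool configs to an agent roughly like the sketch below; the LLM/memory setup and the example query string are placeholders rather than my exact code.

from langchain.llms import OpenAI
from langchain.memory import ConversationBufferMemory
from llama_index.langchain_helpers.agents import LlamaToolkit, create_llama_chat_agent

# placeholder LLM and conversation memory; my real setup lives elsewhere
llm = OpenAI(temperature=0)
memory = ConversationBufferMemory(memory_key="chat_history")

# one tool per chroma collection, built from the IndexToolConfig objects above
toolkit = LlamaToolkit(index_configs=index_configs)

agent_chain = create_llama_chat_agent(toolkit, llm, memory=memory, verbose=True)
response = agent_chain.run(input="What does this collection's repo do?")  # example query

Queried this way, each collection answers fine on its own.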

However, when I try to use a composed graph over the same collections, I get an error.
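(As the traceback below shows, the agent's graph tool ends up calling roughly the following under the hood, with the graph and query_configs defined further down; the query string is just an example.)

# the graph tool effectively does this, which is where the ValueError is raised
response = graph.query("Compare the indexed repos", query_configs=query_configs)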

File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code exec(code, run_globals) File "/home/vscode/.vscode-server-insiders/extensions/ms-python.python-2023.6.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/main.py", line 39, in cli.main() File "/home/vscode/.vscode-server-insiders/extensions/ms-python.python-2023.6.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main run() File "/home/vscode/.vscode-server-insiders/extensions/ms-python.python-2023.6.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file runpy.run_path(target, run_name="main") File "/home/vscode/.vscode-server-insiders/extensions/ms-python.python-2023.6.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path return _run_module_code(code, init_globals, run_name, File "/home/vscode/.vscode-server-insiders/extensions/ms-python.python-2023.6.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code _run_code(code, mod_globals, init_globals, File "/home/vscode/.vscode-server-insiders/extensions/ms-python.python-2023.6.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code exec(code, run_globals) File "/workspaces/cseGPT/autoripper/autoripper_tester.py", line 204, in response = agent_chain.run(input=text_input) File "/usr/local/lib/python3.10/site-packages/langchain/chains/base.py", line 216, in run return self(kwargs)[self.output_keys[0]] File "/usr/local/lib/python3.10/site-packages/langchain/chains/base.py", line 116, in call raise e File "/usr/local/lib/python3.10/site-packages/langchain/chains/base.py", line 113, in call outputs = self._call(inputs) File "/usr/local/lib/python3.10/site-packages/langchain/agents/agent.py", line 637, in _call next_step_output = self._take_next_step( File "/usr/local/lib/python3.10/site-packages/langchain/agents/agent.py", line 569, in _take_next_step observation = tool.run( File "/usr/local/lib/python3.10/site-packages/langchain/tools/base.py", line 73, in run raise e File "/usr/local/lib/python3.10/site-packages/langchain/tools/base.py", line 70, in run observation = self._run(tool_input) File "/usr/local/lib/python3.10/site-packages/llama_index/langchain_helpers/agents/tools.py", line 89, in _run response = self.graph.query(tool_input, query_configs=query_configs) File "/usr/local/lib/python3.10/site-packages/llama_index/indices/composability/graph.py", line 145, in query return query_runner.query(query_str) File "/usr/local/lib/python3.10/site-packages/llama_index/indices/query/query_runner.py", line 342, in query return query_combiner.run(query_bundle, level) File "/usr/local/lib/python3.10/site-packages/llama_index/indices/query/query_combiner/base.py", line 66, in run return self._query_runner.query_transformed( File "/usr/local/lib/python3.10/site-packages/llama_index/indices/query/query_runner.py", line 192, in query_transformed node_with_score, source_nodes = self._fetch_recursive_nodes( File "/usr/local/lib/python3.10/site-packages/llama_index/indices/query/query_runner.py", line 219, in _fetch_recursive_nodes response = self.query(query_bundle, index_node.index_id, level + 1) File 
"/usr/local/lib/python3.10/site-packages/llama_index/indices/query/query_runner.py", line 342, in query return query_combiner.run(query_bundle, level) File "/usr/local/lib/python3.10/site-packages/llama_index/indices/query/query_combiner/base.py", line 66, in run return self._query_runner.query_transformed( File "/usr/local/lib/python3.10/site-packages/llama_index/indices/query/query_runner.py", line 182, in query_transformed query_obj = self._get_query_obj(index_struct) File "/usr/local/lib/python3.10/site-packages/llama_index/indices/query/query_runner.py", line 167, in _get_query_obj query_obj = query_cls( File "/usr/local/lib/python3.10/site-packages/llama_index/indices/vector_store/queries.py", line 203, in init raise ValueError("chroma_collection is required.") ValueError: chroma_collection is required. The graph config is as follows:

graph = ComposableGraph.from_indices(
    GPTListIndex,
    indices,
    index_summaries=index_names,
    service_context=service_context,
)

decompose_transform = DecomposeQueryTransform(llm_predictor, verbose=True)

# define query configs for graph

query_configs = [
    {
        "index_struct_type": "simple_dict",
        "query_mode": "default",
        "query_kwargs": {
            "similarity_top_k": 1,
            "include_summary": True
        },
        "query_transform": decompose_transform
    },
    {
        "index_struct_type": "list",
        "query_mode": "default",
        "query_kwargs": {
            "response_mode": "tree_summarize",
            "verbose": True
        }
    },
    {
        "index_struct_type": "tree",
        "query_mode": "default",
        "query_kwargs": {
            "verbose": True
        },
    },
]

# graph config

graph_config = GraphToolConfig(
    graph=graph,
    name=f"Graph Index",
    description="useful for when you want to answer queries that require analyzing multiple documents for the Edifice Project.",
    query_configs=query_configs,
    tool_kwargs={"return_direct": False}
)

I'd pass in the chroma_collection as query kwargs, but that doesn't make sense, since each separate sub-index has a different collection.
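To illustrate what I mean, the chroma sub-indices would need something like the config below (the "chroma" index_struct_type string is my assumption for GPTChromaIndex). The chroma_collection entry is exactly what the ValueError is asking for, but there is only one slot for it:

# illustrative only: a chroma-specific query config carrying a collection
chroma_query_config = {
    "index_struct_type": "chroma",  # assumed type string for GPTChromaIndex
    "query_mode": "default",
    "query_kwargs": {
        "similarity_top_k": 1,
        "include_summary": True,
        # satisfies the "chroma_collection is required" check at query time,
        # but which collection? each sub-index in the graph uses a different one
        "chroma_collection": chroma_collection,
    },
    "query_transform": decompose_transform,
}

So what I'd really need is some way to supply a different chroma_collection for each sub-index when the graph recurses into it.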

Any advice appreciated!