andysingal closed this issue 11 months ago
🤖
Hello @andysingal, good to see you again! I see you have another issue to discuss. Let's dive into it!
The error message you're encountering, ValueError: 3 columns passed, passed data had 5 columns, typically occurs when you're trying to create a pandas DataFrame and the number of columns you've specified doesn't match the number of columns in the data you're passing to the DataFrame constructor.
In your case, it seems like the function node_parser.get_nodes_from_documents(docs_2021) is returning a data structure with 5 columns, but somewhere in your code (or in the LlamaIndex codebase), a DataFrame is being created with only 3 columns.
Unfortunately, without more information about the node_parser.get_nodes_from_documents(docs_2021) function and how it's implemented in the LlamaIndex codebase, it's difficult to provide a more specific solution.
From the context provided, the get_nodes_from_documents function in the HierarchicalNodeParser class is used to parse a sequence of documents into a list of nodes. However, it's not clear how this data is structured and whether it's directly related to the error you're seeing.
If you could provide more details about the node_parser.get_nodes_from_documents(docs_2021) function and the data it's returning, it would be easier to diagnose the issue. Specifically, it would be helpful to know the structure of docs_2021 and the output of node_parser.get_nodes_from_documents(docs_2021).
Also, if you could point to the specific line in the LlamaIndex codebase where the DataFrame is being created, that would also be very helpful. This error is likely occurring because the DataFrame constructor is expecting data with a different structure than what's being provided.
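To illustrate the column-count mismatch described above, here is a minimal sketch that reproduces this kind of ValueError with plain pandas. It is not taken from the notebook or from the LlamaIndex codebase; the sample rows, the column names, and the try_build helper are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical sample data: every row carries 5 values.
rows = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]


def try_build(columns):
    """Return the DataFrame, or the error message if construction fails."""
    try:
        return pd.DataFrame(rows, columns=columns)
    except ValueError as exc:
        return str(exc)


# Only 3 column names for 5-wide rows: pandas refuses to build the frame.
mismatch = try_build(["a", "b", "c"])

# 5 column names for 5-wide rows: construction succeeds.
match = try_build(["a", "b", "c", "d", "e"])

print(mismatch)
print(match.shape)
```

If the failing DataFrame call can be located in the traceback, comparing the length of its column list with the width of one row of the data should show which side is off.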
Thanks for your response, the code is from your notebook: https://gpt-index.readthedocs.io/en/v0.9.3/examples/multi_modal/llava_multi_modal_tesla_10q.html
@andysingal I cannot reproduce. Do you already have 2021_nodes.pkl locally, or are you generating it?
I'm just running your Colab notebook, so I am generating 2021_nodes.pkl.
I see. I tried to reproduce the error by running
raw_nodes_2021 = node_parser.get_nodes_from_documents(docs_2021)
pickle.dump(raw_nodes_2021, open("2021_nodes.pkl", "wb"))
but got a different issue: https://github.com/run-llama/llama_index/pull/9134 Let us see whether they are related. From your error log, though, it seems to be a different issue.
I'm having the same issue when running these lines in your demo notebook (the tutorial notebook linked in your demo video):
import os
import pickle
if not os.path.exists("2021_nodes.pkl"):
    raw_nodes_2021 = node_parser.get_nodes_from_documents(docs_2021)
    pickle.dump(raw_nodes_2021, open("2021_nodes.pkl", "wb"))
else:
    raw_nodes_2021 = pickle.load(open("2021_nodes.pkl", "rb"))
Have you found a resolution for this?
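For reference, the load-or-generate caching pattern in the snippet above can be sketched as a small standalone helper. This is only an illustration: load_or_build and fake_parse are hypothetical names, and fake_parse stands in for the node_parser.get_nodes_from_documents(docs_2021) call so the example runs without LlamaIndex.

```python
import os
import pickle
import tempfile


def load_or_build(path, build):
    """Load a pickled object from `path`, or build it, cache it, and return it."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    result = build()
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result


# Hypothetical stand-in for the expensive parsing step; records each call.
calls = []


def fake_parse():
    calls.append(1)
    return ["node-a", "node-b"]


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "2021_nodes.pkl")
    first = load_or_build(cache_path, fake_parse)   # builds and writes the cache
    second = load_or_build(cache_path, fake_parse)  # reads the cached file instead
```

With this pattern, an existing 2021_nodes.pkl is reused as-is and the parser is never called, which is why it matters whether the file already exists locally or is being generated.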
@igygi could you rebase to the latest main branch and try again? I think those two issues should have been fixed.
I ran into the same issue with the latest version of main. I implemented a patch / PR that should fix it: https://github.com/run-llama/llama_index/pull/9249
Documentation Issue Description
While working on:
I get the following error:
Documentation Link
https://gpt-index.readthedocs.io/en/v0.9.3/examples/multi_modal/llava_multi_modal_tesla_10q.html