run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License
[Bug]: Can't instantiate abstract class BaseNode with abstract methods #10697

Closed gingerwizard closed 4 months ago

gingerwizard commented 4 months ago

Bug Description

With any vector store that returns a TextNode from its query method, I get the following error:

File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for NodeWithScore
node
  Can't instantiate abstract class BaseNode with abstract methods get_content, get_metadata_str, get_type, hash, set_content (type=type_error)

Version

v0.10.3

Steps to Reproduce

VectorStoreIndex.from_vector_store(vector_store)

engine = clickhouse_vector_store().as_query_engine(
    similarity_top_k=10)
response = engine.query(prompt)

Relevant Logs/Tracebacks

File "/Users/dalemcdiarmid/Library/Caches/pypoetry/virtualenvs/llama-index-xtW50Fas-py3.11/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 535, in _run_script
    exec(code, module.__dict__)
File "/opt/llama_index/docs/examples/apps/hacker_insights.py", line 69, in <module>
    response = query(prompt)
               ^^^^^^^^^^^^^
File "/Users/dalemcdiarmid/Library/Caches/pypoetry/virtualenvs/llama-index-xtW50Fas-py3.11/lib/python3.11/site-packages/streamlit/runtime/caching/cache_utils.py", line 212, in wrapper
    return cached_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dalemcdiarmid/Library/Caches/pypoetry/virtualenvs/llama-index-xtW50Fas-py3.11/lib/python3.11/site-packages/streamlit/runtime/caching/cache_utils.py", line 241, in __call__
    return self._get_or_create_cached_value(args, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dalemcdiarmid/Library/Caches/pypoetry/virtualenvs/llama-index-xtW50Fas-py3.11/lib/python3.11/site-packages/streamlit/runtime/caching/cache_utils.py", line 268, in _get_or_create_cached_value
    return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dalemcdiarmid/Library/Caches/pypoetry/virtualenvs/llama-index-xtW50Fas-py3.11/lib/python3.11/site-packages/streamlit/runtime/caching/cache_utils.py", line 324, in _handle_cache_miss
    computed_value = self._info.func(*func_args, **func_kwargs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/llama_index/docs/examples/apps/hacker_insights.py", line 41, in query
    response = engine.query(prompt)
               ^^^^^^^^^^^^^^^^^^^^
File "/opt/llama_index/llama-index-core/llama_index/core/base/base_query_engine.py", line 40, in query
    return self._query(str_or_query_bundle)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/llama_index/llama-index-core/llama_index/core/query_engine/retriever_query_engine.py", line 186, in _query
    nodes = self.retrieve(query_bundle)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/llama_index/llama-index-core/llama_index/core/query_engine/retriever_query_engine.py", line 142, in retrieve
    nodes = self._retriever.retrieve(query_bundle)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/llama_index/llama-index-core/llama_index/core/base/base_retriever.py", line 229, in retrieve
    nodes = self._retrieve(query_bundle)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/llama_index/llama-index-core/llama_index/core/indices/vector_store/retrievers/retriever.py", line 94, in _retrieve
    return self._get_nodes_with_embeddings(query_bundle)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/llama_index/llama-index-core/llama_index/core/indices/vector_store/retrievers/retriever.py", line 171, in _get_nodes_with_embeddings
    return self._build_node_list_from_query_result(query_result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/llama_index/llama-index-core/llama_index/core/indices/vector_store/retrievers/retriever.py", line 162, in _build_node_list_from_query_result
    node_with_scores.append(NodeWithScore(node=node, score=score))
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
dosubot[bot] commented 4 months ago

🤖

The error you're encountering, "Can't instantiate abstract class BaseNode with abstract methods", typically occurs when you're trying to instantiate an abstract base class that contains abstract methods. In Python, an abstract class is a class that contains one or more abstract methods, which are methods declared in an abstract class but do not contain any implementation.

In your case, it seems like the BaseNode class is being used, but we don't have the implementation details of this class. If BaseNode is an abstract class with abstract methods, you cannot create an instance of it directly. Instead, you should create a subclass of BaseNode and implement all of its abstract methods.

However, in the provided code, BaseNode is used as a type hint in the add method and TextNode is instantiated in the query method. If TextNode is a subclass of BaseNode and it implements all abstract methods of BaseNode, it should not raise the error you mentioned.

To resolve this issue, you need to check the implementation of BaseNode and TextNode classes. Make sure TextNode is a subclass of BaseNode and it implements all abstract methods of BaseNode. If BaseNode or TextNode is not implemented correctly, you need to fix it according to your application requirements.

Unfortunately, I wasn't able to find the exact structure and implementation of the BaseNode class in the LlamaIndex codebase. It would be helpful if you could provide more information about where these classes are defined or how they are implemented. This would allow me to give a more accurate answer to your question.
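The general Python behaviour described above can be reproduced with the standard library alone. The sketch below is illustrative only: `BaseNode` and `TextNode` here are hypothetical stand-ins defined with `abc`, not the LlamaIndex classes, and `try_instantiate` is a helper invented for the example.

```python
from abc import ABC, abstractmethod


class BaseNode(ABC):
    """Hypothetical stand-in for an abstract node type (not the LlamaIndex class)."""

    @abstractmethod
    def get_content(self) -> str:
        ...


class TextNode(BaseNode):
    """Concrete subclass that implements every abstract method."""

    def __init__(self, text: str) -> None:
        self.text = text

    def get_content(self) -> str:
        return self.text


def try_instantiate(cls, *args):
    """Return the new instance, or the TypeError raised while creating it."""
    try:
        return cls(*args)
    except TypeError as exc:
        return exc


abstract_result = try_instantiate(BaseNode)           # abstract: cannot be created
concrete_result = try_instantiate(TextNode, "hello")  # concrete: works

print(type(abstract_result).__name__)  # TypeError
print(concrete_result.get_content())   # hello
```

The exact wording of the `TypeError` message differs between Python versions, so the sketch only checks the exception type.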

ayseozgun commented 4 months ago

Hi, I am also getting a similar error:

ValidationError: 1 validation error for NodeWithScore
node
  Can't instantiate abstract class BaseNode with abstract methods get_content, get_metadata_str, get_type, hash, set_content (type=type_error)

index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
engine = index.as_chat_engine(
    similarity_top_k=10,
    chat_mode="condense_plus_context",
    llm=llm,
    context_prompt=context_prompt,
    verbose=False,
)
engine.chat(question)

I was using version 0.9 and am trying to adapt my code to the new version 0.10.

nerdai commented 4 months ago

Thanks @ayseozgun and @gingerwizard for raising. Sorry that you're both running into issues on this! I'm starting to dig into the issue(s) at hand.

Can I just confirm with you both that you are each using a fresh new virtual environment with v0.10? Any previous install of llama-index lingering in your virtual envs will certainly cause some issues.
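A quick way to see which llama-index distributions an environment actually contains is to list them from Python. This is a minimal sketch using only the standard library's `importlib.metadata`; the helper name `installed_llama_packages` is made up for illustration, and the output only reports what is installed, it does not decide whether the combination is healthy.

```python
from importlib import metadata


def installed_llama_packages() -> dict[str, str]:
    """Map each installed distribution named 'llama-index*' to its version."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        # Some broken installs have no name; normalise underscores to hyphens.
        if name and name.lower().replace("_", "-").startswith("llama-index"):
            found[name] = dist.version
    return found


if __name__ == "__main__":
    for name, version in sorted(installed_llama_packages().items()):
        print(f"{name}=={version}")
```

Run it with the same interpreter that runs the failing script, so that it inspects the same site-packages directory.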

gingerwizard commented 4 months ago

~I think this is because the vector stores don't inherit from BasePydanticVectorStore @nerdai~

ayseozgun commented 4 months ago

Thanks for the responses. Yes I am using version 0.10.3

nerdai commented 4 months ago

I can't seem to replicate this. My suspicion here tho is that there is a legacy install of llama-index in the virtual environments that's causing our issues.

Here's what I just tried, and it worked for me (I am using qdrant as vector store and building index from it, similar to what you have both done here):

Create a new virtual environment, install core + additional packages

pyenv virtualenv venv
pyenv activate venv
pip install llama-index llama-index-vector-stores-qdrant

In shell, download data:

mkdir -p 'data/paul_graham/'
wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

Python script: test.py

import sys
import os

import qdrant_client
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.vector_stores.qdrant import QdrantVectorStore

client = qdrant_client.QdrantClient(location=":memory:")
vector_store = QdrantVectorStore(client=client, collection_name="paul_graham")
index = VectorStoreIndex.from_vector_store(vector_store=vector_store)

documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
node_parser = SentenceSplitter.from_defaults()
nodes = node_parser.get_nodes_from_documents(documents)
index.insert_nodes(nodes)

query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)

Execute script in shell, while venv is active

export OPENAI_API_KEY=<openai-api-key> && python test.py
gingerwizard commented 4 months ago

What's bizarre is that it's sporadic: it will work for a few queries and then fail, rather than failing all the time. I was wondering if it's a property of the TextNodes being returned, @nerdai, that somehow violates the data model.

gingerwizard commented 4 months ago

I'll reset my virtual env to see if this is the issue i.e.

poetry env remove llama-index-xtW50Fas-py3.11
poetry env use python3
poetry install
nerdai commented 4 months ago

Thanks @gingerwizard, please do try removing the virtual env completely. Hopefully global python3 doesn't have a legacy llama-index installed either.

If it continues to persist, please do share more of your code and data (if possible) so I can debug on my end.

darrenwwx commented 4 months ago

I get the same error too when using the BM25 retriever with the pinecone vector Store. It works sometimes and crashes sometimes.

OMER62 commented 4 months ago

I encountered the same error too

ayseozgun commented 4 months ago

I am running the codes on sagemaker studio.

llama-index 0.10.3
llama-index-agent-openai 0.1.1
llama-index-core 0.10.3
llama-index-embeddings-openai 0.1.1
llama-index-legacy 0.9.48
llama-index-llms-anyscale 0.1.1
llama-index-llms-langchain 0.1.1
llama-index-llms-openai 0.1.1
llama-index-multi-modal-llms-openai 0.1.1
llama-index-program-openai 0.1.1
llama-index-question-gen-openai 0.1.1
llama-index-readers-file 0.1.3

python version 3.10.6

Are the versions correct?

nerdai commented 4 months ago

They look to be correct. But did you install this on a brand new virtual environment?

nerdai commented 4 months ago

I get the same error too when using the BM25 retriever with the pinecone vector Store. It works sometimes and crashes sometimes.

Thanks for the information. Will try with similar setup on my end soon. Before that confirming that you're running on a brand new virtual environment with all LlamaIndex packages?

ayseozgun commented 4 months ago

I started new instance and kernel on sagemaker studio, and installed the llama index. So kinda new env @nerdai

garrettmaring commented 4 months ago

0.10.4

  File "/Users/g/.venvs/engi/lib/python3.11/site-packages/llama_index/core/base/base_retriever.py", line 229, in retrieve
    nodes = self._retrieve(query_bundle)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/g/.venvs/engi/lib/python3.11/site-packages/llama_index/core/indices/vector_store/retrievers/retriever.py", line 94, in _retrieve
    return self._get_nodes_with_embeddings(query_bundle)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/g/.venvs/engi/lib/python3.11/site-packages/llama_index/core/indices/vector_store/retrievers/retriever.py", line 171, in _get_nodes_with_embeddings
    return self._build_node_list_from_query_result(query_result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/g/.venvs/engi/lib/python3.11/site-packages/llama_index/core/indices/vector_store/retrievers/retriever.py", line 162, in _build_node_list_from_query_result
    node_with_scores.append(NodeWithScore(node=node, score=score))
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/g/.venvs/engi/lib/python3.11/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for NodeWithScore
node
  Can't instantiate abstract class BaseNode with abstract methods get_content, get_metadata_str, get_type, hash, set_content (type=type_error)
gingerwizard commented 4 months ago

Resolved for me when i completely cleaned the env.

nerdai commented 4 months ago

I encountered the same error too

Did you install llama-index and other packages on a fresh venv? Any legacy llama-index lingering around will break things.

nerdai commented 4 months ago

Hi @garrettmaring, starting with a clean env has resolved the issue for some. Can you try using a brand new env before installing llama-index and your other llama-index packages?

# pyenv
pyenv virtualenv venv
pyenv activate venv
pip install llama-index

# venv
python -m venv venv
source venv/bin/activate
pip install llama-index

# conda
conda create -n llama python=3.11 anaconda # can use any Python 3.9 - 3.12 
conda activate llama
pip install llama-index
nerdai commented 4 months ago

@ayseozgun: I tested with your set up {llm: Anthropic, embed: Titan, vector_store: pinecone} and it worked for me using a new venv.

requirements.txt

llama-index
llama-index-llms-anthropic
llama-index-embeddings-bedrock
llama-index-vector-stores-pinecone
boto3

create fresh venv and install requirements

pyenv virtualenv venv
pyenv activate venv
pip install -r requirements.txt

(data as downloaded in my previously shared example above)

main.py

import sys
import os

from pinecone import Pinecone, ServerlessSpec
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.vector_stores.pinecone import PineconeVectorStore
from llama_index.llms.anthropic import Anthropic
from llama_index.embeddings.bedrock import BedrockEmbedding

# create index
api_key = os.environ["PINECONE_API_KEY"]
pc = Pinecone(api_key=api_key)
pinecone_index = pc.Index("quickstart")

# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()

# build vector store index
embed_model = BedrockEmbedding.from_credentials(
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
    aws_session_token=os.getenv("AWS_SESSION_TOKEN"),
    aws_region="<region>",
    aws_profile="<profile>"
)
vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store,
    embed_model=embed_model,
)

# add nodes
node_parser = SentenceSplitter.from_defaults()
nodes = node_parser.get_nodes_from_documents(documents)
index.insert_nodes(nodes)

# build query engine
llm = Anthropic()
query_engine = index.as_query_engine(llm=llm)

# query
response = query_engine.query("Tell me something about Paul")
print(response)

After exporting the necessary API Keys, I run the script with usual python main.py and it produces a response

Upserted vectors: 100%|██████████| 22/22 [00:00<00:00, 40.29it/s]
 Unfortunately I do not have enough context to provide a substantive answer about Paul. The passage discusses the history and development of the Lisp programming language, but does not contain information directly relevant to answering a query about Paul. Without additional context about who Paul is or how he relates to Lisp or computer science history, I cannot reliably tell you something about him. I apologize that I cannot be more helpful in answering your query.
nerdai commented 4 months ago

@ayseozgun after the two working examples shared above, and given that @gingerwizard was able to resolve with a fresh env, I am more inclined to think this is a venv issue.

I am not sure how studio works, having never used it myself before, but if you can run pip install on there then maybe you can try uninstalling everything, then installing everything you need

pip freeze | xargs pip uninstall -y
pip install llama-index
pip install llama-index-embeddings-bedrock
etc.

The first line will delete everything, and then you can just reinstall everything you need again.

shankartmv commented 4 months ago

I have been running into this issue quite often over the last few days, but I admit that the issue is sporadic.

ValidationError: 1 validation error for IngestionPipeline
transformations -> 2
  Can't instantiate abstract class TransformComponent with abstract method __call__ (type=type_error)

Here are the versions of the llama-index-* packages installed on my Google Colab.

llama-index-core 0.10.5
llama-index-embeddings-huggingface 0.1.1
llama-index-llms-azure-openai 0.1.1
llama-index-llms-openai 0.1.1
llama-index-readers-file 0.1.3
llama-index-vector-stores-qdrant 0.1.1

Despite multiple attempts at disconnecting and deleting the runtime, I am still running into this error.

Here is the exact piece of code that's triggering the exception.

qdrnt_vector_store = QdrantVectorStore(
    client=qdrnt_client,
    collection_name="1000_pdf_repo",
    batch_size=25,
    enable_hybrid=True,
)
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=1024, chunk_overlap=20),
        TitleExtractor(llm=llm),
        bge_embeddings,
    ],
    vector_store=qdrnt_vector_store,
)
pipeline.run(documents=[pdf_document], show_progress=True)

This piece of code was functional till 15-FEB-24 IST 19:00.

karmakarh commented 4 months ago

I am also facing the same issue, this is consistent for me. Here is my code -

from llama_index.core.llms import LLM
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Settings, StorageContext, get_response_synthesizer
from llama_index.core.node_parser import SentenceSplitter
import chromadb
from llama_index.legacy.vector_stores.chroma import ChromaVectorStore
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.postprocessor import SimilarityPostprocessor


def execute_query(question, index):
    retriever = VectorIndexRetriever(
        index=index,
        similarity_top_k=10,
    )

    response_synthesizer = get_response_synthesizer()
    query_engine = RetrieverQueryEngine(
        retriever=retriever,
        response_synthesizer=response_synthesizer,
        node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
    )
    response = query_engine.query(question)
    # response = index.query(query, exclude_keywords=[""], required_keywords=[""], response_mode="")
    return response


question = "How to Connect to ThoughtSpot with UDH?"
response = execute_query(question, index)
response.print_response_stream()


ValidationError: 1 validation error for NodeWithScore
node
  Can't instantiate abstract class BaseNode with abstract methods get_content, get_metadata_str, get_type, hash, set_content (type=type_error)

Any help in fixing the issue would be of great help.

nerdai commented 4 months ago

@shankar24x7: Would you be able to share the google colab with me? At this point, I have not been able to reproduce this error when using a fresh python environment. The original author of this issue reported here already that this was resolved when using a fresh env.

nerdai commented 4 months ago

@karmakarh:

Did you start with a brand new virtual environment?

karmakarh commented 4 months ago

Yes, I started a brand new venv for the notebook

shankartmv commented 4 months ago

@nerdai I have added you as a collaborator to my notebook in github

nerdai commented 4 months ago

@shankartmv thanks, i tried accepting the invite, but I got a 404 error. Do you mind sending me a new one?

shankartmv commented 4 months ago

@nerdai I have sent you a new one as requested.

nerdai commented 4 months ago

@shankartmv I took a look at your notebook. Your error is not the same as the one posted here: yours involves the TransformComponent class rather than the BaseNode class, which this issue is about. I filed a new issue (#10912) on your behalf. I have the resolution to your issue and will share it there.

garrettmaring commented 4 months ago

hm, i've uninstalled, removed pycache, and created a fresh env on python 3.11 and getting a similar base class error when importing IngestionPipeline

1 validation error for IngestionPipeline
vector_store
  Can't instantiate abstract class BasePydanticVectorStore with abstract methods add, client, delete, query (type=type_error).
garrettmaring commented 4 months ago

ah, the issue was that I was using llama_index.legacy for the vector store index (Pinecone). it's running now with the fixed import 👍

ghassett commented 4 months ago

I am seeing this error today, with a freshly built virtual environment.

I am running the following packages on my Mac M1. A sample code fragment follows.

$ pip freeze | grep llama-    
llama-index==0.10.5
llama-index-agent-openai==0.1.1
llama-index-core==0.10.5
llama-index-embeddings-huggingface==0.1.1
llama-index-embeddings-openai==0.1.1
llama-index-legacy==0.9.48
llama-index-llms-ollama==0.1.1
llama-index-llms-openai==0.1.2
llama-index-multi-modal-llms-openai==0.1.1
llama-index-program-openai==0.1.1
llama-index-question-gen-openai==0.1.1
llama-index-readers-file==0.1.3
llama-index-readers-web==0.1.5

Here is sample code that results in the exception:

from dotenv import load_dotenv
from llama_index.core import VectorStoreIndex, Settings, StorageContext
from llama_index.legacy.vector_stores import ChromaVectorStore
from llama_index.llms.openai import OpenAI
from llama_index.readers.web import SimpleWebPageReader

import chromadb

load_dotenv('environments/developer/greg.env') # get my openai key

loader = SimpleWebPageReader()
documents = loader.load_data(urls=['https://en.wikipedia.org/wiki/Abraham_Lincoln'])

Settings.llm = OpenAI()

db = chromadb.PersistentClient(path='./chromadb')
chroma_collection = db.get_or_create_collection('sample-collection')
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
query_engine = index.as_query_engine()

response = query_engine.query("What is this article about?")
print(f"RESPONSE:\n{response}")
nerdai commented 4 months ago

@ghassett please don't mix llama_index.legacy imports with the new structure. It's either you go for legacy completely or use new structure completely.

For your case, you need to install llama-index-vector-stores-chroma

And import using:

from llama_index.vector_stores.chroma import ChromaVectorStore
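To spot this kind of mix-up in a script before running it, the imports can be scanned statically. The following is only a sketch under stated assumptions: it uses the standard library's `ast` module, the function names `llama_import_roots` and `mixes_legacy_and_current` are invented for this example, and it only recognises plain `import x` and `from x import y` statements (a `from llama_index import legacy` would be missed).

```python
import ast


def llama_import_roots(source: str) -> dict[str, set[str]]:
    """Split the llama_index modules imported by `source` into legacy and current sets."""
    groups = {"legacy": set(), "current": set()}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules = [node.module]
        else:
            continue
        for module in modules:
            if module == "llama_index.legacy" or module.startswith("llama_index.legacy."):
                groups["legacy"].add(module)
            elif module == "llama_index" or module.startswith("llama_index."):
                groups["current"].add(module)
    return groups


def mixes_legacy_and_current(source: str) -> bool:
    """True when the source imports from both the legacy and the current namespaces."""
    groups = llama_import_roots(source)
    return bool(groups["legacy"]) and bool(groups["current"])


sample = (
    "from llama_index.core import VectorStoreIndex\n"
    "from llama_index.legacy.vector_stores import ChromaVectorStore\n"
)
print(mixes_legacy_and_current(sample))  # True
```

For a real file, pass its contents, for example `mixes_legacy_and_current(open("main.py").read())`.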

ghassett commented 4 months ago

That did the trick! Thank you, I'll make sure not to import from legacy packages.

nerdai commented 4 months ago

Closing this issue as the original author has reported a fresh env has resolved the problem.

Recommendation:

# if using the python global interpreter
pip uninstall llama-index
pip install llama-index --upgrade

Or with brand new venv

python -m venv venv
source venv/bin/activate
pip install llama-index
pip install <other llama-index packages>
karmakarh commented 4 months ago

In my case the issue was a mix of legacy package in my code. When I removed legacy package and imported new package, it resolved my issue.

amindadgar commented 3 months ago

I had a similar issue and was forced to use the legacy version alongside the newer version (because of another service, Apache Airflow). So to fix it, I rewrote the retriever myself and made part of it look like this:

            node_new = Node.from_dict(node.to_dict())
            node_with_score = NodeWithScore(node=node_new, score=score)
            nodes_with_scores.append(node_with_score)

I didn't investigate why this was happening (maybe the old NodeWithScore had differences with newer one). In any case, it did fix the issue in my code. @logan-markewich I'm happy to open a PR to just do this change so no more people encounter this problem.
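The idea behind that workaround, rebuilding the object as the type the consumer expects by going through a plain dict, can be sketched without LlamaIndex. `LegacyNode` and `CoreNode` below are hypothetical dataclasses standing in for the two node types; they are not the library's classes, and the real `to_dict`/`from_dict` may carry more fields.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class LegacyNode:
    """Hypothetical stand-in for a node class imported from a legacy package."""

    text: str
    metadata: dict = field(default_factory=dict)

    def to_dict(self) -> dict:
        return asdict(self)


@dataclass
class CoreNode:
    """Hypothetical stand-in for the same-shaped class from the current package."""

    text: str
    metadata: dict = field(default_factory=dict)

    @classmethod
    def from_dict(cls, data: dict) -> "CoreNode":
        return cls(**data)


legacy = LegacyNode(text="hello", metadata={"source": "example"})

# Round-trip through a plain dict so the result is an instance of the expected class.
converted = CoreNode.from_dict(legacy.to_dict())

print(isinstance(converted, CoreNode))  # True
print(converted.text)                   # hello
```

This only works while both classes serialise to compatible dicts, which is one reason to treat it as a stopgap rather than a fix.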

logan-markewich commented 3 months ago

@amindadgar the issue is not the source code, the issue is mixing legacy with non-legacy imports

isinstance checks and similar break if the types are mixed from legacy and non legacy imports
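A small, library-free illustration of that point: two classes can share a name and a definition and still be unrelated types as far as `isinstance` is concerned. In this sketch `make_node_class` is a hypothetical helper that simulates the same class being defined in two different packages.

```python
def make_node_class():
    """Build a fresh class named TextNode, as if the same source lived in two packages."""

    class TextNode:
        def __init__(self, text: str) -> None:
            self.text = text

    return TextNode


LegacyTextNode = make_node_class()  # stands in for the legacy import
CoreTextNode = make_node_class()    # stands in for the non-legacy import

node = LegacyTextNode("hello")

print(LegacyTextNode.__name__ == CoreTextNode.__name__)  # True
print(LegacyTextNode is CoreTextNode)                    # False
print(isinstance(node, CoreTextNode))                    # False
```

A validator that expects the non-legacy type sees the legacy instance as a foreign object, which is consistent with the errors reported earlier in this thread.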

hvico commented 3 months ago

Hi. I am having the same issue because of the mixup of new/legacy imports:

from llama_index.legacy.retrievers import BM25Retriever
from llama_index.core.retrievers import QueryFusionRetriever

I am trying to follow / migrate this example to the new API:

https://medium.aiplanet.com/setting-up-query-pipeline-for-advanced-rag-workflow-using-llamaindex-666ddd7d0d41

I don't see the BM25Retriever inside the new import structure; the class is only defined in the legacy branch. So how am I supposed to load BM25Retriever in version 0.10.x?

Thanks in advance.

logan-markewich commented 3 months ago

@hvico it's its own package; every integration is a package.

It's listed in the docs and in https://llamahub.ai

pip install llama-index-retrievers-bm25

from llama_index.retrievers.bm25 import BM25Retriever

hvico commented 3 months ago

Thanks, got it!