run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License
33.38k stars, 4.67k forks

[Feature Request]: default to pydantic v2 #13477

Open niderhoff opened 1 month ago

niderhoff commented 1 month ago

Feature Description

By default, llama_index uses pydantic v1 Models. This causes problems when users try to use llamaindex pydantic Models inside their own applications, which are most likely built on the pydantic v2 API.

For example, I am using the current FastAPI with pydantic v2.7.1, and I cannot use llamaindex Models in my API because of this incompatibility. I have to copy the llamaindex Models manually and re-create them with pydantic v2 in my code, which goes against the DRY principle and adds maintenance effort.

https://github.com/run-llama/llama_index/blob/133cc30bae32a7e23372ccd08aba8fec5cec2bea/llama-index-core/llama_index/core/bridge/pydantic.py#L2-L17
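
To make the mismatch concrete, here is a minimal sketch. It assumes pydantic 2.x is installed (which ships the legacy API under `pydantic.v1`) and returns None otherwise. `LibraryNode` and `AppNode` are hypothetical stand-ins, not actual llama_index classes. It shows that a v1-style Model is not a v2 `BaseModel` and lacks v2 methods such as `model_dump`, so the only way across is to mirror the fields by hand.

```python
def compare_pydantic_apis():
    """Return observations about v1 vs. v2 Models, or None without pydantic 2.x."""
    try:
        import pydantic
        from pydantic import BaseModel as V2BaseModel
        from pydantic.v1 import BaseModel as V1BaseModel
    except ImportError:
        return None
    if not pydantic.VERSION.startswith("2."):
        return None

    # Hypothetical stand-in for a library Model built on the v1 API.
    class LibraryNode(V1BaseModel):
        text: str
        score: float = 0.0

    # Hypothetical application-side copy of the same fields on the v2 API.
    class AppNode(V2BaseModel):
        text: str
        score: float = 0.0

    lib_node = LibraryNode(text="hello", score=0.5)
    return {
        # The two BaseModel classes are unrelated types.
        "v1_is_v2_subclass": issubclass(LibraryNode, V2BaseModel),
        # v1 Models expose .dict()/.json(), not the v2 model_* methods.
        "v1_has_model_dump": hasattr(lib_node, "model_dump"),
        # Manual bridge: dump with the v1 API, re-validate with the v2 API.
        "roundtrip": AppNode.model_validate(lib_node.dict()).model_dump(),
    }


print(compare_pydantic_apis())
```

The `roundtrip` step is the duplication described above: every library Model needs a hand-maintained v2 twin.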

Reason

Considering that pydantic v1 has been deprecated for a long time, wouldn't it make sense to upgrade to the pydantic v2 API?

Value of Feature

Better compatibility with the current state of the Python ecosystem. Easier integration with other libraries.

DarkLight1337 commented 1 month ago

Pydantic v2 is also much faster than v1. I have noticed that Pydantic v1 takes a long time to validate the List[List[float]] embeddings inside EmbeddingEndEvent as well as when setting the embedding attribute of an existing BaseNode. Perhaps v2 would help alleviate this issue without the need to disable validation altogether.
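
For anyone who wants to measure this on their own machine, here is a rough timing sketch. It again assumes pydantic 2.x is installed (returns None otherwise). `V1Embeddings` and `V2Embeddings` are hypothetical Models with a single List[List[float]] field, not the real EmbeddingEndEvent or BaseNode, and the sizes are arbitrary.

```python
import timeit
from typing import List


def time_embedding_validation(rows=200, dim=256, repeats=3):
    """Time List[List[float]] validation under both APIs (best of `repeats`)."""
    try:
        import pydantic
        from pydantic import BaseModel as V2BaseModel
        from pydantic.v1 import BaseModel as V1BaseModel
    except ImportError:
        return None
    if not pydantic.VERSION.startswith("2."):
        return None

    # Hypothetical Models; only the field type matches the case in question.
    class V1Embeddings(V1BaseModel):
        embeddings: List[List[float]]

    class V2Embeddings(V2BaseModel):
        embeddings: List[List[float]]

    data = [[float(i + j) for j in range(dim)] for i in range(rows)]

    # Each run constructs one Model, which validates every float in `data`.
    v1_seconds = min(
        timeit.repeat(lambda: V1Embeddings(embeddings=data), number=1, repeat=repeats)
    )
    v2_seconds = min(
        timeit.repeat(lambda: V2Embeddings(embeddings=data), number=1, repeat=repeats)
    )
    return {"v1_seconds": v1_seconds, "v2_seconds": v2_seconds}


print(time_embedding_validation())
```

The numbers depend on the machine and on the installed pydantic version, so treat them as a rough indication only.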

parnell commented 1 month ago

Just chiming in that this is becoming an increasingly large problem as more and more people and libraries use Pydantic v2.