aws / sagemaker-distribution

A set of Docker images that include popular frameworks for machine learning, data science and visualization.
Apache License 2.0

Inconsistent pydantic versions breaking with LangChain/LiteLLM/OpenAI #436

Open athewsey opened 5 months ago

athewsey commented 5 months ago

Category

Compatibility Issue

🐛 Describe the bug

I ran into the issue described here today: trying to use litellm>=1.35.8,<2 on SageMaker Studio Distribution v1.8 fails with:

[Rest of stack trace abbreviated]

File /opt/conda/lib/python3.10/site-packages/openai/_models.py:21
      7 from typing_extensions import (
      8     Unpack,
      9     Literal,
   (...)
     17     runtime_checkable,
     18 )
     20 import pydantic
---> 21 import pydantic.generics
     22 from pydantic.fields import FieldInfo
     24 from ._types import (
     25     Body,
     26     IncEx,
   (...)
     33     HttpxRequestFiles,
     34 )

File /opt/conda/lib/python3.10/site-packages/pydantic/generics.py:2
      1 """The `generics` module is a backport module from V1."""
----> 2 from ._migration import getattr_migration
      4 __getattr__ = getattr_migration(__name__)

File /opt/conda/lib/python3.10/site-packages/pydantic/_migration.py:4
      1 import sys
      2 from typing import Any, Callable, Dict
----> 4 from .version import version_short
      6 MOVED_IN_V2 = {
      7     'pydantic.utils:version_info': 'pydantic.version:version_info',
      8     'pydantic.error_wrappers:ValidationError': 'pydantic:ValidationError',
   (...)
     13     'pydantic.generics:GenericModel': 'pydantic.BaseModel',
     14 }
     16 DEPRECATED_MOVED_IN_V2 = {
     17     'pydantic.tools:schema_of': 'pydantic.deprecated.tools:schema_of',
     18     'pydantic.tools:parse_obj_as': 'pydantic.deprecated.tools:parse_obj_as',
   (...)
     28     'pydantic.config:Extra': 'pydantic.deprecated.config:Extra',
     29 }

ImportError: cannot import name 'version_short' from 'pydantic.version' (/opt/conda/lib/python3.10/site-packages/pydantic/version.cpython-310-x86_64-linux-gnu.so)

Weirdly, I get different results in SageMaker for %pip show pydantic (v1.10.14) versus %conda list pydantic (v2.7.0).

Sure enough, trying to import pydantic.generics from a notebook fails with the above-mentioned error - including the Pydantic source location under /opt/conda.

From the stack trace it must be picking up v2.7.0 (Compare pydantic/generics.py @ 2.7.0 vs pydantic/generics.py @ 1.10.14) - but version.version_short should actually exist @ 2.7.0.

...And if I run the following from the same notebook:

import pydantic
pydantic.__version__

...It reports '1.10.14'!
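A check like the one above can be extended to show, side by side, what the package metadata claims and what the import system actually loads. The following is only a minimal sketch: `describe_install` is a hypothetical helper name, and it assumes the distribution name and the import name are the same unless told otherwise (which is the case for pydantic).

```python
import importlib
from importlib import metadata


def describe_install(dist_name, module_name=None):
    """Compare the version recorded in package metadata with the imported module."""
    module_name = module_name or dist_name
    report = {
        "metadata_version": None,  # what pip-style metadata (dist-info) claims
        "module_version": None,    # what the imported code reports as __version__
        "module_file": None,       # where the imported code lives on disk
        "import_error": None,      # repr of the ImportError, if the import failed
    }
    try:
        report["metadata_version"] = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        pass  # no metadata found for this distribution
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        report["import_error"] = repr(exc)
    else:
        report["module_version"] = getattr(module, "__version__", None)
        report["module_file"] = getattr(module, "__file__", None)
    return report
```

Calling `describe_install("pydantic")` in the notebook kernel, and `describe_install("pydantic", "pydantic.generics")` for the failing submodule, should make any mismatch between the two version fields visible. If they disagree, the metadata and the files on disk probably come from different installs.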

If I %pip install pydantic==1.10.14, pip detects that the version is installed so there's nothing to do. If I restart the kernel, I get the same (reporting 1.10.14, erroring on import) behaviour as above.

...But if I run %pip install --force-reinstall pydantic==1.10.14 and restart the kernel, the ImportError gets resolved.

Can you guess what happens if I %pip install pydantic==2.7.0 and restart the kernel?

Well pydantic.__version__ == '2.7.0'... but, import pydantic.generics works just fine! 😭 This is true even on a fresh container where I hadn't run the force-reinstall of 1.10.14.

I even tried %conda install pydantic==2.7.0 --force-reinstall, but it just spins forever and then fails with PackagesNotFoundError: The following packages are not available from current channels for a looooong list of packages.

Based on all this - particularly the fact that installing either 1.10.14 or 2.7.0 seems to work - it really seems like something is wrong with the pydantic installation in SM Distribution: Maybe the two versions got installed on top of each other in the same location somehow and have conflicting files?
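One way to test the "two versions in the same location" hypothesis is to list every metadata record that Python can see for the package. This is a rough sketch, not a definitive diagnostic: `installed_versions` and `normalize` are hypothetical helper names, and it assumes that both the pip-installed and conda-installed copies left a `.dist-info` (or `.egg-info`) directory on `sys.path`.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalise a distribution name so 'pydantic_core' matches 'pydantic-core'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_versions(dist_name, dists=None):
    """Return (version, location) for every metadata record matching dist_name.

    `dists` exists only so the function can be exercised with stand-in objects;
    by default every distribution visible on sys.path is scanned.
    """
    wanted = normalize(dist_name)
    found = []
    for dist in (metadata.distributions() if dists is None else dists):
        try:
            name = dist.metadata.get("Name")
        except Exception:
            continue  # skip records whose metadata cannot be read
        if name and normalize(name) == wanted:
            found.append((dist.version, str(dist.locate_file(""))))
    return found
```

If `installed_versions("pydantic")` returns more than one entry for the same location, two installs have left metadata side by side, which would be consistent with their files having been written over each other.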

🐛 Describe the expected behavior

It'd be really useful if there were exactly one version of pydantic visible to the Python 3 kernel, with installation mechanics that made some sense, and ideally a version that correctly supports import pydantic.generics so that LangChain/LiteLLM/OpenAI libraries can work properly. 😅

It could just be that I'm missing something about how conda & pip are meant to work together in this environment? In that case I would love to learn how to deal with it properly, or to see it documented somewhere easy to find!

Image Tags

SageMaker Studio Distribution v1.8 (2024-06-11)

athewsey commented 4 months ago

Just came across this issue again on SM Distribution v1.9 with a different tool: AWSLabs' agent-evaluation.

%pip install agent-evaluation

In this case %pip show pydantic yields 2.7.3 and %pip show pydantic-core shows 2.18.4 (those two are a valid combination)

...But running import pydantic.version; print(pydantic.version.version_info()) discovers pydantic at 2.8.2, which is incompatible with the installed pydantic-core and so breaks whenever anything tries to use Pydantic - even with a simple class like:

from pydantic import BaseModel

class Test(BaseModel, validate_assignment=True):
    name: str
    steps: list[str]

The resulting error message (with this test class or whenever I try to use agent-evaluation on SageMaker) is:

TypeError: list_schema() got an unexpected keyword argument 'fail_fast'

...Because Pydantic v2.8 introduced support for FailFast validation and the installed pydantic-core version is not compatible.

Please can somebody help us understand why these inconsistent versions are showing up in the SM Distribution container and how to properly install software to avoid these issues?

For now, I'm working around by explicitly re-installing "pydantic>=2.8,<3"

aws-tianquaw commented 3 months ago

Hi @athewsey, thanks for reporting the issue! SageMaker Distribution images use micromamba to resolve dependencies. In image v1.8 and v1.9, other packages (in this case, autogluon-0.x) require pydantic==1.x. Therefore, we are using pydantic==1.10.x and pydantic_core==2.18.x, which are compatible.

When you install agent-evaluation, you upgrade both pydantic and pydantic-core, which causes the compatibility issue. I'd suggest either installing with the constraint pydantic>=2.8,<3 or reaching out to the agent-evaluation package owners and asking them to fix their requirements.

Please let me know if you have more questions, thanks!

aws-tianquaw commented 3 months ago

For the pydantic version discrepancy issue, it looks like the package has different sets of releases on PyPI and conda-forge. On conda-forge the latest version is 1.10.x, but on PyPI the latest seems to be 2.8.2. SageMaker Distribution relies on a Conda-like tool to resolve and install dependencies. If you install a new dependency with pip install, you might sometimes need to remove the original package installed with conda to avoid compatibility issues.

raghuvv commented 3 months ago

I have faced the same issue with the latest SageMaker Distribution (1.9) when simply trying to pip install openai and import it in my notebook.

Error:

> ImportError: cannot import name 'version_short' from 'pydantic.version' (/opt/conda/lib/python3.10/site-packages/pydantic/version.cpython-310-x86_64-linux-gnu.so)

Workaround:

On the terminal.

> pip uninstall pydantic
 ... Successfully uninstalled pydantic-1.10.16

> pip install pydantic==1.10.8
> ... Found existing installation: pydantic 2.7.3
     Uninstalling pydantic-2.7.3:
      Successfully uninstalled pydantic-2.7.3
Successfully installed pydantic-1.10.8

Would be curious to know the correct / recommended way to install openai library on sagemaker studio space. Thanks!