pydantic / pydantic-settings

Settings management using pydantic
https://docs.pydantic.dev/latest/usage/pydantic_settings/
MIT License

Settings parsing breaks with complex type #311

Closed ducminh-phan closed 3 months ago

ducminh-phan commented 3 months ago

Minimal example

main.py:

from typing import Annotated, Literal

from pydantic import BaseModel, Field, Json
from pydantic_settings import BaseSettings, SettingsConfigDict

class AWSS3(BaseModel):
    type: Literal["aws"]

class AzureBlobStorage(BaseModel):
    type: Literal["azure"]

Storage = Annotated[
    AWSS3 | AzureBlobStorage,
    Field(discriminator="type"),
]

class Settings(BaseSettings):
    storage: Json[Storage] | None = None

    model_config = SettingsConfigDict(
        env_file=".env",
    )

settings = Settings()
print(settings)

.env:

STORAGE='{
    "type": "aws"
}'

With pydantic@2.7.3 and pydantic-settings@2.2.1, the script main.py produces this output:

storage=AWSS3(type='aws')

With pydantic@2.7.3 and pydantic-settings@2.3.2, the script main.py produces the following error:

Traceback (most recent call last):
  File "/.../venv/lib/python3.10/site-packages/pydantic_settings/sources.py", line 368, in __call__
    field_value, field_key, value_is_complex = self.get_field_value(field, field_name)
  File "/.../venv/lib/python3.10/site-packages/pydantic_settings/sources.py", line 534, in get_field_value
    for field_key, env_name, value_is_complex in self._extract_field_info(field, field_name):
  File "/.../venv/lib/python3.10/site-packages/pydantic_settings/sources.py", line 273, in _extract_field_info
    elif origin_is_union(get_origin(field.annotation)) and _union_is_complex(field.annotation, field.metadata):
  File "/.../venv/lib/python3.10/site-packages/pydantic_settings/sources.py", line 1621, in _union_is_complex
    return any(_annotation_is_complex(arg, metadata) for arg in get_args(annotation))
  File "/.../venv/lib/python3.10/site-packages/pydantic_settings/sources.py", line 1621, in <genexpr>
    return any(_annotation_is_complex(arg, metadata) for arg in get_args(annotation))
  File "/.../venv/lib/python3.10/site-packages/pydantic_settings/sources.py", line 1600, in _annotation_is_complex
    inner, meta = get_args(annotation)
ValueError: too many values to unpack (expected 2)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/.../main.py", line 29, in <module>
    settings = Settings()
  File "/.../venv/lib/python3.10/site-packages/pydantic_settings/main.py", line 141, in __init__
    **__pydantic_self__._settings_build_values(
  File "/.../venv/lib/python3.10/site-packages/pydantic_settings/main.py", line 311, in _settings_build_values
    return deep_update(*reversed([source() for source in sources]))
  File "/.../venv/lib/python3.10/site-packages/pydantic_settings/main.py", line 311, in <listcomp>
    return deep_update(*reversed([source() for source in sources]))
  File "/.../venv/lib/python3.10/site-packages/pydantic_settings/sources.py", line 370, in __call__
    raise SettingsError(
pydantic_settings.sources.SettingsError: error getting value for field "storage" from source "EnvSettingsSource"
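
The last frame of the first traceback unpacks `get_args(annotation)` into exactly two names. A minimal standard-library sketch of why that can fail is given below. It assumes that `Json[Storage]` expands to an `Annotated` type that carries its own metadata item in addition to the `Field(discriminator="type")` already attached to `Storage`; the plain strings are hypothetical stand-ins for those metadata objects, and the star-unpacking at the end is only one tolerant alternative, not necessarily the change made in the library.

```python
from typing import Annotated, Union, get_args

# Hypothetical stand-ins: plain strings replace the real metadata objects.
Storage = Annotated[Union[int, str], "discriminator-field"]
JsonStorage = Annotated[Storage, "json-marker"]

# Nested Annotated types are flattened, so the result holds
# one underlying type followed by two metadata items.
args = get_args(JsonStorage)
print(len(args))

try:
    inner, meta = args  # same two-name unpacking as in the traceback
except ValueError as exc:
    unpack_error = str(exc)
    print(unpack_error)

# Star-unpacking accepts any number of metadata items.
inner, *meta = args
print(inner, meta)
```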

hramezani commented 3 months ago

Thanks @ducminh-phan for reporting this.

I created https://github.com/pydantic/pydantic-settings/pull/312 to fix the problem.

Could you please confirm?

ducminh-phan commented 3 months ago

@hramezani Thanks for the quick fix. The code on the branch issue-311 works for me.

hramezani commented 3 months ago

The fix has been released in pydantic-settings 2.3.3

king-phyte commented 3 months ago

@hramezani Unless I am doing something wrong (maybe I am), this is broken for more than just annotated complex types like discriminated unions. I am trying to parse a "json" environment variable (using v2.2.1) as JSON or dict, and it was failing. I upgraded to 2.3.3 just now to test this fix, and it still does not work.

Sample definition provided below:

.env file

GOOGLE_APPLICATION_CREDENTIALS='{
    "type": "service_account",
    "project_id": "project-id",
    "private_key_id": "fc5ba0",
    "private_key": "-----BEGIN PRIVATE KEY-----\n-<snip>=\n-----END PRIVATE KEY-----\n",
    "client_email": "123456789-compute@developer.gserviceaccount.com",
    "client_id": "123456789987654321",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/12345679-compute%40developer.gserviceaccount.com",
    "universe_domain": "googleapis.com"
}'

config.py

import pydantic
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    GOOGLE_APPLICATION_CREDENTIALS: pydantic.Json | None = None

    model_config = SettingsConfigDict(
        env_file=".env", extra="ignore", env_ignore_empty=True
    )

Error message:

  File "/Users/king/Library/Caches/pypoetry/virtualenvs/edok-backend-xjdhYfao-py3.12/lib/python3.12/site-packages/pydantic/main.py", line 176, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Settings
GOOGLE_APPLICATION_CREDENTIALS
  Invalid JSON: expected value at line 1 column 1 [type=json_invalid, input_value="'{", input_type=str]
    For further information visit https://errors.pydantic.dev/2.7/v/json_invalid

ducminh-phan commented 3 months ago

@king-phyte The env var name in the .env file is GOOGLE_SERVICE_ACCOUNT_KEY, while the field name in Settings is GOOGLE_APPLICATION_CREDENTIALS. Maybe GOOGLE_APPLICATION_CREDENTIALS is already set as an environment variable? Changing the field name in Settings to GOOGLE_SERVICE_ACCOUNT_KEY works for me.

king-phyte commented 3 months ago

@ducminh-phan That was a mistake I made when copying the modified env vars. Thanks for drawing my attention to it. In the actual code, the name is set correctly. You can see that Pydantic tries to parse the value, but it can't parse the multi-line string. It only sees the '{:

GOOGLE_APPLICATION_CREDENTIALS Invalid JSON: expected value at line 1 column 1 [type=json_invalid, input_value="'{", input_type=str]

ducminh-phan commented 3 months ago

@king-phyte The code still works for me. Have you checked that you don't already have a GOOGLE_APPLICATION_CREDENTIALS environment variable set? Could you check os.environ?
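
The check matters because pydantic-settings gives real environment variables priority over values read from the env_file. A minimal sketch of such a check follows; `describe_env_var` is a hypothetical helper name, and it prints only the length and a short prefix so that a credential is not echoed in full.

```python
import os


def describe_env_var(name: str) -> str:
    """Report whether `name` is set in the process environment (hypothetical helper)."""
    value = os.environ.get(name)
    if value is None:
        return f"{name} is not set in os.environ"
    # Show only the length and a short prefix to avoid leaking a secret.
    return f"{name} is set in os.environ: {len(value)} chars, starts with {value[:10]!r}"


print(describe_env_var("GOOGLE_APPLICATION_CREDENTIALS"))
```

If the variable turns out to be set, and its prefix is `'{`, the value reaching the validator came from the environment rather than from the .env file.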

king-phyte commented 3 months ago

@ducminh-phan It is not set. In fact, the variable is parsed correctly if it is all on one line.

So this works

GOOGLE_APPLICATION_CREDENTIALS='{"type": "service_account","project_id": "project-id","private_key_id": "fc5ba0","private_key": "-----BEGIN PRIVATE KEY-----\n-<snip>=\n-----END PRIVATE KEY-----\n","client_email":"123456789-compute@developer.gserviceaccount.com","client_id": "123456789987654321","auth_uri":"https://accounts.google.com/o/oauth2/auth","token_uri": "https://oauth2.googleapis.com/token","auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs","client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/12345679-compute%40developer.gserviceaccount.com","universe_domain": "googleapis.com"}'

But when it is multi-line as shown above, it does not. Maybe I am doing something wrong indeed.
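
The thread does not establish the cause. One way to narrow it down is to look at the reported `input_value="'{"`, which is exactly the text after `=` on the first line of the multi-line entry. The standard-library sketch below shows that a loader which treats each line as a separate KEY=VALUE pair and does no quote handling would produce that value, and that it is not valid JSON. `naive_parse` is a hypothetical stand-in for such a loader, and whether anything like it ran in this setup is an assumption, not something confirmed above.

```python
import json

env_text = """GOOGLE_APPLICATION_CREDENTIALS='{
    "type": "service_account"
}'
"""


def naive_parse(text: str) -> dict[str, str]:
    """Hypothetical loader: one KEY=VALUE per line, no quote or multi-line handling."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        # Lines without "=" (the JSON body and the closing quote) are skipped.
        if sep and key.strip().isidentifier():
            result[key.strip()] = value
    return result


raw = naive_parse(env_text)["GOOGLE_APPLICATION_CREDENTIALS"]
print(repr(raw))

try:
    json.loads(raw)
except json.JSONDecodeError as exc:
    error_text = str(exc)
    print(error_text)
```

If the value seen by Pydantic matches `raw` here, it would be worth checking what else reads the .env file before the settings class is instantiated.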