pydantic / pydantic-settings

Settings management using pydantic
https://docs.pydantic.dev/latest/usage/pydantic_settings/
MIT License

Override default file in Yaml/Toml/Json ConfigSettingsSource at runtime. #259

Open theelderbeever opened 6 months ago

theelderbeever commented 6 months ago

I would like to override the default config file path used in a file settings source. In the example below, the default YAML file that is read is /etc/config.yaml; however, I often want to pass the config file location at application startup via a CLI arg. Is there an ergonomic way to update this at runtime?

import os
from typing import Tuple, Type

from pydantic import BaseModel, Field

from pydantic_settings import (
    BaseSettings,
    PydanticBaseSettingsSource,
    YamlConfigSettingsSource,
    EnvSettingsSource,
)

class Nested(BaseModel):
    nested_field: str = Field(default=...)

class Settings(BaseSettings):
    foobar: str = Field(default=...)
    nested: Nested = Field(default=...)

    @classmethod
    def settings_customise_sources(
        cls, settings_cls: Type[BaseSettings], **kwargs
    ) -> Tuple[PydanticBaseSettingsSource, ...]:
        return (
            EnvSettingsSource(
                settings_cls,
                env_prefix="APP__",
                env_nested_delimiter="__",
                case_sensitive=False,
            ),
            YamlConfigSettingsSource(settings_cls, yaml_file="/etc/config.yaml"),
        )

print(Settings().model_dump())

hramezani commented 6 months ago

Thanks @theelderbeever for reporting this.

Right now there is no clear way to pass the file. We would have to make it possible to pass the file like Settings(_yaml_file='<file_name>'), but that would be a breaking change, so we have to postpone it until V3.

BTW, you can do it in a hacky way like:

import os
import sys
from typing import Tuple, Type

from pydantic import BaseModel, Field

from pydantic_settings import (
    BaseSettings,
    PydanticBaseSettingsSource,
    YamlConfigSettingsSource,
    EnvSettingsSource,
)

YAML_FILE_PATH = None

class Nested(BaseModel):
    nested_field: str = Field(default=...)

class Settings(BaseSettings):
    foobar: str = Field(default=...)
    nested: Nested = Field(default=...)

    @classmethod
    def settings_customise_sources(
        cls, settings_cls: Type[BaseSettings], **kwargs
    ) -> Tuple[PydanticBaseSettingsSource, ...]:
        global YAML_FILE_PATH

        return (
            EnvSettingsSource(
                settings_cls,
                env_prefix="APP__",
                env_nested_delimiter="__",
                case_sensitive=False,
            ),
            YamlConfigSettingsSource(settings_cls, yaml_file=YAML_FILE_PATH),
        )

YAML_FILE_PATH = sys.argv[1]
print(Settings().model_dump())

theelderbeever commented 6 months ago

@hramezani thanks for the response and thanks for putting it on the roadmap. In my current application I am able to get away with YamlConfigSettingsSource(settings_cls, yaml_file=os.getenv("APPLICATION_CONFIG_PATH", "config.yaml")) so I am going to run with that.

Thanks!

A-Telfer commented 5 months ago

I didn't want to use globals because I was planning to use this in a bunch of places, so I copied the BaseSettings code and added the extra arguments. Using it then just becomes:

from typing import Literal

from pydantic_settings import SettingsConfigDict

from .base_settings import BaseSettings # This is where I put the edited base settings

class Config(BaseSettings):
    version: Literal["0.0"] = "0.0"
    message: str

    model_config = SettingsConfigDict()

Config(_yaml_file="something.yaml")

Base Settings

from __future__ import annotations as _annotations

from pathlib import Path
from typing import Any, ClassVar

from pydantic._internal._utils import deep_update
from pydantic.main import BaseModel
from pydantic_settings import SettingsConfigDict
from pydantic_settings.sources import (ENV_FILE_SENTINEL, DotEnvSettingsSource,
                                       DotenvType, EnvSettingsSource,
                                       InitSettingsSource,
                                       JsonConfigSettingsSource, PathType,
                                       PydanticBaseSettingsSource,
                                       SecretsSettingsSource,
                                       YamlConfigSettingsSource)

class BaseSettings(BaseModel):
    """
    Base class for settings, allowing values to be overridden by environment variables.

    This is useful in production for secrets you do not wish to save in code, it plays nicely with docker(-compose),
    Heroku and any 12 factor app design.

    All the below attributes can be set via `model_config`.

    Args:
        _case_sensitive: Whether environment variable names should be read with case-sensitivity. Defaults to `None`.
        _env_prefix: Prefix for all environment variables. Defaults to `None`.
        _env_file: The env file(s) to load settings values from. Defaults to `Path('')`, which
            means that the value from `model_config['env_file']` should be used. You can also pass
            `None` to indicate that environment variables should not be loaded from an env file.
        _env_file_encoding: The env file encoding, e.g. `'latin-1'`. Defaults to `None`.
        _env_ignore_empty: Ignore environment variables where the value is an empty string. Defaults to `False`.
        _env_nested_delimiter: The nested env values delimiter. Defaults to `None`.
        _env_parse_none_str: The env string value that should be parsed (e.g. "null", "void", "None", etc.)
            into `None` type(None). Defaults to `None` type(None), which means no parsing should occur.
        _env_parse_enums: Parse enum field names to values. Defaults to `None`, which means no parsing should occur.
        _secrets_dir: The secret files directory. Defaults to `None`.
        _json_file: The JSON file to load settings values from. Defaults to `None`.
        _json_file_encoding: The JSON file encoding. Defaults to `None`.
        _yaml_file: The YAML file to load settings values from. Defaults to `None`.
        _yaml_file_encoding: The YAML file encoding. Defaults to `None`.
    """

    def __init__(
        __pydantic_self__,
        _case_sensitive: bool | None = None,
        _env_prefix: str | None = None,
        _env_file: DotenvType | None = ENV_FILE_SENTINEL,
        _env_file_encoding: str | None = None,
        _env_ignore_empty: bool | None = None,
        _env_nested_delimiter: str | None = None,
        _env_parse_none_str: str | None = None,
        _env_parse_enums: bool | None = None,
        _secrets_dir: str | Path | None = None,
        _json_file: PathType | None = None,
        _json_file_encoding: str | None = None,
        _yaml_file: PathType | None = None,
        _yaml_file_encoding: str | None = None,
        **values: Any,
    ) -> None:
        # Uses something other than `self` as the first arg to allow "self" as a settable attribute
        super().__init__(
            **__pydantic_self__._settings_build_values(
                values,
                _case_sensitive=_case_sensitive,
                _env_prefix=_env_prefix,
                _env_file=_env_file,
                _env_file_encoding=_env_file_encoding,
                _env_ignore_empty=_env_ignore_empty,
                _env_nested_delimiter=_env_nested_delimiter,
                _env_parse_none_str=_env_parse_none_str,
                _env_parse_enums=_env_parse_enums,
                _secrets_dir=_secrets_dir,
                _json_file=_json_file,
                _json_file_encoding=_json_file_encoding,
                _yaml_file=_yaml_file,
                _yaml_file_encoding=_yaml_file_encoding,
            )
        )

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        yaml_settings: PydanticBaseSettingsSource,
        json_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> tuple[PydanticBaseSettingsSource, ...]:
        """
        Define the sources and their order for loading the settings values.

        Args:
            settings_cls: The Settings class.
            init_settings: The `InitSettingsSource` instance.
            env_settings: The `EnvSettingsSource` instance.
            dotenv_settings: The `DotEnvSettingsSource` instance.
            yaml_settings: The `YamlConfigSettingsSource` instance.
            json_settings: The `JsonConfigSettingsSource` instance.
            file_secret_settings: The `SecretsSettingsSource` instance.

        Returns:
            A tuple containing the sources and their order for loading the settings values.
        """
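        # json_settings is included so that `_json_file` takes effect; it has the lowest priority here.
        return init_settings, env_settings, dotenv_settings, file_secret_settings, yaml_settings, json_settings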

    def _settings_build_values(
        self,
        init_kwargs: dict[str, Any],
        _case_sensitive: bool | None = None,
        _env_prefix: str | None = None,
        _env_file: DotenvType | None = None,
        _env_file_encoding: str | None = None,
        _env_ignore_empty: bool | None = None,
        _env_nested_delimiter: str | None = None,
        _env_parse_none_str: str | None = None,
        _env_parse_enums: bool | None = None,
        _secrets_dir: str | Path | None = None,
        _json_file: PathType | None = None,
        _json_file_encoding: str | None = None,
        _yaml_file: PathType | None = None,
        _yaml_file_encoding: str | None = None,
    ) -> dict[str, Any]:
        # Determine settings config values
        case_sensitive = _case_sensitive if _case_sensitive is not None else self.model_config.get('case_sensitive')
        env_prefix = _env_prefix if _env_prefix is not None else self.model_config.get('env_prefix')
        env_file = _env_file if _env_file != ENV_FILE_SENTINEL else self.model_config.get('env_file')
        env_file_encoding = (
            _env_file_encoding if _env_file_encoding is not None else self.model_config.get('env_file_encoding')
        )
        env_ignore_empty = (
            _env_ignore_empty if _env_ignore_empty is not None else self.model_config.get('env_ignore_empty')
        )
        env_nested_delimiter = (
            _env_nested_delimiter
            if _env_nested_delimiter is not None
            else self.model_config.get('env_nested_delimiter')
        )
        env_parse_none_str = (
            _env_parse_none_str if _env_parse_none_str is not None else self.model_config.get('env_parse_none_str')
        )
        env_parse_enums = _env_parse_enums if _env_parse_enums is not None else self.model_config.get('env_parse_enums')
        secrets_dir = _secrets_dir if _secrets_dir is not None else self.model_config.get('secrets_dir')

        # Configure built-in sources
        init_settings = InitSettingsSource(self.__class__, init_kwargs=init_kwargs)
        env_settings = EnvSettingsSource(
            self.__class__,
            case_sensitive=case_sensitive,
            env_prefix=env_prefix,
            env_nested_delimiter=env_nested_delimiter,
            env_ignore_empty=env_ignore_empty,
            # env_parse_none_str=env_parse_none_str,
            # env_parse_enums=env_parse_enums,
        )
        dotenv_settings = DotEnvSettingsSource(
            self.__class__,
            env_file=env_file,
            env_file_encoding=env_file_encoding,
            case_sensitive=case_sensitive,
            env_prefix=env_prefix,
            env_nested_delimiter=env_nested_delimiter,
            env_ignore_empty=env_ignore_empty,
            # env_parse_none_str=env_parse_none_str,
            # env_parse_enums=env_parse_enums,
        )
        json_settings = JsonConfigSettingsSource(
            self.__class__,
            json_file=_json_file,
            json_file_encoding=_json_file_encoding,
            # case_sensitive=case_sensitive,
            # env_prefix=env_prefix,
        )
        yaml_settings = YamlConfigSettingsSource(
            self.__class__,
            yaml_file=_yaml_file,
            yaml_file_encoding=_yaml_file_encoding,
            # case_sensitive=case_sensitive,
            # env_prefix=env_prefix,
        )

        file_secret_settings = SecretsSettingsSource(
            self.__class__, secrets_dir=secrets_dir, case_sensitive=case_sensitive, env_prefix=env_prefix
        )
        # Provide a hook to set built-in sources priority and add / remove sources
        sources = self.settings_customise_sources(
            self.__class__,
            init_settings=init_settings,
            env_settings=env_settings,
            dotenv_settings=dotenv_settings,
            yaml_settings=yaml_settings,
            json_settings=json_settings,
            file_secret_settings=file_secret_settings,
        )
        if sources:
            return deep_update(*reversed([source() for source in sources]))
        else:
            # no one should mean to do this, but I think returning an empty dict is marginally preferable
            # to an informative error and much better than a confusing error
            return {}

    model_config: ClassVar[SettingsConfigDict] = SettingsConfigDict(
        extra='forbid',
        arbitrary_types_allowed=True,
        validate_default=True,
        case_sensitive=False,
        env_prefix='',
        env_file=None,
        env_file_encoding=None,
        env_ignore_empty=False,
        env_nested_delimiter=None,
        env_parse_none_str=None,
        env_parse_enums=None,
        json_file=None,
        json_file_encoding=None,
        yaml_file=None,
        yaml_file_encoding=None,
        toml_file=None,
        secrets_dir=None,
        protected_namespaces=('model_', 'settings_'),
    )

TmLev commented 2 months ago

@A-Telfer thank you for this workaround! Works well with JSON configs.

vlcinsky commented 1 month ago

Is adding _yaml_file argument really a breaking change?

The issue could be fixed by adding _yaml_file (and possibly _yaml_file_encoding) into BaseSettings.__init__ arguments.

@hramezani you mentioned in your comment (https://github.com/pydantic/pydantic-settings/issues/259#issuecomment-2015085791) that this would be a breaking change.

Currently, passing these parameters when instantiating any BaseSettings subclass breaks the code, so there should be no production code relying on them.

Adding one or two new arguments with default values should not break any existing code, and thus could be included in v2 updates.

I am aware some other classes might need modifications, but I think these changes could be done in a backward-compatible way too.

Did I miss something?
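The compatibility argument can be illustrated in plain Python. This is only a sketch: `SettingsBefore` and `SettingsAfter` are hypothetical stand-ins for a constructor signature before and after adding the keyword, not the real `BaseSettings`, and they skip validation entirely.

```python
from typing import Any, Dict, Optional


class SettingsBefore:
    """Stand-in for a constructor that only knows `_env_file`."""

    def __init__(self, _env_file: Optional[str] = None, **values: Any) -> None:
        self.env_file = _env_file
        self.values: Dict[str, Any] = values


class SettingsAfter:
    """Same constructor with one extra optional keyword appended."""

    def __init__(
        self,
        _env_file: Optional[str] = None,
        _yaml_file: Optional[str] = None,  # new keyword with a default
        **values: Any,
    ) -> None:
        self.env_file = _env_file
        self.yaml_file = _yaml_file
        self.values: Dict[str, Any] = values


# Calls written against the old signature behave the same with the new one.
old = SettingsBefore(_env_file=".env", foobar="x")
new = SettingsAfter(_env_file=".env", foobar="x")
same_result = (old.env_file, old.values) == (new.env_file, new.values)

# The only difference: `_yaml_file` is no longer swept into **values,
# where the real library would currently reject it.
swept_before = "_yaml_file" in SettingsBefore(_yaml_file="c.yaml").values
captured_after = SettingsAfter(_yaml_file="c.yaml").yaml_file
```

Under these assumptions, the change would only affect callers who already pass `_yaml_file`, which is the case argued above to be failing today.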

alex-dr commented 1 month ago

I figured out a simpler workaround.

import argparse
from pathlib import Path

from pydantic_settings import BaseSettings, CliSettingsSource, SettingsConfigDict

parser = argparse.ArgumentParser()
parser.add_argument("--config_dir")

def get_config_dir(default=...):
    try:
        parsed = parser.parse_known_args()
    except SystemExit:  # --help was passed
        parsed = None
    if parsed is not None and parsed[0].config_dir is not None:
        return parsed[0].config_dir
    return default

CONFIG_DIR = Path(get_config_dir())

class Settings(BaseSettings):
    ...
    model_config = SettingsConfigDict(yaml_file=CONFIG_DIR / "myconfig.yaml", ...)
    ...

config = Settings(_cli_settings_source=CliSettingsSource(Settings, root_parser=parser)(args=True), cli_parse_args=True)

You can use these CLI settings from any module where config is imported.
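The pre-parsing step of this workaround can be isolated using only the standard library. The sketch below is a variation, not the code above: it assumes `add_help=False` so that `--help` is simply ignored instead of raising `SystemExit`, and it takes an explicit `argv` list (a hypothetical parameter) so it can be exercised without touching `sys.argv`.

```python
import argparse
from pathlib import Path
from typing import List, Optional


def get_config_dir(argv: Optional[List[str]] = None, default: Path = Path(".")) -> Path:
    """Extract --config_dir and ignore every other argument."""
    # add_help=False: leave --help for the full parser that runs later.
    pre_parser = argparse.ArgumentParser(add_help=False)
    pre_parser.add_argument("--config_dir")
    # parse_known_args returns (namespace, leftover_args) instead of
    # failing on options it does not recognise.
    known, _leftover = pre_parser.parse_known_args(argv)
    if known.config_dir is not None:
        return Path(known.config_dir)
    return default


config_dir = get_config_dir(["--config_dir", "/etc/myapp", "--verbose"])
fallback = get_config_dir(["--help"])
```

With `argv=None`, `parse_known_args` falls back to `sys.argv[1:]`, which matches how the workaround is used at import time.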

TmLev commented 1 month ago

Your approach uses global variables (CONFIG_DIR = Path(get_config_dir())), which is closer to "initialisation-time" than "runtime". It won't work for

def main():
    config_path = os.environ["CONFIG_PATH"]
    config = Settings(_source_path=config_path)

since class Settings has to be defined without referencing config_path.
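The distinction between "initialisation time" and "runtime" can be shown without pydantic at all. In this sketch, the two plain classes and the `get_config_path` helper are hypothetical stand-ins: one resolves the path in the class body, the other in `__init__`.

```python
import os


def get_config_path() -> str:
    # Hypothetical helper: resolve the path from an environment variable.
    return os.environ.get("CONFIG_PATH", "config.yaml")


os.environ["CONFIG_PATH"] = "first.yaml"


class DefinitionTime:
    # Evaluated once, when the class body executes (typically at import).
    yaml_file = get_config_path()


class CallTime:
    def __init__(self) -> None:
        # Evaluated on every instantiation.
        self.yaml_file = get_config_path()


# Change the variable after both classes have been defined.
os.environ["CONFIG_PATH"] = "second.yaml"

print(DefinitionTime.yaml_file)  # first.yaml
print(CallTime().yaml_file)      # second.yaml
```

A path placed in `model_config` behaves like `DefinitionTime`: whatever value was available when the class was created is the one that sticks.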

alex-dr commented 1 month ago

True. I suppose you can just move the global to the class definition scope:

    model_config = SettingsConfigDict(
        yaml_file=get_config_dir() / "config.yaml",
        ...
    )

And move the cli settings into the model config as well

Then you can do

from project.config import Settings
cfg = Settings()

And you should have the --config-dir option.

I’ve also added env var loading to get_config_dir in my actual implementation.

It’s still not technically “runtime”, but I think it should work for most use cases…

TmLev commented 1 month ago

It's still "global" ("initialisation time"): yaml_file=get_config_dir().

alex-dr commented 1 month ago

@TmLev Yes, but it still answers the OP’s request. A cli flag is added to specify an alternative path for loading any config files.

It improves on other “global” approaches because it adds the flag into the CLI, documents it in the --help output, and builds it into the class definition itself so that it’s portable.

It’s not ideal if you want to explicitly set the file at runtime through, e.g., hard-coded strings, as other commenters mentioned. Adding _yaml_file to init, while a nice improvement, does not itself provide a mechanism for configuring this through the CLI as requested.

EDIT: pasting the updated workaround here:

# FILE: config.py
import argparse
import copy
from pathlib import Path

from pydantic_settings import BaseSettings, CliSettingsSource, SettingsConfigDict

parser = argparse.ArgumentParser()
parser.add_argument("--config_dir")

def get_config_dir(default: Path = ...) -> Path:
    """This can be extended arbitrarily to load from an env var or anywhere else."""
    try:
        parsed = parser.parse_known_args()
    except SystemExit:  # --help was passed
        parsed = None
    if parsed is not None and parsed[0].config_dir is not None:
        return Path(parsed[0].config_dir)
    return default

class Settings(BaseSettings):
    ...
    model_config = SettingsConfigDict(yaml_file=get_config_dir() / "myconfig.yaml", cli_parse_args=True, ...)
    ...
    @classmethod
    def settings_customise_sources(cls, settings_cls, ...):
        # deepcopy is required to create a unique argparse for each instance of Settings
        return (CliSettingsSource(settings_cls, root_parser=copy.deepcopy(parser))(args=True), ...)

# FILE: test.py
from .config import Settings

config = Settings()

cswrd commented 1 month ago

I am just migrating from dynaconf to pydantic-settings and stumbled upon this immediately. I am aiming for TOML-based settings. Maybe I have a total misconception here, but why are specific file paths needed in the scope of "the model" at all? What's the reason for not using an inversion-of-control concept to pass the files at run-time? Shouldn't the model be generic in terms of which files are loaded into/validated against it?

edit: Found another workaround that does the job for me:

from pathlib import Path
from typing import ClassVar, Optional, Tuple, Type

from pydantic_settings import BaseSettings, PydanticBaseSettingsSource, SettingsConfigDict, TomlConfigSettingsSource

class MySettings(BaseSettings):

    foo: str = "default_foo"

    _toml_file: ClassVar[Optional[Path]] = None  # A default value should also work.

    model_config = SettingsConfigDict(
        # toml_file=_toml_file  # You could use the above as default if set.
        env_prefix="FOO_",
        env_nested_delimiter="__",
        # etc. as usual.
    )

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: Type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> Tuple[PydanticBaseSettingsSource, ...]:
        sources = (init_settings, env_settings, dotenv_settings, file_secret_settings)

        if cls._toml_file:
            sources = sources + (TomlConfigSettingsSource(settings_cls, toml_file=cls._toml_file),)

        return sources

if __name__ == "__main__":
    MySettings._toml_file = Path("C:\\foo\\bar\\settings.toml")
    settings = MySettings()

alex-dr commented 1 month ago

That's not too bad, but you are literally setting the file path in the scope of the model definition - just doing so dynamically.

The proposed PR just adds the paths to the constructor, which is sort of the non-workaround version of your code.

The CLI workaround I mentioned lets you use the full CLI functionality of pydantic-settings, if that matters to you, while also supporting the --config-dir flag, which is fairly common for CLI tools to have. (It also makes it harder to turn off the CLI functionality when you don't want it.)

mpkocher commented 2 weeks ago

Here are a few possible ways to work around this friction point.

If you're only using one source, then perhaps avoid using pydantic-settings and add a utility function or class method to your model. This also avoids having an allowed "empty" constructor generated by pydantic-settings.

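from pathlib import Path

import yaml
from pydantic import BaseModel
from typing_extensions import Self

class Settings(BaseModel):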
    alpha: int
    beta: int

    @classmethod
    def from_yaml(cls, path: Path) -> Self:
        with open(path, 'r') as yaml_file:
            dx = yaml.safe_load(yaml_file) or {}
        return cls(**dx)

Second, use a similar idea, but reuse the Source loader(s) from pydantic-settings. It is possible to combine different sources in this model (e.g., merging dicts).

from pydantic_settings import BaseSettings, YamlConfigSettingsSource

class Settings2(BaseSettings):
    alpha: int
    beta: int

    @classmethod
    def from_yaml(cls, path: Path) -> Self:
        return cls(**YamlConfigSettingsSource(cls, path)())

Lastly, use a closure to return the class with the correct configuration file path.

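from typing import Tuple, Type

from pydantic_settings import PydanticBaseSettingsSource, SettingsConfigDict

def to_settings3(yaml_file: Path):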
    class Settings3(BaseSettings):
        model_config = SettingsConfigDict(yaml_file=yaml_file)
        alpha: int
        beta: int

        @classmethod
        def settings_customise_sources(
            cls,
            settings_cls: Type[BaseSettings],
            init_settings: PydanticBaseSettingsSource,
            env_settings: PydanticBaseSettingsSource,
            dotenv_settings: PydanticBaseSettingsSource,
            file_secret_settings: PydanticBaseSettingsSource,
        ) -> Tuple[PydanticBaseSettingsSource, ...]:
            return (YamlConfigSettingsSource(settings_cls),)

    return Settings3

Complete working examples:

https://gist.github.com/mpkocher/eb11b6807ef5e119b3e1ef5d7b629529