pydantic / pydantic-settings

Settings management using pydantic
https://docs.pydantic.dev/latest/usage/pydantic_settings/
MIT License

Support dash-separated CLI arguments and underscore-separated ENV variables #406

Closed andlogreg closed 1 month ago

andlogreg commented 1 month ago

When using multi-word named settings, I would like to:

  1. for CLI, use dashes for word separation, as it is the most common approach I've seen in CLI programs
  2. for ENV variables, use underscores for word separation as it is the standard (and dashes are not supported anyway)

Regarding 2., this comes for "free" if you name the fields of your Python models in snake case, as is typically done.
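
For reference, here is a minimal standard-library sketch of the behavior being asked for: a dashed flag on the CLI and an underscored variable in the environment, both feeding the same snake_case attribute. It uses argparse rather than pydantic-settings, and the helper name parse_settings and its env parameter are hypothetical, introduced only for illustration.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def parse_settings(
    argv: Sequence[str], env: Optional[Mapping[str, str]] = None
) -> argparse.Namespace:
    """Read my_int from --my-int on the CLI, falling back to MY_INT in env."""
    # Hypothetical helper: env is injectable so the lookup is easy to test.
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    # argparse maps the dashed flag --my-int to the attribute my_int.
    parser.add_argument("--my-int", type=int, default=int(env.get("MY_INT", "0")))
    return parser.parse_args(list(argv))


print(parse_settings(["--my-int=45"], env={}))
print(parse_settings([], env={"MY_INT": "7"}))
```

In both calls the value ends up on the my_int attribute, which is the naming split described in 1. and 2.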

To achieve 1., I tried something like this (see the alias generator), which works:

# test_config.py
from typing import List

from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict

class MySettings(BaseSettings):

    my_list: List[str] = Field(
        default_factory=list, description="A list argument"
    )

    my_int: int = Field(
        0,
        description="An int argument",
    )

    model_config = SettingsConfigDict(
        cli_parse_args=True,
        alias_generator=lambda x: x.replace("_","-")
    )

settings = MySettings()

print(settings)

This allows me to run python test_config.py --my-list=a,b,c --my-int=45.

But this breaks the ENV variable functionality, since the variable names are then also expected to contain -, which is not supported afaik. I understand this happens because aliases are also used to define the expected ENV variable names.

Is there currently a way to achieve 1. and 2. simultaneously?

Maybe using different aliases for CLI arguments and ENV variables? Or using a different approach (ie, not aliases) to use - as the word separator in CLI?

hramezani commented 1 month ago

Thanks @andlogreg for this issue.

I think you can use different aliases for ENV and CLI. With AliasChoices you can define multiple aliases for one field: one for ENV and one for CLI.

Take a look at https://docs.pydantic.dev/latest/concepts/pydantic_settings/#literals-and-enums

andlogreg commented 1 month ago

Awesome! Indeed this seems to do the trick:

(...)
class MySettings(BaseSettings):

    my_list: List[str] = Field(
        default_factory=list, description="A list argument",
        validation_alias=AliasChoices("my_list", "my-list"),
    )

    my_int: int = Field(
        0,
        description="An int argument",
        validation_alias=AliasChoices("my_int", "my-int"),
    )

    model_config = SettingsConfigDict(
        cli_parse_args=True,
    )

Which works:

# MY_INT=45 MY_LIST='["a","b","c"]' python test_config.py
my_list=['a', 'b', 'c'] my_int=45

# python test_config.py --my-list=a,b,c --my-int=45
my_list=['a', 'b', 'c'] my_int=45

I guess the cherry on top would be for alias_generator to support returning AliasChoices which currently is not supported:

(...)

File "/usr/local/lib/python3.10/site-packages/pydantic/_internal/_model_construction.py", line 224, in __new__
    complete_model_class(
  File "/usr/local/lib/python3.10/site-packages/pydantic/_internal/_model_construction.py", line 573, in complete_model_class
    schema = cls.__get_pydantic_core_schema__(cls, handler)
  File "/usr/local/lib/python3.10/site-packages/pydantic/main.py", line 668, in __get_pydantic_core_schema__
    return handler(source)
  File "/usr/local/lib/python3.10/site-packages/pydantic/_internal/_schema_generation_shared.py", line 83, in __call__
    schema = self._handler(source_type)
  File "/usr/local/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py", line 655, in generate_schema
    schema = self._generate_schema_inner(obj)
  File "/usr/local/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py", line 924, in _generate_schema_inner
    return self._model_schema(obj)
  File "/usr/local/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py", line 739, in _model_schema
    {k: self._generate_md_field_schema(k, v, decorators) for k, v in fields.items()},
  File "/usr/local/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py", line 739, in <dictcomp>
    {k: self._generate_md_field_schema(k, v, decorators) for k, v in fields.items()},
  File "/usr/local/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py", line 1115, in _generate_md_field_schema
    common_field = self._common_field_schema(name, field_info, decorators)
  File "/usr/local/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py", line 1353, in _common_field_schema
    self._apply_alias_generator_to_field_info(alias_generator, field_info, name)
  File "/usr/local/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py", line 1174, in _apply_alias_generator_to_field_info
    raise TypeError(f'alias_generator {alias_generator} must return str, not {alias.__class__}')
TypeError: alias_generator <function MySettings.<lambda> at 0x7ffffe51f370> must return str, not <class 'pydantic.aliases.AliasChoices'>

to avoid repetition for multiple settings. Do you think this is a feasible feature @hramezani ?


As a side note: an alternative approach could be for the populate_by_name option to also be taken into account for CLI and ENV naming, which doesn't currently seem to be the case.

hramezani commented 1 month ago

> I guess the cherry on top would be for alias_generator to support returning AliasChoices which currently is not supported:

This has to be changed in pydantic, not here.

> As a side note: an alternative approach could be for the populate_by_name option to also be taken into account for CLI and ENV naming, which doesn't currently seem to be the case.

Yes, populate_by_name does not work in pydantic-settings.

andlogreg commented 1 month ago

After opening the issue in pydantic and starting to work on a PR, I learned that this use case is in fact already covered by the AliasGenerator class with a validation alias.

Here's a full example:

from typing import List

from pydantic import AliasChoices, AliasGenerator, BaseModel, ConfigDict, Field
from pydantic_settings import BaseSettings, SettingsConfigDict

class SubModel(BaseModel):
    sub_list: List[str] = Field(
        default_factory=list, description="A list argument",
    )
    sub_int: int = Field(
        0,
        description="An int argument",
    )

    model_config = ConfigDict(
        alias_generator=AliasGenerator(validation_alias=lambda x: AliasChoices(x, x.replace('_','-')))
    )

class MySettings(BaseSettings):
    sub_model: SubModel = Field(
        SubModel(),
    )

    my_list: List[str] = Field(
        default_factory=list, description="A list argument",
    )

    my_int: int = Field(
        0,
        description="An int argument",
    )

    model_config = SettingsConfigDict(
        cli_parse_args=True,
        env_nested_delimiter="__",
        alias_generator=AliasGenerator(validation_alias=lambda x: AliasChoices(x, x.replace('_','-')))
    )

settings = MySettings()

print(settings)

Using CLI arguments with dashes instead of underscores:

# python test_config.py --my-list=a,b,c --my-int=12 --sub-model.sub-list=d,e,f --sub-model.sub-int=34
sub_model=SubModel(sub_list=['d', 'e', 'f'], sub_int=34) my_list=['a', 'b', 'c'] my_int=12

Using ENV variables:

# MY_INT=56 MY_LIST='["g","h","i"]' SUB_MODEL__SUB_INT=78 SUB_MODEL__SUB_LIST='["j","k"]' python test_config.py
sub_model=SubModel(sub_list=['j', 'k'], sub_int=78) my_list=['g', 'h', 'i'] my_int=56
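
The alias generator itself can also be checked without the CLI or the environment. The following is a minimal sketch on a plain pydantic model, assuming a pydantic version that provides AliasGenerator. The names snake_or_dashed and Example are hypothetical and used only for illustration.

```python
from typing import List

from pydantic import AliasChoices, AliasGenerator, BaseModel, ConfigDict, Field


def snake_or_dashed(name: str) -> AliasChoices:
    """Accept both the snake_case field name and its dashed variant."""
    return AliasChoices(name, name.replace("_", "-"))


class Example(BaseModel):
    # Hypothetical model: every field gets both spellings as validation aliases.
    model_config = ConfigDict(
        alias_generator=AliasGenerator(validation_alias=snake_or_dashed)
    )

    my_list: List[str] = Field(default_factory=list)
    my_int: int = 0


# Both key styles validate to the same model instance.
dashed = Example.model_validate({"my-list": ["a", "b"], "my-int": 45})
snake = Example.model_validate({"my_list": ["a", "b"], "my_int": 45})
print(dashed == snake)
```

Because only the validation alias is generated, the field names themselves (and so the attribute names on the model) stay in snake case.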

So I think this issue can be closed. Thank you for the support @hramezani !