pydantic / pydantic-settings

Settings management using pydantic
https://docs.pydantic.dev/latest/usage/pydantic_settings/
MIT License
640 stars 66 forks source link

is there a way to mark fields as mutually_exclusive_group? #467

Closed braindevices closed 3 days ago

braindevices commented 1 week ago

the argparser allow add_mutually_exclusive_group, is there similar feature in pydantic settings?

hramezani commented 1 week ago

@kschwab Could you please take a look here?

kschwab commented 1 week ago

No, currently CLiSettingsSource does not provide a way to mark mutually exclusive groups. I would suggest using a validator instead, e.g.:

from typing import Optional
from typing_extensions import Self
from pydantic import BaseModel, model_validator

class MyModel(BaseModel):
    a: Optional[str] = None
    b: Optional[str] = None

    @model_validator(mode='after')
    def check_a_b_exclusion(self) -> Self:
        if self.a is not None and self.b is not None:
            raise ValueError("'a' and 'b' are mutually exclusive.")
        return self

MyModel()
MyModel(a='hi')
MyModel(b='hi')
MyModel(a='hi', b='hi') # raises validation error

Perhaps we can find a way that simplifies the above for convenience, i.e. provide a pre-canned validator for mutual exclusion.

braindevices commented 1 week ago

thanks, currently I am using validator, but it does not provide the relationship in help message, I basically have to manually add them in the field description. It would be nice if there is kind of annotation we can use to mark the group, so it can automatically validate and also shows in the help message.

braindevices commented 1 week ago

Most of the mutually exclusive flags can be easily replaced by choice options, for example (-a | -b | -c) can just be replace by --choice_abc {a,b,c} and won't cause any inconvenience.

For options, since we can have subcommand as union I get inspired by using Union to handle the mutually exclusive group more naturally.

The simple type like, int, str, bool, etc, automatically working already, then we can basically determine what to do based on the type. For example, if we want to identify an unique item whose datetime, id and name are all unique, then the options are (--id ID | --datetime DATETIME | --name NAME), specify one of them is fine, specify more than one will cause confusion or conflict. Then this can simply be implemented via id_datetime_or_name: Union[int, AwareDatetime, str] = Field(union_mode='left_to_right'), and won't be much difference for the users. say --id_datetime_or_name=2032-04-23T10:20:30.400+02:30 comparing to --datetime=2032-04-23T10:20:30.400+02:30

However, more tricky thing is the mutually exclusive options that has same type, this happens when we can describe something in different ways, and any path results in the same final result. For example, we want to describe the size of a circle then we allow radius, diameter, perimeter, it only make sense to specify one of them. (--radius RADIUS | --diameter DIAMETER | --perimeter PERIMETER)

So far in pydantic we have to use 2 options to do this: [--circle_measure_type {radius,diameter,perimeter}] [--circle_measure float], which apparently worse than the mutually exclusive group in some cases, although it is still acceptable.

A more complicated case is the mutually exclusive item is actually a group of options, which actually is not supported in argparse:

Changed in version 3.11: Calling add_argument_group() or add_mutually_exclusive_group() on a mutually exclusive group is deprecated. These features were never supported and do not always work correctly. The functions exist on the API by accident through inheritance and will be removed in the future.

If we use union in this case, it does not give desired help output and it also does not give out error when we specify the conflict options.

for example:

union2: Union[Uri, Socket] = Field(
        # default=Uri(host="localhost", address_type="uri"),
        union_mode='left_to_right'
    )

It gives:

[--union2 JSON] [--union2.port int] [--union2.host str] [--union2.path str]
union2 options:
  --union2 JSON         set union2 from JSON string
  --union2.port int     (default: 80)
  --union2.host str     (required)
  --union2.path str     (required)

We would expect something like

([--union2 JSON] | ( [--union2.port int] [--union2.host str]) | [--union2.path str])

and if we specify the conflict options like "--union2.host=localhost --union2.path=test it should fail. So far I can only achieve this by model_config = ConfigDict(extra="forbid") in each of the model in the Union, extra="forbid" for Settings is not enough so far.

example help output of argparse:

usage: example.py [-h] (--radius RADIUS | --diameter DIAMETER | --perimeter PERIMETER) (--id ID | --datetime DATETIME | --name NAME) (-a | -b | -c)

Calculate circle properties.

options:
  -h, --help            show this help message and exit

Circle Measurements:
  Choose one of the following measurements:

  --radius RADIUS       Radius of the circle
  --diameter DIAMETER   Diameter of the circle
  --perimeter PERIMETER
                        Perimeter (circumference) of the circle

item identifier:
  id/datetime/name:

  --id ID
  --datetime DATETIME
  --name NAME

flags:
  Choose one of the following flags:

  -a
  -b
  -c

example help output of pydantic setting:

usage: example.py [-h] [--circle_measure_type {radius,diameter,perimeter}] [--circle_measure float] [--choice_abc {a,b,c}] [--id_datetime_or_name {int,AwareDatetime,str}] [--union2 JSON]
                  [--union2.port int] [--union2.host str] [--union2.path str]

options:
  -h, --help            show this help message and exit
  --circle_measure_type {radius,diameter,perimeter}
                        (default: radius)
  --circle_measure float
                        (default: 0.0)
  --choice_abc {a,b,c}  (default: a)
  --id_datetime_or_name {int,AwareDatetime,str}
                        (default: 80)

union2 options:
  --union2 JSON         set union2 from JSON string
  --union2.port int     (default: 80)
  --union2.host str     (required)
  --union2.path str     (required)

example code:

import argparse
from datetime import datetime
from typing import Any, Callable, List, Literal, Tuple, Union
from pydantic import AliasGenerator, BaseModel, Field, AwareDatetime, ConfigDict
from pydantic_settings import BaseSettings, SettingsConfigDict
import sys

class Uri(BaseModel):
    model_config = ConfigDict(extra="forbid")
    # address_type: Literal['uri']
    port: int = 80
    host: str

class Socket(BaseModel):
    model_config = ConfigDict(extra="forbid")
    # address_type: Literal['socket']
    path: str

class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        cli_parse_args=True,
        nested_model_default_partial_update=True,
        extra="forbid"
        # alias_generator=AliasGenerator(lambda s: s.replace('_', '-'))
    )
    circle_measure_type: Literal["radius", "diameter", "perimeter"] = "radius"
    circle_measure: float = 0.0
    choice_abc:  Literal['a', 'b', 'c'] = "a"
    id_datetime_or_name: Union[int, AwareDatetime, str] = Field(union_mode='left_to_right', default=80)
    union2: Union[Uri, Socket] = Field(
        # default=Uri(host="localhost", address_type="uri"),
        union_mode='left_to_right'
    )

def parse_args():
    parser = argparse.ArgumentParser(description="Calculate circle properties.")

    # Create a mutually exclusive group for --radius, --diameter, and --perimeter
    group = parser.add_argument_group("Circle Measurements", "Choose one of the following measurements:")
    me_group = group.add_mutually_exclusive_group(required=True)
    me_group.add_argument("--radius", type=float, help="Radius of the circle")
    me_group.add_argument("--diameter", type=float, help="Diameter of the circle")
    me_group.add_argument("--perimeter", type=float, help="Perimeter (circumference) of the circle")

    def parse_datetime(date_string):
        return datetime.fromisoformat(date_string)

    group = parser.add_argument_group("item identifier", "id/datetime/name:")
    me_group = group.add_mutually_exclusive_group(required=True)
    me_group.add_argument("--id", type=int)
    me_group.add_argument("--datetime", type=parse_datetime)
    me_group.add_argument("--name", type=str)

    group = parser.add_argument_group("flags", "Choose one of the following flags:")
    me_group = group.add_mutually_exclusive_group(required=True)
    me_group.add_argument("-a", action="store_true")
    me_group.add_argument("-b", action="store_true")
    me_group.add_argument("-c", action="store_true")

    return parser.parse_args()

def help(func: Callable[[], Any]):
    sys.argv = [
        'example.py',
        "-h"
    ]
    try:
        func()
    except SystemExit:
        print("sys.exit() was called and handled.")
    print("After exit")

def test_cli():
    sys.argv = [
        'setting1',
        "--circle_measure_type=diameter",
        "--circle_measure=100.0",
        "--choice_abc=b",
        "--id_datetime_or_name=2032-04-23T10:20:30.400+02:30",
        "--union2.host", "localhost",
        # "--union2.path", "/tmp/test"
    ]
    a = Settings()
    print("in test cli: Settings()=", a)

    sys.argv = [
        'setting2',
        "--id_datetime_or_name=item-name",
        "--union2.path", "/tmp/test"
    ]
    a = Settings()
    print("in test cli: Settings()=", a)

    sys.argv = [
        'setting3-should-fail',
        "--union2.host=localhost",
        "--union2.path=/tmp/test"
    ]
    a = Settings()
    print("in test cli: Settings()=", a)

def main():
    help(parse_args)
    help(Settings)
    test_cli()

if __name__ == "__main__":
    main()
braindevices commented 1 week ago

the expected output I wanted for better Union field support:

usage: example.py [-h] [--circle_measure_type {radius,diameter,perimeter}] [--circle_measure float] [--choice_abc {a,b,c}] [--id_datetime_or_name {int,AwareDatetime,str}] ([--union2 JSON] | ([--union2.port int] [--union2.host str] ) |  [--union2.path str])

options:
  -h, --help            show this help message and exit
  --circle_measure_type {radius,diameter,perimeter}
                        (default: radius)
  --circle_measure float
                        (default: 0.0)
  --choice_abc {a,b,c}  (default: a)
  --id_datetime_or_name {int,AwareDatetime,str}
                        (default: 80)

union2 options:
  Choose one of the following group:
    group JSON:
      --union2 JSON         set union2 from JSON string
    group Uri:
     --union2.port int     (default: 80)
     --union2.host str     (required)
   group Socket:
     --union2.path str     (required)
braindevices commented 1 week ago

if we allow some kind of marking we can do:

class Settings(BaseSettings):
    radius: CliMutuallyExclusiveGroup[float, "Circle"]
    diameter: CliMutuallyExclusiveGroup[float, "Circle"]
    perimeter: CliMutuallyExclusiveGroup[float, "Circle"]

Then the help will be:

usage: example.py [-h] (--radius float | --diameter float | --perimeter float)

Circle (required, mutually exclusive):
  --radius float
  --diameter float
  --perimeter float
kschwab commented 1 week ago

@braindevices I agree the above outcome would be ideal. I think it's a little clunkier in practice, where we would need to use Annotated and Optional, e.g.:

class Settings(BaseSettings):
    radius: Annotated[Optional[float], CliMutuallyExclusiveGroup("Circle")] = None
    diameter: Annotated[Optional[float], CliMutuallyExclusiveGroup("Circle")] = None
    perimeter: Annotated[Optional[float], CliMutuallyExclusiveGroup("Circle")] = None

However, I think providing CliMutuallyExclusiveGroup as an inheritable class would be a good fit. Then we could simply throw mutually exclusive items together in a class and everything would fall out rather naturally, e.g.:

class Circle(CliMutuallyExclusiveGroup):
    radius: Optional[float] = None
    diameter: Optional[float] = None
    perimeter: Optional[float] = None

class Settings(BaseSettings):
    circle: Circle

The above is nice from an implementation point of view as it keeps things simple, but also allows for more flexibility. e.g., circle could be a mutually exclusive group that is required or optional, better help strings, etc. I'm inclined to head in that direction, what are your thoughts?

braindevices commented 1 week ago

Yes I actually prefer your way. Thanks!

hramezani commented 1 week ago

@kschwab @braindevices Thanks both this issue.

@kschwab do we need any action here? should we keep the issue open?

kschwab commented 1 week ago

@hramezani yes, we can keep it open. I will open PR in next few days to resolve.