Open tjgalvin opened 1 week ago
Maybe something like this
from dataclasses import dataclass


class BaseOptions:
    def test_method(self, *args, **kwargs):
        print(f"{self=} {args=}")

    def with_options(self, *args, **kwargs):
        assert len(args) == 0, "Positional args are not allowed"
        assert all(k in self.__dict__ for k in kwargs)
        # Merge into a copy so the frozen instance is not mutated via __dict__
        new_options = {**self.__dict__, **kwargs}
        return type(self)(**new_options)


@dataclass(frozen=True)
class Options(BaseOptions):
    a: int


a = Options(a=11)
a.test_method('a,', 'b', 2345)
b = a.with_options(a="something completely different")
b.test_method('a,', 'b', 2345)
a.a = "ERROR"  # this will error out
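As a possible stdlib-only variation on the idea above, `dataclasses.replace` can build the updated copy, so a shared `with_options` never touches `__dict__`. This is only a sketch: `WithOptionsMixin`, `ExampleOptions` and the field `b` are hypothetical names used for illustration and are not part of flint.

```python
from dataclasses import FrozenInstanceError, dataclass, fields, replace


class WithOptionsMixin:
    """Hypothetical mixin providing a shared with_options for frozen dataclasses."""

    def with_options(self, **kwargs):
        # Keyword-only by signature, so positional arguments raise a TypeError
        known = {f.name for f in fields(self)}
        unknown = set(kwargs) - known
        if unknown:
            raise TypeError(f"Unknown option(s): {sorted(unknown)}")
        # replace() constructs a new instance; the original is left untouched
        return replace(self, **kwargs)


@dataclass(frozen=True)
class ExampleOptions(WithOptionsMixin):
    a: int
    b: float = 1.23


original = ExampleOptions(a=11)
updated = original.with_options(a=12)
print(original, updated)
```

Because `replace` goes through `__init__`, any `__post_init__` checks on the dataclass would run again for the updated copy.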
Doing something similar with pydantic
from typing import Union

from pydantic import BaseModel


class OptionsModel(BaseModel):
    model_config = dict(frozen=True)

    def with_options(self, *args, **kwargs):
        assert len(args) == 0, "Positional args are not allowed"
        assert all(k in self.__dict__ for k in kwargs)
        copy_dict = self.__dict__.copy()
        copy_dict.update(kwargs)
        return self.__class__(**copy_dict)


class TestOptions(OptionsModel):
    a: Union[int, float]
    b: float = 1.23


aa = TestOptions(a=1.234)
print(aa)
bb = aa.with_options(a=4)
bb.a = 1  # error
I think I like this approach a little more than dataclasses. Although pydantic is not in the stdlib, it is already a dependency through prefect. The neater thing with this approach is that we can set frozen=True in the model_config of the base model class, which is carried forward to the models we subclass from it.
The additional validation and casting it offers based on the types is also neat.
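To make the inheritance of frozen=True and the type-based casting concrete, here is a small sketch assuming pydantic v2. `FrozenOptions`, `ImageOptions` and the `robust` field are hypothetical names chosen for illustration only.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class FrozenOptions(BaseModel):
    # Set once on the base class; subclasses inherit the frozen behaviour
    model_config = ConfigDict(frozen=True)


class ImageOptions(FrozenOptions):
    imsize: int = 6000
    robust: float = 0.5


# In pydantic's default (lax) mode a numeric string is cast to int
opts = ImageOptions(imsize="4096")
print(opts)

try:
    opts.imsize = 1  # the subclass is frozen as well
except ValidationError as err:
    print(f"Assignment rejected: {err.error_count()} error")

try:
    ImageOptions(imsize="not a number")  # cannot be cast to int
except ValidationError as err:
    print(f"Construction rejected: {err.error_count()} error")
```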
Next will look at how well either approach integrates with:
So continuing down the pydantic avenue, here is a neat-ish way to build upon an existing argument parser, add arguments drawn from the model, and recreate the model:
from argparse import ArgumentParser
from pathlib import Path

from pydantic import BaseModel, ConfigDict


class BaseOptions(BaseModel):
    model_config = ConfigDict(frozen=True, use_attribute_docstrings=True)


class WSCleanOptions(BaseOptions):
    ms: Path
    """This is the path to the measurement set"""
    imsize: int = 6000
    """The size of an image"""
    make_big: bool = False
    """Make the image larger"""


def add_pydantic_model_to_parser(parser: ArgumentParser, options_class) -> ArgumentParser:
    for name, field in options_class.model_fields.items():
        # Required fields become positional arguments, the rest become --flags
        field_name = name.replace('_', '-')
        field_name = f'--{field_name}' if not field.is_required() else field_name
        field_default = field.default
        action = 'store'
        if field.annotation is bool:
            # Booleans become switches that flip the model default
            action = 'store_false' if field.default else 'store_true'
        parser.add_argument(
            field_name, help=field.description, action=action, default=field_default
        )
    return parser


def create_options_from_parser(parser_namespace, options_to_init):
    args = vars(parser_namespace) if not isinstance(parser_namespace, dict) else parser_namespace
    opts_dict = {}
    # Only pick out the arguments that belong to this model
    for name, field in options_to_init.model_fields.items():
        opts_dict[name] = args[name]
    return options_to_init(**opts_dict)


if __name__ == "__main__":
    parser = ArgumentParser(description="Example CLI with a pydantic model")
    parser.add_argument("--something-else", default=123, type=float, help="Unrelated to options")
    parser = add_pydantic_model_to_parser(parser=parser, options_class=WSCleanOptions)
    args = parser.parse_args()
    print(args)
    b = create_options_from_parser(parser_namespace=args, options_to_init=WSCleanOptions)
    print(b)
Running it with something like python test_pydatnic.py example --make-big produces the following:
Namespace(something_else=123, ms='example', imsize=6000, make_big=True)
ms=PosixPath('example') imsize=6000 make_big=True
I am not really sure how hacky this is. The intended use case is to define other arguments, extract whatever options are required for some arbitrary model, and bam. If the model is updated, so is the CLI.
Throughout flint we have been using NamedTuples as a basis to create immutable structures that are used as interfaces between tasks and to provide some level of first-order validation. Throughout the code these are referred to as Options, e.g. WSCleanOptions, GainCalOptions, etc.

Though this has been very nice and has really helped to force some careful thinking when changing their state, there are some use cases where they are a little limiting and there might be nicer alternatives.

1 - Often the Options should be exposed to the CLI so that a user / tester can supply options for testing or bespoke operations. In the current form these options would have to be added manually through parser.add_argument. There are modules out there that can operate on dataclasses or pydantic models.

2 - Inability to add methods to the NamedTuple which we use to sub-class from. Since Options is immutable by design, a .with_options method is often attached to each of the Options classes that provides a way of updating specific attributes. So far I have not found a nice / consistent way of being able to attach additional methods like .with_options to the NamedTuple class so we don't have to keep repeating the same method.

I am hoping to consider using something from the standard library, so am looking towards dataclasses. These do have frozen and kw_only arguments, which allow the output class to be immutable and init'd via keyword arguments only. There are some dataclass to argparse modules as well that might make life easier.

Does anyone have thoughts on this?