openforcefield / openff-models

Helper classes for Pydantic compatibility in the OpenFF stack
MIT License
3 stars 2 forks source link

Why not use openff.units unit's in the BaseModel #6

Open richardjgowers opened 2 years ago

richardjgowers commented 2 years ago

There's probably a good technical reason, but is there a reason we can't hack either (this package's) BaseModel or openff-units to allow this to work:

In [12]: class Thing(BaseModel):
    ...:     temp: units.unit.kelvin
    ...:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File ~/miniconda3/envs/openfe/lib/python3.10/site-packages/pydantic/validators.py:751, in pydantic.validators.find_validators()

TypeError: issubclass() arg 1 must be a class

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
Cell In [12], line 1
----> 1 class Thing(BaseModel):
      2     temp: units.unit.kelvin

File ~/miniconda3/envs/openfe/lib/python3.10/site-packages/pydantic/main.py:198, in pydantic.main.ModelMetaclass.__new__()

File ~/miniconda3/envs/openfe/lib/python3.10/site-packages/pydantic/fields.py:506, in pydantic.fields.ModelField.infer()

File ~/miniconda3/envs/openfe/lib/python3.10/site-packages/pydantic/fields.py:436, in pydantic.fields.ModelField.__init__()

File ~/miniconda3/envs/openfe/lib/python3.10/site-packages/pydantic/fields.py:557, in pydantic.fields.ModelField.prepare()

File ~/miniconda3/envs/openfe/lib/python3.10/site-packages/pydantic/fields.py:831, in pydantic.fields.ModelField.populate_validators()

File ~/miniconda3/envs/openfe/lib/python3.10/site-packages/pydantic/validators.py:760, in find_validators()

RuntimeError: error checking inheritance of <Unit('kelvin')> (type: Unit)
mattwthompson commented 2 years ago

This doesn't provide a BaseModel and I hadn't thought of creating a subclass of pydantic.BaseModel that shares the same name - that's why DefaultModel was created (not tied to the name per se). Name clashes like this always feel like a landmine to me - I can be talked out of this, though

mikemhenry commented 2 years ago

Sorry the title of this PR is a bit confusing, the main point is could we make this work?

class Atom(DefaultModel):
    mass: unit.amu

instead of

class Atom(DefaultModel):
    mass: FloatQuantity["atomic_mass_constant"]
richardjgowers commented 2 years ago

Yeah sorry, confusing title but Mike has it. It'd be really nice if I didn't have to learn a new system (FloatQuantity['whatever']) to do models and could just write in terms of unit.whatever. If you're writing both DefaultModel and unit then surely there's a way....

mattwthompson commented 2 years ago

Okay, I see what you're asking. I'll have to think about if that's possible, AFAIR the typing infrastructure is still janky with non-primitives. It does work out of the box if you just need to specify that the quantity is a Quantity and don't need to enforce the specific unit or dimensionality:

In [1]: from openff.models.models import DefaultModel

In [2]: from openff.units import unit

In [3]: import json

In [4]: class Thing(DefaultModel):
   ...:     temperature: unit.Quantity
   ...:

In [5]: thing = Thing(temperature=unit.Quantity(400, unit.kelvin))

In [6]: thing.dict()
Out[6]: {'temperature': 400 <Unit('kelvin')>}

In [7]: thing.json()
Out[7]: '{"temperature": "{\\"val\\": 400, \\"unit\\": \\"kelvin\\"}"}'

In [8]: json.loads(thing.json())
Out[8]: {'temperature': '{"val": 400, "unit": "kelvin"}'}

In [9]: Thing.parse_raw(thing.json())
Out[9]: Thing(temperature=<Quantity(400, 'kelvin')>)

In [10]: Thing(temperature=unit.Quantity(400, unit.angstrom))
Out[10]: Thing(temperature=<Quantity(400, 'angstrom')>)

but that's not exactly the scope of your question.

lilyminium commented 2 years ago

I played around with this once by just throwing in a check that the unit was convertible. It probably winds up quite slow, though.

def validate_unit(value, field_unit, field_name):
    from openff.units import unit
    import numpy as np
    from pint import DimensionalityError

    if isinstance(value, str):
        value = unit.Quantity(value)
    elif isinstance(value, (list, tuple)):
        value = unit.Quantity.from_tuple(value)
    elif isinstance(value, (float, int, np.ndarray)):
        warnings.warn(
            f"{field_name} value not provided with units. "
            f"Setting to default units {field_unit}."
        )
        value = value * field_unit
    elif not isinstance(value, unit.Quantity):
        from openff.units.openmm import from_openmm
        value = from_openmm(value)

    try:
        value.m_as(field_unit)
    except DimensionalityError:
        raise TypeError(
            f"{field_name} units incompatible with {field_unit}"
        )
    return value

class UnitsContainerMixin(Model):
    units: ClassVar[Dict[str, "unit.Quantity"]] = {}

    @validator("*", pre=True)
    def _validate_unit_type(cls, v, field):

        if v is None:
            return v
        if field.name in cls.units:
            v = validate_unit(v, cls.units[field.name], field.name)

        return v

To avoid having a units dict you could maybe iterate through the field during the validation check, or set a class property during class initialisation with a metaclass maybe.

mattwthompson commented 6 months ago

I think there are two issues here, one above which is also captured in #2 and another implied by the title which won't work because Pint Quantitys are not viewed as valid types by type checkers