Deltares / imod-python

🐍🧰 Make massive MODFLOW models
https://deltares.github.io/imod-python/
MIT License
20 stars 3 forks source link

"User friendly" enums #1181

Open Huite opened 2 months ago

Huite commented 2 months ago

Basically a follow up of #416:

We've briefly discussed this before: Enums are a great way to enumerate options, but a lot of our users aren't familiar with them. The issue with enums is also that you need to import the relevant enums from the right namespace. A pragmatic solution is to dynamically force inputs to enums, thereby checking them as well.

This gives decent errors:

from enum import Enum

class Color(Enum):
    RED = 0
    GREEN = 1
    BLUE = 2

color = Color("YELLOW")

ValueError: 'YELLOW' is not a valid Color

Ideally, it wouldn't tell you that it's wrong, but what the right entries are. This is especially helpful with typos.

from enum import Enum
from typing import Union, Type, TypeVar

E = TypeVar('E', bound='FlexibleEnum')

def _show_options(options: Enum) -> str:
    return "\n * ".join(map(str, options.__members__))

class FlexibleEnum(Enum):
    @classmethod
    def from_value(cls: Type[E], value: Union[E, str]) -> E:
        if isinstance(value, cls):
            return value
        elif isinstance(value, str):
            try:
                return cls.__members__[value]
            except KeyError:
                pass

        raise ValueError(
            # Use __repr__() so strings are shown with quotes.
            f"{value.__repr__()} is not a valid {cls.__name__}. "
            f"Valid options are:\n * {_show_options(cls)}"
        )

class Color(FlexibleEnum):
    RED = 1
    GREEN = 2
    BLUE = 3

Color.from_value("YELLOW")

Color.from_value("YELLOW")
ValueError: 'YELLOW' is not a valid Color. Valid options are:
 * RED
 * GREEN
 * BLUE

Another advantage is that regular enums accept integer values.

E.g. one of the current enums:

class ALLOCATION_OPTION(Enum):
    stage_to_riv_bot = 0
    first_active_to_elevation = -1
    stage_to_riv_bot_drn_above = 1
    at_elevation = 2
    at_first_active = 9  # Not an iMOD 5.6 option

In general, I don't think we want user facing functions to support something like .allocate(option=0). A default Enum will support ALLOCATION_OPTION(0).

But with the FlexibleEnum.from_value(), it won't (which is good):

Color.from_value(1)

ValueError: 1 is not a valid Color. Valid options are:
 * RED
 * GREEN
 * BLUE

So my suggestion is to replace the Enums with these FlexibleEnums (or a better name), and preferable the same for all strings literals that we support. Then inside of the function:

def f(arg, option: str | ThisOptionEnum):
     option = ThisOptionEnum.from_value(option)
     ...

This ensures the option is validated and that a clear error message listing the available valid options is printed.

Manangka commented 2 months ago

I like the idea. Some of my thoughts

In the first code snippet you define the Color class as class Color(Enum). You assign a color as: color = Color("YELLOW") which throws an exception. However a valid value also throws an exception: color = Color("RED")

What i gather from the python documentation calling the class like that is called the function-call syntax for creating an enum. Its an alternative way to the class defintition you use. It isn't used to create a specific enum member https://docs.python.org/3/library/enum.html


Preferably the new Enum should be a drop in replacement of the original enum. It would be nice that when someone uses it like Color.Yellow it would give your error message.


I do like your from_value method as well. Its similar to C# Enum.Parse method to create an Enum from a string. I think i would it a bit differently in the code: option = option if isinstance(option, Color) else Color.Parse(option)

Huite commented 2 months ago

You're right: I got my first example mixed up, what I should/could have written is:

class Color(Enum):
    RED = "RED"
    GREEN = "GREEN"
    BLUE = "BLUE"

This does accept Color("RED"), but the duplication feels rather silly here.

This works:

class Color(Enum):
    RED = 0
    GREEN = 1
    BLUE = 2

Color(0)  # -> Color.RED

But is kind of anti-feature for our usage.

It would be nice that when someone uses it like Color.Yellow it would give your error message.

This is doable, although not that easy. It needs a metaclass.

Adding a parse method is a good idea.

Here's basic implementation:

from enum import Enum, EnumMeta
from typing import Any, Type, TypeVar, Union

E = TypeVar("E", bound="FlexibleEnum")

def _show_options(options: Type[E]) -> str:
    return "\n * ".join(map(str, options.__members__))

class AttributeErrorMeta(EnumMeta):
    def __getattr__(cls, name: str) -> Any:
        try:
            return cls.__members__[name]
        except KeyError:
            raise AttributeError(
                f"{name} is not a valid {cls.__name__}. "
                f"Valid options are:\n * {_show_options(cls)}"
            )

class FlexibleEnum(Enum, metaclass=AttributeErrorMeta):
    @classmethod
    def parse(cls: Type[E], value: str) -> E:
        try:
            return cls.__members__[value]
        except KeyError:
            raise ValueError(
                # Use __repr__() so strings are shown with quotes.
                f"{value.__repr__()} is not a valid {cls.__name__}. "
                f"Valid options are:\n * {_show_options(cls)}"
            )
Manangka commented 2 months ago

Nice!

I just tested this:

# Existing value
Color.RED
<Color.RED: 0>

# Non-existing value
Color.YELLOW
Traceback (most recent call last):
KeyError: 'YELLOW'

During handling of the above exception, another exception occurred:

AttributeError: YELLOW is not a valid Color. Valid options are:
 * RED
 * GREEN
 * BLUE

Which gives a clear error message.

I think it is fine that a metaclass is needed to get this output. Its a small addition to a class that wont change much. It is also something that can be well tested using unit tests

Huite commented 2 months ago

FYI, I've included an implementation in pandamesh here:

https://github.com/Deltares/pandamesh/blob/main/pandamesh/enum_base.py

I've maintained the from_value method there because it saves on a lot of duplication in the places where it's used.

Huite commented 2 months ago

I was also struggling with rendering the enums docs in an appropriate manner, I needed a custom template in the end:

https://github.com/Deltares/pandamesh/blob/main/docs/_templates/enums.rst

This is included in the index:

.. autosummary::
   :toctree: api/
   :template: enums.rst

    DelaunayAlgorithm