canonical / ops-scenario

State-transition testing SDK for Operator Framework Juju charms.
Apache License 2.0
10 stars 7 forks source link

feat!: use sets for the state components #134

Closed tonyandrewmeyer closed 2 months ago

tonyandrewmeyer commented 3 months ago

The primary change is that instead of lists for the various State components (containers, relations, etc), frozensets are used.

To provide access to the components, and to avoid people needing to write many rel = any(r for r in state_out.relations if r.id == id) type statements, there are several new methods on State:

get_resource and get_opened_port are not added, because the charm cannot add or change resources, and ports don't have a unique identifier (it's more likely that tests want to check whether a Port is contained in the set).

There are two straight renames that change from singular to plural, since the state holds 0-n:

Several state components gain a custom __hash__ method, so that they can be put into sets/frozensets. This also has the side-effect of removing the need to have the consistency checker check for duplicates, because a set/frozenset can't have a duplicate. It does mean that you can pass a set that has duplicates and not get a warning (Python just discards one item), but a linter should detect that issue, as with regular set use.

The PR also introduces a Resource class to hold resources, so that they are consistent with all of the other components of the state.

PietroPasotti commented 3 months ago

very nice! wondering if the get_x methods should (also) take instances as arguments

tonyandrewmeyer commented 3 months ago

very nice! wondering if the get_x methods should (also) take instances as arguments

Yeah, it seems wrong that get_container does (because it already did in 6) and the others do not. It does seem like consistency would be best, and there doesn't seem to be a reason to remove it from get_container, so adding to the others would probably be cleanest.

tonyandrewmeyer commented 3 months ago

One question I have: is it weird that users will be putting regular sets with {...} syntax in, but get frozensets out? Should we just use regular old sets everywhere, even though they're "more mutable"?

I did wonder about this. I strongly believe that frozensets are the most appropriate data type - but it's super inconvenient to use them for the input since there's no literal syntax.

It is a bit magic / inconsistent - but I think less so than if you were putting in an entirely different type. It's just "freezing" the content you put in, so I felt like that was ok.

I do actually like that you can provide any iterable (of hashable objects) and you'll get the expected state object - so you can do State(containers=(cont1, cont2)) or State(containers={cont1, cont2}) and you'll end up with the same .containers == frozenset({cont1, cont2}). This feels more Pythonic to me - duck typing rather than strict typing - even though it's not the typical dataclass result.

I think it's also beneficial to increase the immutability of State, since we're expecting charmers to treat it as completely immutable. If it's possible to state.containers.add(cont3) then people will, leading to unexpected behaviour.

benhoyt commented 3 months ago

Yep, I agree with that reasoning -- duck/loose typing accepted as input, strict/best type used as output.

tonyandrewmeyer commented 3 months ago

The 'get' signatures have ended up more mixed than they were previously.

Before:

Now:

@benhoyt @PietroPasotti would you still go with this? Having a bunch of "which set of arguments were passed and do they all agree with each other" code is a smell imo. I think it works ok for Container and Relation but less so for StoredState and Storage (and Secret is odd either way, because of the need to get by label).

PietroPasotti commented 3 months ago

Is it an idea to use singledispatch? That would clean up the signatures imho, at the cost of more lines of code.

tonyandrewmeyer commented 3 months ago

TIL about functools.simpledispatch, thanks! I guess it's basically sugar for a bunch of if isinstance(arg1, x) calls, and presumably makes the type signatures nicely too. I don't mind the extra verbosity, and it probably would indeed simplify the cases where you're passing the wrong combination of args (I assume apart from the static checking, it actually fails at runtime if you do that too). However, it doesn't feel particularly Pythonic (despite being in the stdlib). It also still leaves the signatures fairly complex.

@benhoyt and I talked this over and decided to start with the simpler key-only signatures (simplifying get_container). We can always expand on them later if people find that passing in the object really would be a significant improvement. For container, for example, it's only the difference between get_container(container) and get_container(container.name). More extensive remap type functionality might make it unnecessary, also.