Annoyance: Figuring out which module has my object

canonical / operator

Pure Python framework for writing Juju charms

Apache License 2.0

246 stars 120 forks source link

Annoyance: Figuring out which module has my object #731

Closed pengale closed 1 year ago

pengale commented 2 years ago

When writing charms, I am often able to remember the name of object that I want (EventBase, StoredState, etc.), but often forget which module it comes from. There's logic to the modules, of course -- CharmBase is in charm, the status objects live in model, and the things that I want when I'm maybe being a little bit hacky live in framework. I can make guesses, and those guesses are easy to check with the introspection tools in an IDE or a good old fashioned grep on the command line.

It's a little mental speed bump, however, of the sort that I think we want to eliminate.

Note: there's a warning about circular imports in ops/__init__.py, which means that we'd need to be thoughtful if we imported more objects into the top level module. It might be worth a spike, however. Lines like self.model.unit.status = MaintenanceStatus("Frobbling the bitzors") would be easier to write if I didn't first have to remember that statuses live in ops.model rather than ops.charm or ops.framework.

rwcarlsen commented 2 years ago

I approve of your annoyance here.

PietroPasotti commented 2 years ago

PyCharm is pretty clever in auto-importing stuff from the right place, so I can't say I feel your pain here. However. For the sake of brainstorming, I can think of a few more options other than adding to the toplevel imports:

1) Do something which I've seen a lot in C++ binding libraries: adding stuff such as statuses as enum-likes inside some 'self'-reachable namespace. One would do: self.unit.status = self.unit.ActiveStatus('msg')

If we are determined enough, we could make it so that the user only has to import CharmBase, the rest (of ops) is reachable from within the charm.

2) A mitigation strategy, which however has some value on its own imho: object-oriented is fun, but it's also frustrating when you have to import objects that do very little and could be replaced by primitives with little to no loss of semantic expressiveness. What if instead of self.unit.status = ActiveStatus(msg) one could do: self.unit.set_status('active', msg='foo')? Seems way more idiomatic, allows us to skip an import, simplifies the code (all the Status classes are boilerplaty). Any downsides? Nope. We can even add type hinting with typing.Literal['active', 'blocked', '...'] to have runtime code checking (and mypy) whine about bugs...

3) Auto-imports. I.e. add from ops.model import *;from ops.framework import * at runtime. Could be even implemented as an interpreter instruction (#!ops), or simply as 'inject all these modules in the charm.py global namespace before executing it'. Risk of variable collision, sure, but nothing that can't be solved with some good ol' AST parse and check.

benhoyt commented 1 year ago

I agree it'd probably be better if ops was designed so -- from the perspective of the user, anyway -- more things lived at the top level, and you could just import ops to get ops.Foo. However, I agree with @PietroPasotti that IDEs (PyCharm, VSCode) help with this a ton.

It's definitely too much churn to restructure this outright. However, it might be worth doing a "simplification" for ops 2.1 as part of the splitting up of big files that we're planning to do anyway, along the lines of putting this in ops/__init__.py:

from .charm import CharmBase, RelationChangedEvent, SecretEvent, ...  # and lots more
from .framework import EventBase
from .main import main
from .model import *  # or if not *, a lot of things

Then for most charms you could just include import ops alone, and use ops.Foo directly.

What do you think, @jameinel?

jameinel commented 1 year ago

On the topic of status setting, there was definitely an appetite for reworking it. Setting an attribute to a specific set of classes was definitely seen as awkward. Especially around Blocked status. (there may be one or more reasons why you could be blocked, and having one of them resolve shouldn't cause the entire blocked status to be resolved, so something along the lines of a named blocker, and then being able to clear out those named blockers individually could easily be a better system.) Having it be a function rather than an attribute is good, you then run into 'but I'd rather use an Enum rather than strings' and then you're back into having to figure out how to import the right enum at the right time. (Enums help with typos and spell checking, but hurt around coupling and imports.)

For the rest of the structure. I do think we currently expose a little bit too much implementation detail (from ops.charm import CharmBase). I think it would be a good opportunity to do a set of 'these are the things you should be touching from ops'. When I look at your list above, I see things like "RelationChangedEvent" which I actually think you shouldn't be importing into your code (unless you need them for typing, or possibly custom events derived from them).

I do think having a richer 'ops' package (eg class MyCharm(ops.CharmBase):), would probably serve us well, and could be a great way to help target what are the objects you really should be creating in your charms. We already have some things which are "Don't instantiate this directly" which feels like the sort of thing that should then not get imported into the top level 'ops' package, leaving us with a relatively clean set of things that you are expected to touch directly and often as a charm author.

So broadly I'm keen on getting a nice level of top level imports, but I think we should approach it with "don't just curry everything, but give a curated ops.* set of things". And where we find it awkward to do a good job, take a real look at it and see if there is a better way to express what we want.

On Wed, Jan 11, 2023 at 9:48 PM Ben Hoyt @.***> wrote:

I agree it'd probably be better if ops was designed so -- from the perspective of the user, anyway -- more things lived at the top level, and you could just import ops to get ops.Foo. However, I agree with @PietroPasotti https://github.com/PietroPasotti that IDEs (PyCharm, VSCode) help with this a ton.

It's definitely too much churn to restructure this outright. However, it might be worth doing a "simplification" for ops 2.1 as part of the splitting up of big files that we're planning to do anyway, along the lines of putting this in ops/init.py:

from .charm import CharmBase, RelationChangedEvent, SecretEvent, ... # and lots morefrom .framework import EventBasefrom .model import # or if not , a lot of things

Then for most charms you could just include import ops alone, and use ops.Foo directly.

What do you think, @jameinel https://github.com/jameinel?

— Reply to this email directly, view it on GitHub https://github.com/canonical/operator/issues/731#issuecomment-1379743134, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABRQ7IRYWB2XCHWJTM3TMTWR5WGBANCNFSM5RR3V2FA . You are receiving this because you were mentioned.Message ID: @.***>

benhoyt commented 1 year ago

Good thoughts @jameinel. I agree with your suggested approach, though I think there might be debate about the details of what goes at the top level and what doesn't when it comes down to it. For example, the FooEvent classes -- we're encouraging the use of type annotations, so that's why those are needed. Anyway, I'll keep this in mind for the restructuring we're doing anyway as part of 2.1 (https://github.com/canonical/operator/issues/884).

jameinel commented 1 year ago

Indeed, I had mentioned typing as something that could be an exception. (though should we have a separate 'ops.typing.X' like the stdlib has you "from typing import List". Anyway, I would try to avoid importing everything, because I think guiding users towards the useful subset is a very helpful thing, but I do agree that having them have to understand the internal structure of ops is not ideal.

John =:->

On Thu, Jan 12, 2023 at 8:38 PM Ben Hoyt @.***> wrote:

Good thoughts @jameinel https://github.com/jameinel. I agree with your suggested approach, though I think there might be debate about the details of what goes at the top level and what doesn't when it comes down to it. For example, the FooEvent classes -- we're encouraging the use of type annotations, so that's why those are needed. Anyway, I'll keep this in mind for the restructuring we're doing anyway as part of 2.1 (#884 https://github.com/canonical/operator/issues/884).

— Reply to this email directly, view it on GitHub https://github.com/canonical/operator/issues/731#issuecomment-1381200563, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABRQ7JEA6ESCBE26BFRUTTWSCW2NANCNFSM5RR3V2FA . You are receiving this because you were mentioned.Message ID: @.***>

benhoyt commented 1 year ago

I've taken a look at this, and I have a slightly radical (is that an oxymoron?) proposal: I'm more and more thinking we should import basically all the names -- or almost all the names -- from each ops.foo submodule into the top level (probably via an __all__ list in each submodule and then from .charm import * and so on in ops/__init__.py).

There are a lot of names in some submodules (for example all the event classes), but the names are descriptive and unique (and we could easily have a test guaranteeing the __all__ lists are kept unique), and in some ways the submodule that they're in is an implementation detail: users shouldn't have to know that an ops feature was implemented in ops.myfile.

The exception would be pebble.py, which has a lot of names that most users won't need (and they're not nearly as unique -- it was designed more as a namespaced package). The Pebble functionality should almost always be accessed via model.Container.

Doing this would allow charmers to only have one import at the top of their charm.py or charm lib, either import ops and then ops.Foo, or from ops import Foo and then Foo (depending on taste). Rather than the multiplicity of imports they need now, and figuring out which submodule a name is from.

Writing code would be easier: people's IDEs would easily be able to autocomplete (for example) ops.PebbleReadyEvent (from charm.py) or ops.Secret (from model.py).

I don't think the approach of having a separate ops.typing is great. These are real classes, not just "typing" things. (In fact, the Python type annotations have gone more towards using built-in names -- "real classes" -- where possible, like list[T] instead of typing.List[T] and Foo | Bar instead of typing.Union[Foo, Bar].)

Thoughts @PietroPasotti, @jameinel, @sed-i?

PietroPasotti commented 1 year ago

Totally on board with the radical option @benhoyt, my proposal however is to have an ops.core package collecting everything that's in ops right now, minus the testing stuff.

This would allow us to make ops pure namespace and grow with time a package of native-feeling extensions one could import from ops

We'd need a grace period with dummy ops.charm files that import all from ops.core.charm and spam away deprecationwarning, but I think this could all be done in a backwards-compatible way.

Recap of my proposal:

/ops # no names in here! pure namespace.
- /ops/core # all 'builtin' functionality in ops (minus pebble and testing) in here.
- /ops/core/charm.py, main.py, etc... # all modules currently in /ops now live in /ops/core.
- /ops/pebble # all k8s-specific names (this opens the doors to moving pebble stuff altogether in a separate package by the way, so you don't install it unless you really need it: pip install ops[k8s]. That'd be cool!)
- /ops/develop # development/testing utilities (again, we could move this out to a separate package. pip install ops[dev]?)
- /ops/develop/testing.py # Harness.
- /ops/charm.py # temporary dummy file redirecting to /ops/core/charm.py Deprecated!
- ... and so on for main.py, and all files currently in ops.

benhoyt commented 1 year ago

I'm hoping to consider the idea of extra / namespace package soon, but I don't really like the idea of making ops empty and putting everything in ops[subpackage]. I want it to be as simple as pip install ops and import ops for most people, and it seems like your suggested approach would just make it more difficult for charmers.

I'm not actually too worried about a few dozen KB (for pebble.py or whatever) in charms that don't need it. Source code is not huge, and it seems a small price to pay for keeping things flat and simple to use.

sed-i commented 1 year ago

Looks great. Should the framework file be also excepted in addition to pebble?

Le mar. 21 févr. 2023 17 h 48, Ben Hoyt @.***> a écrit :

I'm hoping to consider the idea of extra / namespace package soon, but I don't really like the idea of making ops empty and putting everything in ops[subpackage]. I want it to be as simple as pip install ops and import ops for most people, and it seems like your suggested approach would just make it more difficult for charmers.

I'm not actually too worried about a few dozen KB (for pebble.py or whatever) in charms that don't need it. Source code is not huge, and it seems a small price to pay for keeping things flat and simple to use.

— Reply to this email directly, view it on GitHub https://github.com/canonical/operator/issues/731#issuecomment-1439192820, or unsubscribe https://github.com/notifications/unsubscribe-auth/ATUW6AH7S6ZBCW6LKKP4P7TWYVA5DANCNFSM5RR3V2FA . You are receiving this because you were mentioned.Message ID: @.***>