Open lewisjared opened 10 months ago
In reality, the soln could be a mixture of all three
This seems pretty sensible. I assume that there's no good way to avoid inheritance for the arithmetic (I have always found mix-ins very opaque but having a huge definition file also seems painful so maybe it is the best tradeoff here)?
For nice developer experience with primap2, I did some horrible things with pyi stubs: https://github.com/pik-primap/primap2/blob/main/primap-stubs.patch
However, since we are controlling the classes here, we probably don't have to do anything like that. The problem in primap2 is that we are providing new methods on xarray classes directly, using the xarray mechanism to register that. Honestly, the amount of weird stuff you have to do to support the pythonic API way of "method on class" makes me question if it is really worth it. Is
a = run.filter(category=x)
really so much better than
import scmdata as sd
a = sd.filter(run, category=x)
to warrant the added complexity?
I know everyone likes the pipe-style of chaining multiple things, so run.a().b().c()
instead of c(b(a(run)))
, but still, in the primap2 developer docs there is a section on how to add a function. You'd hope that you don't have to write 2000 characters on how to add a function.
But enough philosophical musings, back to the topic at hand: your proposal looks sensible. There is a nice hierarchy of "core" (inheritance) to less "core" (composition, methods) functionality. I'd maybe opt for a huge class with all the arithmetic over inheritance because MixIns are really super opaque, but either way is a solid compromise.
I know everyone likes the pipe-style of chaining multiple things, so run.a().b().c() instead of c(b(a(run)))
I would pay a stupidly high price to get chaining. I don't know why, it is just so much easier to reason for me.
Fair enough. Maybe the reasonable compromise is: pay for chaining for functionality which chains well (e.g. filter
), but maybe don't pay for chaining for functionality which is a dead end (e.g. plotting). So the norm for plotting would be:
data = run.a().b().c()
sd.lineplot(data, plotting_args)
I honestly think that is even better than
run.a().b().c().lineplot(plotting_args)
simply because you can set breakpoints on the data =
line and inspect the data to plot, or make another plot with the same data etc.
Yep smart
On Fri, Sep 8, 2023 at 10:18 AM Mika Pflüger @.***> wrote:
Fair enough. Maybe the reasonable compromise is: pay for chaining for functionality which chains well (e.g. filter), but maybe don't pay for chaining for functionality which is a dead end (e.g. plotting). So the norm for plotting would be:
data = run.a().b().c()sd.lineplot(data, plotting_args)
I honestly think that is even better than
run.a().b().c().lineplot(plotting_args)
simply because you can set breakpoints on the data = line and inspect the data to plot, or make another plot with the same data etc.
— Reply to this email directly, view it on GitHub https://github.com/openscm/scmdata/issues/253#issuecomment-1711269511, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFUH5G2K6BEWPNWCFEG4ZYTXZLIG7ANCNFSM6AAAAAA4PXVKHY . You are receiving this because you commented.Message ID: @.***>
Describe the bug
Type hints for the injected functions on a ScmRun are not properly handled. Pycharm has no idea that
lineplot
is an available function which is a pretty poor developer experience.With large container classes we want to provide a range of behaviour without having to lump it all in the same module which is difficult to understand
Currently we inject functionality for:
We also inherit an "OpsMixin" baseclass which correctly sets some magic methods
There are a few ways of achieving this:
Pandas doesn't use inheritance in the same way, but simply has a huge frame.py file
Use composition to perform the same action. That might work well for plotting as it seems to be the common pattern used across xarray and pandas. e.g.
df.plot.line
ords.plot.scatter
. This might cause a large change to the API without some relatively opaque redirection. See https://github.com/pandas-dev/pandas/blob/d04747c367d00ee03c5d008ce5670892d450e801/pandas/core/series.py#L5782 for pandas accessorsAdd the IO functions to BaseScmRun, removing the need for the injection. The lower level functionality can happen elsewhere in
scmdata.netcdf
, but the method knows what to call.In reality, the soln could be a mixture of all three. plotting = composition arithmetic = inheritance xarray and NC = methods
@mikapfl Any other suggestions or thoughts?
Failing Test
Pycharm autocompletes
lineplot
with the correct type hint information