NOAA-GFDL / xwmt

Python package for water mass transformation analysis that leverages xarray functionality
https://xwmt.readthedocs.io/
MIT License
9 stars 5 forks source link

Arrangement of the `class` object #8

Open gmacgilchrist opened 2 years ago

gmacgilchrist commented 2 years ago

The approach in the code is to initiate a class object from the input dataset. Within that object, there is the original dataset (now slightly modified), new variables (e.g. alpha and beta), and a selection of functions, ranging from the density calculation (.get_density()) to the full wmt calculation (.G). I found that the arrangement of this class was a bit confusing, and some improvements might help with usability and debugging.

Some examples:

hdrake commented 1 year ago

What about one overarching class WaterMass object? The core watermass transformations methods could either be methods of an inherited class, e.g. TransformedWaterMass(WaterMass) and SurfaceWaterMass(WaterMass), or just functions that act on the WaterMass?

This approach would be nice because then xwmb.WaterMassBudget(WaterMass) could also inherit the class structure.

What do you think @gmacgilchrist and @jetesdal?

gmacgilchrist commented 1 year ago

I like the idea of one over-arching WaterMass class, with a set of its own functions (e.g. to derive the density, or specify scalar bins), which saves any of these from functions from being repeated.

If the 3D and surface-forced watermass transformations remain separate, they could each be a subclass of the WaterMass class. In a totally different circumstance, this is similar to what is done in climpred package, where there is a PredictionEnsemble class, which holds all of the core functions, then PerfectEnsemble and HindcastEnsemble objects that allow specific functionality for these different types of ensemble.

On the other hand (separate to the subclass or inherited class approach), it would be cool to have the capacity to have a WaterMass.transform(lambda=lambda) method (where lambda is the tracer across which you want to calculate the transformation). If the WaterMass class contains all of the relevant information about the fluxes, tendencies, and scalars, (as well as a specification of the lambda bins) it should be able to return a dataset with the transformation due to each of the tendencies or fluxes that are present in dataset within WaterMass.

In the case of the WaterMassBudget, the WaterMass class dataset would just additionally need to have information to calculate Psi and dMdt. Is that the sort of thing that you had in mind, @hdrake ?

hdrake commented 1 year ago

If the 3D and surface-forced watermass transformations remain separate, they could each be a subclass of the WaterMass class. In a totally different circumstance, this is similar to what is done in climpred package, where there is a PredictionEnsemble class, which holds all of the core functions, then PerfectEnsemble and HindcastEnsemble objects that allow specific functionality for these different types of ensemble.

Yes, that's exactly what I had in mind.

On the other hand (separate to the subclass or inherited class approach), it would be cool to have the capacity to have a WaterMass.transform(lambda=lambda) method (where lambda is the tracer across which you want to calculate the transformation). If the WaterMass class contains all of the relevant information about the fluxes, tendencies, and scalars, (as well as a specification of the lambda bins) it should be able to return a dataset with the transformation due to each of the tendencies or fluxes that are present in dataset within WaterMass.

I do not see how the TransformedWaterMass would be different than this dataset you're talking about, but would provide much more structure. At the most basic level, we could just have TransformedWaterMass.ds be the transformed version of WaterMass.ds. It is useful to structure this as a class instead of just a method that returns a dataset because then we can work with methods and type checks, rather than have to keep track of a bunch of different datasets and what they contain, and a bunch of functions and what kind of dataset they operate on. Basically these child classes would be wrappers for functions like transform that provide structure and metadata, without any downsides (e.g. in terms of performance of flexibility).

In the case of the WaterMassBudget, the WaterMass class dataset would just additionally need to have information to calculate Psi and dMdt. Is that the sort of thing that you had in mind, @hdrake? Correct.

gmacgilchrist commented 1 year ago

Yep, OK, I think I grasp what you're saying, and like the idea.

hdrake commented 1 year ago

Ok, @gmacgilchrist, I've got a basic version of what I imagined working (see https://github.com/NOAA-GFDL/xwmt/pull/38/commits/112ff3bea395b2443096b82d816b7fb67d890dda#diff-928cc93384ef8216501d5c095aa443b9b699543f74dc090032f71fb62c74e52e) but I basically had to blow everything up to do it. It should be straight forward to get back to supporting all of the old functionality, but it is certainly a major breaking change. I think it is worth it though.