Open CMIP-Data-Request-coleads opened 2 months ago
Apologies that it has taken me so long to get back to this.
I've had a look over the `request_classes.py.template` file and this isn't too far off how I was thinking about it, although I would be tempted to use a very simple dataclass-based structure, since that keeps the code easy to read.
I've re-implemented the software pattern I used for CMIP6, using the JSON file in this repository (this was very straightforward with the new structures), in a notebook that pulls out only a few of the attributes for a minimal set of objects along the route between "experiment" and "variable".
The notebook can be found here -- remember that this is illustrative code; for a "real" API there are a number of other steps I would take, based on prior experience, to ensure we can extract priorities, time slices, etc.
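To make the dataclass idea concrete, here is a minimal sketch of what such a structure might look like. It is not the notebook code: the class names, the attributes, and the chain of intermediate objects between "experiment" and "variable" are assumptions chosen for illustration, and the example links are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    name: str


@dataclass
class VariableGroup:
    name: str
    variables: list[Variable] = field(default_factory=list)


@dataclass
class Opportunity:
    name: str
    variable_groups: list[VariableGroup] = field(default_factory=list)


@dataclass
class Experiment:
    name: str
    opportunities: list[Opportunity] = field(default_factory=list)

    def variable_names(self) -> list[str]:
        """Sorted, de-duplicated names of all variables reachable from this experiment."""
        names = {
            variable.name
            for opportunity in self.opportunities
            for group in opportunity.variable_groups
            for variable in group.variables
        }
        return sorted(names)


# Hypothetical content: the links below are invented for illustration only.
tas, pr = Variable("tas"), Variable("pr")
core = VariableGroup("core", [tas, pr])
extra = VariableGroup("extra", [tas])
opportunity = Opportunity("example opportunity", [core, extra])
experiment = Experiment("example experiment", [opportunity])

print(experiment.variable_names())
```

A real API would need more than this (priorities, time slices, etc.), but the traversal from experiment to variable stays easy to follow when each object is a plain dataclass.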
I added a notebook that showcases how to manage the DReq content (following up on my PR). It adapts @matthew-mizielinski's illustrative example to use the PR content and adds a small function to map objects between bases.
The TISG meeting on Sept 12th discussed the benefits of having code organised around a clearly defined set of python classes.
The `request_classes.py` script does this in a sense. E.g. it has classes for `Record` and `Table`. Each individual table of the request can then be realised as an instance of the `Table` class. This provides a lot of flexibility: if an extra table is added in Airtable, it will be handled automatically. This approach was also used in dreqPy. The flexibility comes with a drawback: it makes the class structure quite abstract, and we don't have read-the-docs compatible documentation associated with each instance of the `Table` class.
On the other hand, Matt suggested having a python class per table, e.g. one for "Variable", one for "Opportunity", etc. This might be achieved by converting the `request_classes.py` `Table` class into a base class and creating a class per table, each of which can be edited and documented. I've created a template here: `request_classes.py.template`. The risk of this approach is potential divergence between documentation in the code and documentation in the database, but I think there are more advantages, particularly in enabling transparency about different python methods being added to different tables.
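As a rough sketch of how the two approaches could be combined (this is not the content of `request_classes.py.template`; the subclass names, the `find` and `by_frequency` methods, the `make_table` helper, and the record layout are all assumptions made for illustration): a generic `Table` base class keeps the shared behaviour, documented subclasses add table-specific methods, and a lookup falls back to the base class for any table that has no dedicated class yet.

```python
class Table:
    """Generic behaviour shared by every table of the request."""

    table_name = None  # overridden by each per-table subclass

    def __init__(self, records):
        # Assumed layout: one dict per row of the table.
        self.records = list(records)

    def __len__(self):
        return len(self.records)

    def find(self, **criteria):
        """Return the records whose fields match every keyword given."""
        return [
            record
            for record in self.records
            if all(record.get(key) == value for key, value in criteria.items())
        ]


class VariableTable(Table):
    """The "Variable" table. This docstring can be picked up by documentation tools."""

    table_name = "Variable"

    def by_frequency(self, frequency):
        """Table-specific method: records with the given (hypothetical) frequency field."""
        return self.find(frequency=frequency)


class OpportunityTable(Table):
    """The "Opportunity" table."""

    table_name = "Opportunity"


# Map table names to their dedicated classes.
TABLE_CLASSES = {cls.table_name: cls for cls in (VariableTable, OpportunityTable)}


def make_table(name, records):
    """Use the dedicated class if one exists, otherwise the generic Table."""
    return TABLE_CLASSES.get(name, Table)(records)


variables = make_table(
    "Variable",
    [{"name": "tas", "frequency": "mon"}, {"name": "pr", "frequency": "day"}],
)
unknown = make_table("Some New Table", [{"name": "x"}])
print(type(variables).__name__, type(unknown).__name__)
```

With a fallback like this, a table newly added in the database would still load as a plain `Table`, while the tables that matter most get explicit, documented classes.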