planetarypy / pvl

Python implementation of PVL (Parameter Value Language)
BSD 3-Clause "New" or "Revised" License
19 stars 19 forks source link

Improved collections handling proposal #62

Closed rbeyer closed 4 years ago

rbeyer commented 4 years ago

pvl 0.x defines two fundamentally new classes: OrderedMultiDict and Units (and there are thin wrappers around OrderedMultiDict that users are more familiar with: PVLModule, PVLGroup, PVLObject, etc.) in _collections.py.

Initial 1.0.0-alpha work did not alter this module much, nor these classes. Since the OrderedMultiDict as PVLModule is the primary return from the pvl loaders, messing with it has the potential for disruption. The changes I am proposing here should not change anything, they simply define better interfaces for the library to use internally, the loaders still return the old pvl OrderedMultiDict.

However, the work to implement 3rd party quantity objects and the discussions in #52 have made me think about the "interface" we provide, and what operations under the hood expect from various objects.

I think making the following changes will improve our codebase and also make things more logical to external users without breaking things:

1. Change _collections.py to collections.py

The truth is that since we are returning an OrderedMultiDict as a PVLModule object through the loaders, and including Units objects within that structure, the classes and interfaces that are in this module really aren't purely "internal" and so the module should not begin with an underbar, which typically signifies that. This should have no impact, as external users shouldn't have been using _collections by name as the objects were made available under the top-level pvl namespace, and we should continue to do that.

2. Rename the Units class to Quantity

My work with the 3rd party libraries made it clear to me that this class that pvl 0.x named Units, isn't just a 'units' class, it is a combined thing that contains both a value and a units parameter. Other libraries name this kind of thing a Quantity class, which is what I think this is. So I think we should define Quantity (identical to how Units is defined), and use it consistently through our library. The Units class would remain with a deprecation warning.

3. Define a new abstract base class: MutableMappingSequence

The pvl OrderedMultiDict implements both a dict-like Mapping interface, and a list-like Sequence interface. In order to potentially diversify away from pvl defining and maintaining its own OrderedMultiDict at some point in the future, we should define an abstract base class that satisfies our needs (which would just formalize the interface to the already-working OrderedMultiDict), and then genericize the internal workings of the library to deal with classes and objects that satisfy this duck-typed interface. Since it is literally just an object that implements those two interfaces, it is trivial to define:

class MutableMappingSequence(
    MutableMapping, MutableSequence
):
    pass

I don't know that I need to go into the benefits of defining an ABC, but this would allow us to be more specific within the library about the kind of object we need to parse/decode and encode PVL text, and would allow alternate objects (as long as they satisfy the interface) to be used.