mikeywaites / kim

Kim: A JSON Serialization and Marshaling framework
http://kim.readthedocs.org/en/latest/

Memoization #121

Closed mikeywaites closed 7 years ago

mikeywaites commented 8 years ago

Kim acts as a single point of entry for data coming into the system via APIs. This makes it a great candidate for answering questions such as "Did my data change?" and "What did it change from?".

Field API Changes

We would store a private property on the Field instance called _changes, which would hold a reference to the changes processed by that field. With each field storing its own changes, Mapper gets a simple API for retrieving the changes across all of its fields.

Field.__init__
+ self._changes = {}
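
As a rough illustration of the idea, here is a minimal sketch of a field that records its own change. The `Field` class below is a stand-in rather than Kim's real `Field`, and the `old_value`/`new_value` keys and the `get_changes` accessor are assumptions made for the example.

```python
class Field:
    """Stand-in for a Kim field; not the real kim Field class."""

    def __init__(self, name):
        self.name = name
        # Private store for the change processed by this field.
        self._changes = {}

    def set_changes(self, old_value, new_value):
        """Record a change only when the value actually differs."""
        if old_value != new_value:
            # The key names here are hypothetical.
            self._changes = {'old_value': old_value, 'new_value': new_value}

    def get_changes(self):
        return self._changes


field = Field('name')
field.set_changes('mike', 'mikey')
print(field.get_changes())
# → {'old_value': 'mike', 'new_value': 'mikey'}
```

A field whose value did not change keeps an empty `_changes` dict, which makes "did anything change?" a simple truthiness check.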

Storing changes

We only care about changes that occur during marshaling. The most effective place for us to detect a change is in the update_output_to_source pipe, which has access to both the existing value on the output and the incoming data.

@pipe(run_if_none=True)
def update_output_to_source(session):
    """Store ``data`` at field.opts.source for a ``field`` inside
    of ``output``

    :param session: Kim pipeline session instance

    :raises: FieldError
    :returns: None
    """

    source = session.field.opts.source
    # Off unless the field's opts set memoize.
    memoize = getattr(session.field.opts, 'memoize', False)
    try:
        if source == '__self__':
            attr_or_key_update(session.output, session.data)
        else:
            # Read the previous value before it is overwritten.
            old_value = attr_or_key(session.output, source)
            set_attr_or_key(session.output, source, session.data)
            if memoize:
                session.field.set_changes(old_value, session.data)
    except (TypeError, AttributeError):
        raise FieldError('output does not support attribute or '
                         'key based set operations')

Because the check reads memoize from the field's opts, users could also control memoization on a field-by-field basis, for example by passing memoize=False to FieldOpts to disable it for a particular field.
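
The per-field switch could look something like the sketch below. It does not use Kim itself: `FieldOpts` and `update_output` are simplified stand-ins, the output is a plain dict, and defaulting `memoize` to `True` is an assumption made for this example only.

```python
class FieldOpts:
    """Simplified stand-in for Kim's FieldOpts."""

    def __init__(self, source, memoize=True):
        self.source = source
        # Assumed default for this sketch; the proposal does not fix one.
        self.memoize = memoize


def update_output(output, opts, data, changes):
    """Write data to output[opts.source], recording the change if enabled."""
    old_value = output.get(opts.source)
    output[opts.source] = data
    if opts.memoize and old_value != data:
        changes[opts.source] = {'old_value': old_value, 'new_value': data}


output = {'name': 'mike', 'password': 'old-hash'}
changes = {}
update_output(output, FieldOpts('name'), 'mikey', changes)
# Sensitive field: the value is still written, but no change is recorded.
update_output(output, FieldOpts('password', memoize=False), 'new-hash', changes)
print(changes)
# → {'name': {'old_value': 'mike', 'new_value': 'mikey'}}
```

Disabling memoization only affects change tracking; the value is still marshalled to the output as usual.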

Mapper API Changes

Mapper would also store a changes object, which would contain the data collected from each field as the fields are iterated over.

Mapper.__init__
+ self._changes = {}

+ Mapper.get_changes()

For each successfully marshalled field, get_changes_from_field() would be called to pull the value changes and store them in Mapper._changes.

        for field in fields:
            try:
                field.marshal(self.get_mapper_session(data, output))
                self.get_changes_from_field(field)
            except FieldInvalid as e:
                self.errors[field.name] = e.message
            except MappingInvalid as e:
                # handle errors from nested mappers.
                self.errors[field.name] = e.errors

The Mapper would also expose a get_changes method that would return a serialized version of the mapper's changes dict.
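
To make the flow concrete, here is a self-contained sketch of a mapper collecting changes from its fields. Both classes are simplified stand-ins, not Kim's real `Mapper` and `Field`: there is no pipeline or session, the output is a plain dict, and keying the result by field name is an assumption.

```python
class Field:
    """Stand-in field that marshals one key and remembers its change."""

    def __init__(self, name):
        self.name = name
        self._changes = {}

    def marshal(self, data, output):
        old_value = output.get(self.name)
        new_value = data[self.name]
        output[self.name] = new_value
        if old_value != new_value:
            self._changes = {'old_value': old_value, 'new_value': new_value}


class Mapper:
    """Stand-in mapper that gathers changes as it iterates its fields."""

    def __init__(self, fields):
        self.fields = fields
        self._changes = {}

    def get_changes_from_field(self, field):
        # Only fields that actually changed are stored.
        if field._changes:
            self._changes[field.name] = field._changes

    def marshal(self, data, output):
        for field in self.fields:
            field.marshal(data, output)
            self.get_changes_from_field(field)
        return output

    def get_changes(self):
        # Return a copy so callers cannot mutate the mapper's state.
        return dict(self._changes)


mapper = Mapper([Field('name'), Field('email')])
user = {'name': 'mike', 'email': 'mike@example.com'}
mapper.marshal({'name': 'mikey', 'email': 'mike@example.com'}, user)
print(mapper.get_changes())
# → {'name': {'old_value': 'mike', 'new_value': 'mikey'}}
```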

Nested

Nested change tracking should be as simple as calling get_changes() on the nested_mapper here. https://github.com/mikeywaites/kim/blob/release/1.0.0-beta/kim/pipelines/nested.py#L75

session.field.set_field_changes(nested_mapper.get_changes())

The set_field_changes method on Nested will be overridden to support a non-scalar data type.
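
One way the non-scalar override might look is sketched below. `NestedField` is a hypothetical stand-in for Kim's Nested field, and the shape of the nested changes dict (field name mapped to old and new values) is assumed for illustration.

```python
class NestedField:
    """Hypothetical stand-in for a Nested field."""

    def __init__(self, name):
        self.name = name
        self._changes = {}

    def set_field_changes(self, nested_changes):
        # Instead of a single old/new pair, store the nested mapper's
        # whole changes dict, keyed by the nested field names.
        if nested_changes:
            self._changes = dict(nested_changes)


# What nested_mapper.get_changes() might return (assumed shape).
nested_changes = {'city': {'old_value': 'Leeds', 'new_value': 'London'}}

address = NestedField('address')
address.set_field_changes(nested_changes)

# The parent mapper would then store the nested dict under the field name.
parent_changes = {address.name: address._changes}
print(parent_changes)
# → {'address': {'city': {'old_value': 'Leeds', 'new_value': 'London'}}}
```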

Collection

Collection change tracking is also supported in a similar manner to Nested. We will simply call collection.set_field_changes(field.get_changes()) for each field that's marshalled in the collection.

https://github.com/mikeywaites/kim/blob/release/1.0.0-beta/kim/pipelines/collection.py#L45
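
A collection could accumulate one entry per member, as in the sketch below. `CollectionField` is a hypothetical stand-in, and storing the changes as a list that keeps an empty dict for unchanged members (so that positions line up with the collection) is a design assumption, not something the proposal specifies.

```python
class CollectionField:
    """Hypothetical stand-in for a Collection field."""

    def __init__(self, name):
        self.name = name
        # One entry per member, in marshalling order.
        self._changes = []

    def set_field_changes(self, item_changes):
        self._changes.append(item_changes)


tags = CollectionField('tags')
# Pairs of (existing value, incoming value) for each member.
for old, new in [('python', 'python'), ('json', 'yaml')]:
    change = {} if old == new else {'old_value': old, 'new_value': new}
    tags.set_field_changes(change)

print(tags._changes)
# → [{}, {'old_value': 'json', 'new_value': 'yaml'}]
```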