Crunch-io / scrunch

Pythonic scripting library for cleaning data in Crunch
GNU Lesser General Public License v3.0

Changes needed to support categorical dates #407

Open jamesrkg opened 3 years ago

jamesrkg commented 3 years ago

Related to https://github.com/Crunch-io/scrunch/issues/387.

Two changes needed to support categorical dates (as far as I can tell):

  1. Variable._MUTABLE_ATTRIBUTES:
class Variable(ReadOnly, DatasetSubvariablesMixin):
    """
    A pycrunch.shoji.Entity wrapper that provides variable-specific methods.
    DatasetSubvariablesMixin provides for subvariable interactions.
    """
    _MUTABLE_ATTRIBUTES = {'name', 'description', 'uniform_basis',
                           'view', 'notes', 'format', 'derived', 'date'}
    _IMMUTABLE_ATTRIBUTES = {'id', 'alias', 'type', 'discarded'}
  2. Variable.add_category:
    def add_category(self, id, name, numeric_value, missing=False, before_id=False, date=None):
        if self.resource.body['type'] not in CATEGORICAL_TYPES:
            raise TypeError(
                "Variables of type %s do not have categories"
                % self.resource.body['type'])

        if self.resource.body.get('derivation'):
            raise TypeError("Cannot add categories on derived variables. Re-derive with the appropriate expression")

        categories = self.resource.body['categories']

        new_category = {
            'id': id,
            'missing': missing,
            'name': name,
            'numeric_value': numeric_value
        }
        if date is not None:
            new_category['date'] = date

        if before_id:
            # only accept int type
            assert isinstance(before_id, int)

            # check that a category with this id exists
            try:
                self.categories[before_id]
            except Exception:
                raise AttributeError('before_id not found: {}'.format(before_id))

            new_categories = []
            for category in categories:
                if category['id'] == before_id:
                    new_categories.append(new_category)
                new_categories.append(category)
            categories = new_categories
        else:
            categories.append(new_category)

        resp = self.resource.edit(categories=categories)
        self._reload_variables()
        return resp
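
The category-building and ordering logic above can be sketched on plain dicts, without a Crunch connection. This is only an illustration: `build_category` and `insert_category` are hypothetical helper names, not part of scrunch, and the `'2020-02'`-style date strings are an assumed format.

```python
def build_category(id, name, numeric_value, missing=False, date=None):
    """Build a category dict; 'date' is only included when one is given."""
    category = {
        'id': id,
        'missing': missing,
        'name': name,
        'numeric_value': numeric_value,
    }
    if date is not None:
        category['date'] = date
    return category


def insert_category(categories, new_category, before_id=None):
    """Return a new list with new_category placed before before_id,
    or appended at the end when before_id is None."""
    if before_id is None:
        return list(categories) + [new_category]
    if not any(c['id'] == before_id for c in categories):
        raise ValueError('before_id not found: {}'.format(before_id))
    result = []
    for category in categories:
        if category['id'] == before_id:
            result.append(new_category)
        result.append(category)
    return result


# Hypothetical example data: the date strings are an assumed format.
existing = [
    build_category(1, 'Jan 2020', 1, date='2020-01'),
    build_category(2, 'Mar 2020', 2, date='2020-03'),
]
updated = insert_category(
    existing, build_category(3, 'Feb 2020', 3, date='2020-02'), before_id=2)
```

Using `None` rather than `False` as the "no position given" default keeps a category id of `0` usable as `before_id`; that is a design choice of this sketch, not something the proposed method does.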
jamesrkg commented 3 years ago

One more exception occurs when attempting to view the categories for a categorical date variable with ds[var].categories. In this case the error is KeyError: 'numeric_value'. I'm not sure what the fix for this is.
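
One possible direction, assuming the KeyError comes from category entries of categorical date variables that simply lack a `numeric_value` key, is to read that key defensively. The sketch below works on plain dicts; `summarize_category` is a hypothetical helper, not scrunch's actual code, and the payload shape is an assumption.

```python
def summarize_category(category):
    """Read a category dict without assuming optional keys are present."""
    return {
        'id': category['id'],
        'name': category['name'],
        # .get() returns None instead of raising KeyError when the key is absent
        'numeric_value': category.get('numeric_value'),
        'missing': category.get('missing', False),
        'date': category.get('date'),
    }


# Hypothetical payload for a categorical date category (assumed shape).
date_category = {'id': 1, 'name': 'Jan 2020', 'date': '2020-01'}
summary = summarize_category(date_category)
```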

jjdelc commented 3 years ago

Took care of

Did not take care of Variable.date: that is not an attribute of the variable, so there is nothing to make mutable.

jamesrkg commented 3 years ago

Thanks @jjdelc!