ORM make fields optional

NicoHood commented 1 year ago

Hi, as you can see I am playing around with ORM.

I have a table that has lots of attributes, some of them are computed fields. I know we have a PR about read-only computed fields, but that is not totally related here.

So I just want to access a subset of the tables fields. Like I want to ignore the computed fields at all. Currently the code fails with a KeyError exception.

It should be possible to also only update the fields that were specified. The docs say:

To update Event Datenbank records, issue a request to the Event Datenbank endpoint. Table names and table ids can be used interchangeably. Using table ids means table name changes do not require modifications to your API request. A PATCH request will only update the fields you specify, leaving the rest as they were. A PUT request will perform a destructive update and clear all unspecified cell values.

NicoHood commented 1 year ago

I dont know if this should be the default, or optinal, but here is a potential patch:

    @classmethod
    def from_record(cls: Type[T], record: dict, skip_unknown: bool = False) -> T:
        """Create instance from record dictionary"""
        name_attr_map = cls._field_name_attribute_map()
        name_field_map = cls._field_name_descriptor_map()
        try:
            # Convert Column Names into model field names
            # Use field's to_internal to cast into model fields
            kwargs = {}
            for key, value in record["fields"].items():
                # Skip fields that our ORM does not know
                if skip_unknown and key not in name_attr_map:
                    continue
                kwargs[name_attr_map[key]] = name_field_map[key].to_internal_value(value)

NicoHood commented 1 year ago

Maybe it could also be a general meta setting for the ORM.

mesozoic commented 1 year ago

There are two approaches we could take here:

Any additional fields we get from Airtable can be ignored because the implementer didn't declare them in the ORM.
ORM should query for just the fields it needs, and any additional fields are a bug (and should raise an exception).

In option 1, we would do something like what @NicoHood proposes above, where we catch the KeyError and silently drop any extra data we receive. This would allow an implementer to pass arbitrary record dicts into Model.from_record and be confident that it will work. I'm not sure if there's a reasonable use case for that, but I think it's the one advantage.

In option 2, we'd update Model.all, Model.first, etc. to explicitly pass fields= with the list of field names that were declared in the model. Airtable might not return all of them (because it excludes fields entirely out if their values are null) but it should never return a field name we aren't expecting. If Model.from_record receives extra field names, it's probably a bug (maybe ours, or maybe the implementer's).

I have a slight lean toward option 2 only because silently dropping invalid data feels like a framework smell. It leaves open the possibility that the implementer has done something wrong and the framework is not going to stop them early enough. I'm very open to hearing other advantages of option 1 that I might have missed.

mesozoic commented 1 year ago

On further inspection it looks like the Get Record endpoint doesn't accept the fields[]= query parameter, which probably rules out option 2 (because I don't think we want "get one record" and "get many records" to behave differently).

gtalarico / pyairtable

ORM make fields optional #190