Reimplement with django-readers

j4mie commented 3 years ago

This rips out the guts of SSM and replaces it with an implementation based on django-readers.

A few notes:

~This is based on an as-yet-unreleased version of django-readers. PRs for all required features are open but not yet merged or released.~

Many of the tests that have changed are simply adding one extra query, because django-readers always uses prefetch_related, whereas SSM sometimes cleverly uses select_related.

An SSM spec is not the same thing as a django-readers spec. SSM includes a preprocessing step which supports extra syntax such as plugins {"name": SomePlugin()}, Aliased, Filtered etc. Each of these SSM-only concepts is converted to a pair, and the spec is then passed into specs.process as normal. It may be that this is not desirable in the long run, and we should deprecate plugins entirely. TBD.

The feature of including a relationship name as a bare string in a spec, and having that be converted into a list of IDs of related objects, has been removed. This feature is undocumented in SSM as-is, would be very complicated to implement in django-readers and is (I believe) fairly surprising. Instead, the new pk_list pair function in django-readers should be used instead. Replace "foo_set" in your specs with pk_list("foo_set").

This is a bit more subtle: this implementation doesn't actually really use Serializer at all. It uses a thing that quacks like a Serializer (ie it has a .data attribute) but in reality it's just calling the django-readers projector function, which just plucks the field attributes directly off the model instances. This creates a slight a change in behaviour for some field types. A DRF UUIDField, for example, converts a UUID instance to a string itself in its to_representation method, so by the time it hits the renderer it is already just a JSON-renderable value (a string). The same is true for things like date and datetime.

However, DRF's JSON encoder cleverly also has support for encoding lots of higher-level datatypes itself - so if you just return a UUID or a date or a datetime from your view then it will still be rendered correctly.

The upshot of this is that if your tests are doing things like:

self.assertEqual(response.data["results"][0]["id"], str(some_test_object.id))

You'll need to change that to:

self.assertEqual(response.data["results"][0]["id"], some_test_object.id)

This is because response.data in a test is the raw data that comes back from the serializer, before it gets passed into the renderer.

So this produces some test failures when installing this new version of SSM in a real project, but these are super easy to fix.

Slightly more complicated: if you're using some custom model fields that return something that isn't a JSON-serializable value from to_python, you'll need to ~write a very simple custom plugin or pair to convert the value to a JSON-serializable type during projection. eg (qs.include_field("my_custom_field"), lambda instance: {"my_custom_field": str(instance.my_custom_field)}). This could of course be wrapped up in a higher-order function that returns this pair.~ transform the value to a JSON-serializable type during projection, eg specs.field("my_custom_field", transform_value=str)

~There is at least one actual bug still here (I think), despite the tests all passing. When installing this in a real project and running the tests, I get a few TypeError: 'Filtered' object is not iterable, which is a bit odd. Needs investigating.~

j4mie commented 3 years ago

Thoughts on a plan for this going forward:

Merge this PR and release as a 0.0.x version.
Upgrade existing projects to run on this version (PRs already open)
Remove all plugins, Filtered/Aliased etc, and the preprocessing step. SSM becomes simply a way of using a django-readers spec in a DRF view, with no added cleverness.
Release this as version 1.0.0 and use this as the base for projects going forward.

Below are approximate django-readers versions of all the existing functionality. I propose that we don't ship special shortcuts for any of this - we let end users write them out as below in their specs. The only one that's significantly longer is the CountOf replacement, but I think the bit of extra typing more than pays for itself in clarity - it's just idiomatic Django queryset stuff.

{"title": Aliased("name")}

becomes

specs.alias("title", "name")

{'teacher_set': Filtered(Q(name__icontains='cat'), [
    'name'
])}

becomes

{"teacher_set": [
    pairs.filter(name__icontains="cat"),
    "name",
]}

{'classes_count': CountOf('classes')},

becomes

(qs.annotate(classes_count=Count("classes", distinct=True)), projectors.attr("classes_count"))

{'has_breeds': Exists('breeds')}

becomes

(qs.annotate(has_breeds=Exists("breeds")), projectors.attr("has_breeds"))

{'status': Requires(['age'])}

becomes

(qs.include_fields("age"), projectors.attr("status"))

{'some_method': MethodCall("some_method", required_fields=["some_field]")}

becomes

(qs.include_fields("some_field"), projectors.method("some_method"))

pmg103 commented 3 years ago

Some of the becomes in that list are "short DSL-like element X become long verbose python expression Y".

SSM's 'specs' are a DSL for specifying endpoints declaratively. The implementation is there to allow this spec to be parsed to create the query so that it worries about implementation details such as N-queries problems so you don't have to. But the DSL you are putting into the user's hands is just as if not more important.

I'd like SSM's DSL to stay as concise and expressive as it is -- or ideally, get MORE concise and expressive. I don't have any objection to its implementation being rewritten as you have done because there's nothing particularly pleasant about the existing one. But having any constraints in that implementation mean that the DSL loses conciseness or expressivity sounds like the tail wagging the dog and implementation driving the interface which I'm not really happy about.

j4mie commented 3 years ago

@pmg103 I think there's a balance to be struck. The SSM spec DSL is brilliant, but I think that if a DSL becomes too clever, it risks obscuring what's actually going on.

Part of the point of django-readers was to put some layers in below the DSL that all make sense independently, making it easier to drop down a layer and do something most custom. The nice thing (IMO) about the django-readers queryset functions is that they're just Django: there's no special magic going on, and anyone who knows Django's ORM should be able to understand them at a glance without much difficulty. I think this has benefits that must be traded off against conciseness.

That said, (and assuming the "long verbose python expressions" you're referring to are the CountOf and Exists replacements), I think count and exists are probably common enough that I'd be happy to include shortcut functions for them under django_readers.pairs, like this:

def count(field_name, distinct=True):
    attr_name = f"{field_name}_count"
    return (
        qs.annotate(**{attr_name: Count(field_name, distinct=distinct)}),
        projectors.attr(attr_name),
    )

def exists(field_name):
    attr_name = f"{field_name}_exists"
    return (qs.annotate(**{attr_name: Exists(field_name)}), projectors.attr(attr_name))

pmg103 commented 3 years ago

I think the thing to aim for is a nice composable fractal implementation that allows you to drop down a level simply and easily AND provide convenience functions for common patterns. That seems like the best balance.

j4mie commented 3 years ago

count and exists implemented here: https://github.com/dabapps/django-readers/pull/39

j4mie commented 3 years ago

I've now moved a big chunk of the implementation into django-readers itself. So this PR would really just make SSM a compatibility layer for plugins on top of pairs.

dabapps / django-rest-framework-serialization-spec

Reimplement with django-readers #68