Add and document functions for discarding one or other item from a pair

j4mie commented 2 years ago

Previously we had (undocumented, internal) functions called prepare_only and project_only. These were ambiguously named and didn't add much value. They have been removed in favour of directly using qs.noop or projectors.noop in a pair.

We now also have a couple of functions for discarding one or other item in the pair. This might seem a bit weird given that they are strictly equivalent to pair[0] or pair[1], but they result in much more readable code because the function name comes before the child spec rather than being an easy-to-miss [0] or [1] tagged on at the end. For example, if you wanted to use select_related instead of prefetch_related for a simple relationship:

spec = [
    "name",
    "dob",
    (
        qs.select_related_fields("favourite_widget__size", "favourite_widget__type"),
        pairs.discard_queryset_function(
            specs.process_item({"favourite_widget": ["size", "type"]})
        ),
    ),
]

Comments very welcome, particularly on:

Naming
Whether this is even necessary at all

pmg103 commented 2 years ago

I think the whole requirement for this comes about because of the lib insisting on using a two-tuple instead of a richer data structure (such as a named tuple). Having long named functions to get the zeroth or oneth element of a pair seems to me to INCREASE the surface area of the API, and mean there's now two ways to do the same thing to understand... For me, it's not worth it and you should instead change the tuple to a named tuple such that you can go

specs.process_item({"favourite_widget": ["size", "type"]}).queryset

and

specs.process_item({"favourite_widget": ["size", "type"]}).projector

to get the two parts

j4mie commented 2 years ago

@pmg103

I think the whole requirement for this comes about because of the lib insisting on using a two-tuple instead of a richer data structure (such as a named tuple).

Maybe, but having functions that operate on primitive data structures is deliberately how I've designed the API here. For me a two-tuple is just not worth turning into a named tuple.

Having long named functions to get the zeroth or oneth element of a pair seems to me to INCREASE the surface area of the API, and mean there's now two ways to do the same thing to understand...

This is fair, and hence the "whether this is even necessary at all" bit of the description. I would be happy to just not bother doing this, but for me having the function name before the thing is more readable than a [0] or [1] at the end.

For me, it's not worth it and you should instead change the tuple to a named tuple

This would be a big change to every bit of the code in the library and would probably break existing client code. If it were a ten-tuple or something then sure, but I personally like the positional nature of the two tuple - it "feels" to me like the second thing depends on the first thing, in a way that it doesn't if we switched to attribute access. It would also actually require two different namedtuples - a QuerysetFunctionAndProducerTuple and a QuerysetFunctionAndProjectorTuple which I think would confuse things even further (this is an implementation detail that is mostly hidden at the moment). I feel like overall the result would be worse.

So I guess it boils down to: assuming we aren't switching to a named tuple, is having these functions better or worse than using [0] and [1] at the end?

pmg103 commented 2 years ago

Wow so sometimes the two tuple is a queryset and a producer, and sometimes its a queryset and a projector? :bigsad:

So you could call pairs.discard_projector() on a (queryset, producer) to get the queryset, or call pairs.discard_producer() on a (queryset, projector) and it'd work just fine....

This makes me think if you really want to keep it as a tuple as a first class concept, then the client code should access it like a tuple, ie with [0] and [1]. Special functions suggest there's something more going on.

j4mie commented 2 years ago

Wow so sometimes the two tuple is a queryset and a producer, and sometimes its a queryset and a projector? :bigsad:

This was of course your idea 😆

This makes me think if you really want to keep it as a tuple as a first class concept, then the client code should access it like a tuple, ie with [0] and [1]. Special functions suggest there's something more going on.

I'm not sure I agree on the tradeoff there. Much of this library already consists of functions operating on tuples or other primitive data structures. I don't think that's anything new or surprising.

Without this change, you have a big spec fragment but right down at the end of it you have a little [1] to say you're discarding the queryset function part. That comes possibly after many lines of spec and so is perhaps easy to miss. With this change, you start the fragment of spec with I AM GOING TO DISCARD THE QUERYSET FUNCTION FROM THIS SPEC, then you have the spec. I find that has big readability benefits.

If I've still failed to persuade you then I'll open a new PR that just removes the internal prepare_only and project_only functions, which I still thing is a valid bit of cleanup.

dabapps / django-readers

Add and document functions for discarding one or other item from a pair #49