yezyilomo / django-restql

Turn your API made with Django REST Framework(DRF) into a GraphQL like API.
https://yezyilomo.github.io/django-restql
MIT License
620 stars 43 forks source link

Allow excluding fields when querying #69

Closed ashleyredzko closed 4 years ago

ashleyredzko commented 5 years ago

When using restql, we found that the filtering as-is is great if there are not a high number of fields. But sometimes we have a case where we'd like everything except a handful of fields on a larger serializer. These fields might be nested and trying the whitelist approach is difficult or possibly too long for the url.

Thoughts on either adding an additional query param for exclusion (such as exclude=...?) or providing some syntax to specify exclusion (something like {course{-books}})?

Definitely willing to help implement, just curious if there's a solid enough case for it. Exclusion doesn't seem too out there to integrate into the field logic on the mixin.

yezyilomo commented 5 years ago

That's a good idea, I thought about adding it back then but the only way I could think of the implementation was using exclude parameter, I honestly didn't like it coz it was introducing another parameter so I dropped it, but I like the idea of this syntax {course{-books}}, it seems promising because with it

Let's discuss that new syntax and check other available options so that we can compare and come up with something which will be great and simple to use.

ashleyredzko commented 5 years ago

I agree that adding an additional parameter doesn't seem like a great idea. In terms of specifying fields to exclude, I think the syntax I provided would be what I would go with, since - is urlencode safe (unreserved), it's pretty straightforward to check for (check startswith("-")), and - has the association with subtraction, which is what we're essentially doing.

ashleyredzko commented 5 years ago

In fact, this wouldn't change how the parser works except for how it checks fields... Let me see if I can get a proposal up.

yezyilomo commented 5 years ago
# Assuming this is the structure of the model we are querying
data = {
    username,
    birthdate,
    location {
        country,
        region
    },
    contact {
        phone,
        email
    }
}

# Here is how we can structure our query to exclude some fields using exclude operator(-)

query = {username, birthdate, location{country}}   ≡   {username, birthdate, location{country}}

query = {-username}   ≡   {birthdate, location{country, region}, contact{phone, email}}

query = {-username, contact{phone}, location{country}}   ≡    {birthdate ,contact{phone}, location{country}}

query = {-contact, location{country}}   ≡    {username, birthdate, location{country}}

query = {-contact, -location}   ≡    {username, birthdate}

query = {username, location{-country}}   ≡    {username, location{region}}

query = {username, location{-country}, contact{-email}}   ≡    {username, birthdate, location{region}, contact{phone}}

# These may happen accidentally as it's very easy/tempting to make these kind of mistakes with the new syntax, 
# So how should we handle these?

# Should not expand excluded field
query = {username, -location{country}}   ≡    {username}  # This should def be a syntax error

# Can we mix include and exclude on the same block/field level? if so how should it be interpreted?

# Unclear whether to include only birthdate or include all fields except username
query = {-username, birthdate}  # Could be a syntax error
query = {birthdate, -username}  # Could be a syntax error

# Should we throw syntax error if this happen ?.
# If no should {-username, birthdate}  ≡   {birthdate, -username} ?.
# Should fields order matter ?.
yezyilomo commented 5 years ago

In fact, this wouldn't change how the parser works except for how it checks fields... Let me see if I can get a proposal up.

It has to change because of this grammar https://github.com/yezyilomo/django-restql/blob/064de4b7d106c7bec5414cf6211daea3206368f6/django_restql/parser.py#L5 This won't allow to start a field name with -

ashleyredzko commented 5 years ago

Makes sense. Looking through your cases, I agree with them, though there might be some logic fixes on a few and I wanted to talk about your questions:

query = {username, location{-country}} ≡ {username, birthdate, location{region}}

The result would actually be {username, location{region}}.

# Should not expand excluded field query = {username, -location{country}} ≡ {username} # This should def be a syntax error

I'm on the fence about this. I agree that this just... feels wrong, but part of me wants to suggest just ignoring nested - queries instead of raising a syntax error.

# Unclear whether to include only birthdate or include all fields except username query = {-username, birthdate} # Could be a syntax error query = {birthdate, -username} # Could be a syntax error

query = {-username, birthdate} ≡ {birthdate} query = {birthdate, -username} ≡ {birthdate} My suggestion is to still treat whitelisting as whitelisting and ignore blacklisted fields that would have been removed anyway.

One concern I have is specifying the same field for whitelisting and blacklisting, which I think should be a syntax error. E.g. query = {username, -username}.

# Should fields order matter ?.

I don't think so. We can get the list of included/excluded fields and perform set logic on them after the fact.

yezyilomo commented 5 years ago

query = {username, location{-country}} ≡ {username, birthdate, location{region}}

The result would actually be {username, location{region}}.

You are right about this, didn't see it.

# Should not expand excluded field query = {username, -location{country}} ≡ {username} # This should def be a syntax error

I'm on the fence about this. I agree that this just... feels wrong, but part of me wants to suggest just ignoring nested - queries instead of raising a syntax error.

I think ignoring it might encourage users to make the same mistakes again and again.

query = {-username, birthdate} ≡ {birthdate} query = {birthdate, -username} ≡ {birthdate} My suggestion is to still treat whitelisting as whitelisting and ignore blacklisted fields that would have been removed anyway.

I want to agree with you but again I think these kind of mistakes might encourage users to make the same mistakes over and over.

yezyilomo commented 5 years ago

My suggestion is that a field level should either be whitelisting or blacklisting but not both.

ashleyredzko commented 5 years ago

# Should not expand excluded field query = {username, -location{country}} ≡ {username} # This should def be a syntax error

I'm on the fence about this. I agree that this just... feels wrong, but part of me wants to suggest just ignoring nested - queries instead of raising a syntax error.

I think ignoring it might encourage users to make the same mistakes again and again.

I'm less concerned about what mistakes are made and more about what the expectations about the syntax are. If I include that query, do I provide location? What about country, which I explicitly asked for? I agree we should raise a syntax error if encountering an exclude at the top level of a nested value.

query = {-username, birthdate} ≡ {birthdate} query = {birthdate, -username} ≡ {birthdate} My suggestion is to still treat whitelisting as whitelisting and ignore blacklisted fields that would have been removed anyway.

I want to agree with you but again I think these kind of mistakes might encourage users to make the same mistakes over and over.

This is in line with set logic though. Excluding something if it's already excluded would just yield the existing set. I think this is a safe "mistake" to make. Though, I do agree that it makes it easier to parse if we don't allow mixing, and makes it consistent with how serializers do not allow both fields and exclude on definition.

yezyilomo commented 5 years ago

My suggestion is that a field level should either be whitelisting or blacklisting but not both.

I think allowing whitelisting and blacklisting on a field level at the same time is kinda weird and confusing and it's something that should be avoided not by just assuming the user wanted to whitelist.

yezyilomo commented 5 years ago

I agree that adding an additional parameter doesn't seem like a great idea. In terms of specifying fields to exclude, I think the syntax I provided would be what I would go with, since - is urlencode safe (unreserved), it's pretty straightforward to check for (check startswith("-")), and - has the association with subtraction, which is what we're essentially doing.

I was trying to rewrite a parser, I just realized that a parser can handle everything including checking for syntax errors on a query and determine whitelisted and blacklisted fields without even communicating with a serializer, so this means with a parsed query we will be able to know which fields are blacklisted and which fields are whitelisted without even using startswith('-'), in fact we won't even need to write logics for handling exclude operator(-) on a serializer, the only thing we have to write is function which will be removing those blacklisted fields leaving out the whitelisted ones.

yezyilomo commented 4 years ago

I think our syntax doesn't cover this scenario.

# Assuming this is the structure of the model we are querying
user = {
    username,
    birthdate,
    location {
        country,
        region
    }
}

How should we write a query using exclude operator(-) if we want to get all user fields but under location we want to get only country?.

ashleyredzko commented 4 years ago

@yezyilomo Good catch. We could nest the excludes that way. So if it's nested, instead of removing the parent, apply the filter from the child.

query = "{-location}"
# Returns username, birthdate.

query = "{-location{country}}"
# Returns username, birthday, location->country

The way to check I imagine would be a "treat this as whitelist" vs "treat this as blacklist". If we see a -, we know we should not effect other fields based on it being present. Otherwise, we remove other fields we don't see at that level comparatively. Thoughts on this approach?

yezyilomo commented 4 years ago

Since we are treating - as exclude operator {-location{country}} feels more like excluding location field than including all fields, also if we are expanding more than one nested field it would require prepending all of them with - e.g

query={-location{country}, -contact{phone}, -group{name}}

My idea is not very different from yours, since -username means exclude username field then - alone would mean exclude no field(which means include all fields), so with this approach our query could be written as

query={-, location{country}}

and if you are expanding more than one field there will be no need to prepend all of them with -

query={-, location{country}, contact{phone}, group{name}}

Does it make sense?..

yezyilomo commented 4 years ago

You know I was afraid using asterisk(*) on a query because I thought it was a reserved character on URI, URL, and URN but it's not, I found out after reading this https://www.ietf.org/rfc/rfc2396.txt, So this means we could write our query as

query={*, location{country}}

Which I think makes more sense than query={-, location{country}} or query={-location{country}} given how asterisk(*) is used in programming especially in regex. In this case asterisk(*) would means include all fields, what do you think?

yezyilomo commented 4 years ago

I think it's easy for anyone to guess that query={*} is accessing all fields.

ashleyredzko commented 4 years ago

This is more an implicit vs explicit argument/approach, and at this point I think I'd like to suggest a second query param for exclusion to try and remedy this. I can't think of an intuitive way to include query scenarios like that in a single query of mixed include/exclude.

yezyilomo commented 4 years ago

How about this

query={*, location{country}}

Is there any problem using "*"?

yezyilomo commented 4 years ago

This is more an implicit vs explicit argument/approach, and at this point I think I'd like to suggest a second query param for exclusion to try and remedy this. I can't think of an intuitive way to include query scenarios like that in a single query of mixed include/exclude.

I don't know but the idea of using another parameter just to exclude few fields feels not right. I tried implementing it once but I was not impressed by how it was used, I mean we expect a query parameter to do everything for us but if we need another parameter to do the extra work I don't think if we should be calling it query anymore.

ashleyredzko commented 4 years ago

I can see how * could work, but I feel it's about as confusing as adding the - to the parent field name when you want to exclude it's children. I guess it's more of a preference at this point.

yezyilomo commented 4 years ago

Okay let's give it a shot and see how it goes, I've also seen it used on dynamic-rest library so at least its kinda not new. Do you think you can refactor #67 to support this? I want to merge both #70 and #67.