yezyilomo / django-restql

Turn your API made with Django REST Framework(DRF) into a GraphQL like API.
https://yezyilomo.github.io/django-restql
MIT License
616 stars 43 forks source link

Minor bug in NestedField if nested serializer allows empty data #165

Closed resurrexi closed 4 years ago

resurrexi commented 4 years ago

I ran into an issue using NestedFields when it wraps a serializer that allows no data body when POSTing.

What I want to be able to achieve is embed the a serializer as a nested serializer in another serializer. With this nested serializer, I want to be able to either:

1) Automatically create the instance for the nested serializer if no data is available for the serializer. 2) Create the instance if data is available for the nested serializer.

The problem I face is that in order to achieve bullet point 1, I can't wrap the nested serializer field in NestedField, and I would have to then override create and update in the parent serializer to achieve bullet point 2.

As an example, consider the following model definitions and their serializer:

# models.py
from django.db import models
from django.utils.crypto import get_random_string
from django.utils.timezone import now

class LotNumber(models.Model):
    lot_number = models.CharField(max_length=6, primary_key=True, blank=True)
    is_active = models.BooleanField(default=True)
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    @staticmethod
    def _generate_lot_number(length=6):
        return get_random_string(length=length)

    def save(self, *args, **kwargs):
        if not self.lot_number:
            self.lot_number = self._generate_lot_number(length=6)
        super().save(*args, **kwargs)

class Inventory(models.Model):
    sku_code = models.CharField(max_length=32)
    lot_number = models.ForeignKey(LotNumber, blank=True, null=True, on_delete=models.PROTECT)
    finish_date = models.DateField(default=now)

    def save(self, *args, **kwargs):
        if not self.lot_number:
            lot_class = self._meta.get_field("lot_number").related_model
            new_lot = lot_class.objects.create()
            self.lot_number = new_lot
        super().save(*args, **kwargs)

# serializers.py
from rest_framework import serializers

from django_restql.serializers import NestedModelSerializer
from django_restql.fields import NestedField

from .models import LotNumber, Inventory

class LotNumberSerializer(serializers.ModelSerializer):
    class Meta:
        model = LotNumber
        fields = "__all__"

class InventorySerializer(serializers.ModelSerializer):
    lot_number = LotNumberSerializer(required=False)

    class Meta:
        model = Inventory
        fields = "__all__"

class NestedInventorySerializer(NestedModelSerializer):
    lot_number = NestedField(LotNumberSerializer, required=False)

    class Meta:
        model = Inventory
        fields = "__all__"

If I were to make a POST request at the InventorySerializer endpoint, I would be able to create an Inventory instance without passing data for the lot_number field because the save method in the LotNumber model automatically creates the data. The LotNumberSerializer would also consider the empty data as valid. The response would be something like:

{
    "id": 1,
    "sku_code": "ABC-PRODUCT",
    "lot_number": {
        "lot_number": "P519CK",
        "is_active": true,
        "created_at": "2020-03-18T12:12:39.943000Z",
        "updated_at": "2020-03-18T12:12:39.943000Z"
    }
}

This only achieves bullet point 1 from earlier. If I wanted to inject my own lot_number value, I have to override the create and update methods in InventorySerializer, which complicates things.

Now, if I were to make a POST request at the NestedInventorySerializer endpoint, I would be able to create the Inventory instance if I did pass data for the lot_number field. However, if I attempt bullet point 1 on this serializer, I get the following response:

{
    "lot_number": {
        "non_field_errors": [
            "No data provided"
        ]
    }
}

I found a temporary fix whereby requiring the lot_number field in the parent serializer and setting a default={} on the field resolved the problem. However I don't think this aligns with the intent of the NestedField wrapper, where arguments should behave the same way as when the wrapper is not used.

Instead, I think the fix for this is in fields.py in the to_internal_value method of the BaseNestedFieldSerializer class. Specifically in line 278, the code should be data = {}. Technically, the code could be moved to line 275 and lines 276-278 can be deleted. This fix should not affect scenarios where a nested serializer does require data and data isn't passed, due to the fact that calling is_valid in this scenario will return False.

yezyilomo commented 4 years ago

If you use required=False on a nested field(foreign key related to be specific) you need to specify allow_null=True because the value of nested field(on a foreign key related field) is always needed so it's either null or the value you give. I didn't put this on the docs because it's explained in the official DRF documentation here https://www.django-rest-framework.org/api-guide/fields/#allow_null

So your serializer should be like


class NestedInventorySerializer(NestedModelSerializer):
    lot_number = NestedField(LotNumberSerializer, required=False, allow_null=True)

    class Meta:
        model = Inventory
        fields = "__all__"
yezyilomo commented 4 years ago

And don't use default={} that's totally a different thing. You might want to read https://www.django-rest-framework.org/api-guide/fields/#default for more details about default kwarg. These three kwargs default, required and allow_null relates so much, but each has its own purpose, so it's very important to know when to use what.

yezyilomo commented 4 years ago

You might also want to check these tests, because they match your case https://github.com/yezyilomo/django-restql/blob/62a98a2a8670a580f534724391376437d87bf319/tests/testapp/serializers.py#L107-L142 From these tests you can clearly see when allow_null is used and when it's not used.

resurrexi commented 4 years ago

Thanks for your advice and sharing the links. That's interesting that allow_null is required when wrapping it in a NestedField but not needed without the wrapper.

I used default={} because I was under the impression that DRF was automatically setting the data attribute as an empty dict if there was no data being passed into the request body. For instance, I had no problems running POST on the LotNumberSerializer endpoint with no input data. I tried to figure out what DRF was doing behind the scenes, so I attempted to play around with different values of data in my serializer:

>>> from serializers import LotNumberSerializer
>>> from rest_framework.fields import empty
>>> serializer = LotNumberSerializer(data=empty)
>>> serializer.is_valid()
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/opt/venv/lib/python3.8/site-packages/rest_framework/serializers.py", line 227, in is_valid
    assert hasattr(self, 'initial_data'), (
AssertionError: Cannot call `.is_valid()` as no `data=` keyword argument was passed when instantiating the serializer instance.
>>> serializer = LotNumberSerializer(data=None)
>>> serializer.is_valid()
False
>>> serializer = LotNumberSerializer(data="")
>>> serializer.is_valid()
False
>>> serializer = LotNumberSerializer(data={})
>>> serializer.is_valid()
True
>>> result = serializer.save()
>>> result
<LotNumber: 9KSDOQ>
yezyilomo commented 4 years ago

These two

lot_number = NestedField(LotNumberSerializer, required=False, allow_null=True)

And

lot_number = NestedField(LotNumberSerializer, required=False, default={})

Do things differently. The first one will set the value of lot_number field to null if you don't provide it when creating NestedInventorySerializer(which means it won't create LotNumber at all), the second one will create a lot number and populate it with with empty values, I think you are actually getting away with this because there are not required fields on LotNumber model, try to put one required field and you will see it throwing an error, you can just flip is_active = models.BooleanField(default=True) to is_active = models.BooleanField() so that is_active becomes required just to test.

yezyilomo commented 4 years ago

You can also check what's returned when you send a post request on NestedInventorySerializer using

lot_number = NestedField(LotNumberSerializer, required=False, allow_null=True)

And

lot_number = NestedField(LotNumberSerializer, required=False, default={})

The result of the first one should be

{
    "id": 1,
    "sku_code": "ABC-PRODUCT",
    "lot_number": null
}

but for your case I think save will override that null value and create lot number so there will be no null but the value of the generated lot number. And the result for the second one will be

{
    "id": 1,
    "sku_code": "ABC-PRODUCT",
    "lot_number": {
    "lot_number": "P519CK",
        "is_active": true,
        "created_at": "2020-03-18T12:12:39.943000Z",
        "updated_at": "2020-03-18T12:12:39.943000Z"
    }
}
yezyilomo commented 4 years ago

Also regarding this

That's interesting that allow_null is required when wrapping it in a NestedField but not needed without the wrapper.

We could make allow_null default to True if required=False is passed but that would mean we are always considering None a valid value which might not be true for some cases, so why not let the user decide for themselves.

resurrexi commented 4 years ago

Yea, I just got away with setting default={} because of the save overrides in both the LotNumber and Inventory models. I guess my head was so wrapped in knowing that a lot number would always be created, that I was simply trying to find a way to get it to work with NestedField.

What you have told me has been helpful. Thanks again!

yezyilomo commented 4 years ago

Thanks for raising this too cuz now I now that this part is very important to include in the documentation, You have also made a very good point here

That's interesting that allow_null is required when wrapping it in a NestedField but not needed without the wrapper.

A lot of people will expect it to just work with required=False only, like if only required=False is passed, it's very easy for anyone to guess that lot number will be set to null if you don't provide it during Inventory creation, assuming you were not creating it on save. So I think it's important to re-think about this, maybe it would be helpful to just make allow_null default to True if required=False is passed so that people won't need to pass allow_null=True.

yezyilomo commented 4 years ago

I would also like to know if this is what really happens when you use allow_null=True, I was not sure 100% about this one as we have no test covering this scenarion(overriding null on save).

but for your case I think save will override that null value and create lot number so there will be no null but the value of the generated lot number.

resurrexi commented 4 years ago

I would also like to know if this is what really happens when you use allow_null=True, I was not sure 100% about this one as we have no test covering this scenarion(overriding null on save).

Yes, it really happens. After I took your advice and used allow_null=True, it still created a lot number for me. As an added test, I explictly created my request body as:

{
    "sku_code": "ABC-PRODUCT",
    "lot_number": null
}

and got a response similar to:

{
    "id": 1,
    "sku_code": "ABC-PRODUCT",
    "lot_number": {
        "lot_number": "P519CK",
        "is_active": true,
        "created_at": "2020-03-18T12:12:39.943000Z",
        "updated_at": "2020-03-18T12:12:39.943000Z"
    }
}

So I think it's important to re-think about this, maybe it would be helpful to just make allow_null default to True if required=False is passed so that people won't need to pass allow_null=True.

My opinion is to not make that assumption, but rather to let users know the caveat of the NestedField, where it actually expects data to be passed to the serializer.

And now when I think about how DRF processed LotNumberSerializer without the NestedField wrapper, it treated the field as a read only field, as stated here, but was still able to create the Inventory instance because Inventory has an explicit save override to create the LotNumber instance if lot_number is None. Therefore, if the Inventory model didn't have the save override, the deserialization would probably have failed without the allow_null=True kwarg. I think I just lucked out because of the save override in my Inventory model.

Also, your NestedField makes the nested serializer as readable and writable.

yezyilomo commented 4 years ago

Yes, it really happens. After I took your advice and used allow_null=True, it still created a lot number for me. As an added test

That's great, thanks for the added test too.

My opinion is to not make that assumption, but rather to let users know the caveat of the NestedField, where it actually expects data to be passed to the serializer.

Yeah we'll definitely need to point this out on the documentation.

And now when I think about how DRF processed LotNumberSerializer without the NestedField wrapper, it treated the field as a read only field, as stated here, but was still able to create the Inventory instance because Inventory has an explicit save override to create the LotNumber instance if lot_number is None

Yeah looks like save was doing the work and not the serializer itself.

Also, your NestedField makes the nested serializer as readable and writable.

That's great.