rsinger86 / drf-flex-fields

Dynamically set fields and expand nested resources in Django REST Framework serializers.
MIT License
740 stars 61 forks source link

Potential Denial of Service #126

Closed rdylanwalker closed 1 year ago

rdylanwalker commented 1 year ago

Issue: A "many" expansion does not appear to have a natural limitation to the count of returned items in the list. This can lead to a potential denial-of-service attack against a service.

For example, I may apply pagination on my List Serializer to show a maximum of 1000 entries. However, this list is related to another Model that allows expansion.

I have also confirmed that the viewset pagiation does not apply to expanded fields.

A user could call the expand on that "many" field and get back more than 1000 entries. If I have hundreds of thousands of related objects, it could generate a DOS.

Steps to reproduce:

  1. Setup a DRF + Flex Fields with two models - a parent and child:
    
    class Shelf(models.Model):
    class Meta:
        ordering = ["id"]
    title = models.TextField(default="test")

class Product(models.Model): title = models.CharField(max_length=255) description = models.TextField(max_length=500) price = models.IntegerField() type = models.CharField(max_length=255) visible = models.BooleanField(default=False) discount = models.IntegerField() created_at = models.DateTimeField(auto_now_add=True) updated_at = models.DateTimeField(auto_now=True) shelf = models.ForeignKey(Shelf, on_delete=models.PROTECT, null=True, blank=True)

def __str__(self):
    return self.title
2. Build associated serializers and viewsets:

class ShelfSerializer(FlexFieldsModelSerializer): class Meta: model = models.Shelf fields = ("id", "title", "product_set") expandable_fields = {"product_set": ("inventory.serializers.ProductSerializer", {"many": True})}

class ProductSerializer(serializers.ModelSerializer): class Meta: model = models.Product fields = ["id", "title", "description", "price", "type", "visible", "discount", "created_at", "updated_at", "shelf"] read_only_fields = ['id', ]


3. Create a "Shelf". 
4. Create as many "Product" as you want with "shelf" set to the id of the Shelf created in step 3. (run via script if you want)
5. Make a request to the Shelf viewset with expand=product_set
6. See that all products are returned, regardless of pagination settings, etc.

Proposed Solution:
Implement a sane default for maximum returns on nested fields (maybe 1000 or whatever pagination_class is set to), with ability to override passed in Meta.expandable_fields or settings.py
Caiofcas commented 1 year ago

This is a problem with DRF itself, as Flex fields does not interact with how the queryset is passed to the field for serialization, drf handles all that.

This is clear looking at the code, in ModelSerializer.build_relational_field:

def build_relational_field(self, field_name, relation_info):
    """
    Create fields for forward and reverse relationships.
    """
    field_class = self.serializer_related_field
    field_kwargs = get_relation_kwargs(field_name, relation_info)  # here

    to_field = field_kwargs.pop('to_field', None)
    if to_field and not relation_info.reverse and not relation_info.related_model._meta.get_field(to_field).primary_key:
        field_kwargs['slug_field'] = to_field
        field_class = self.serializer_related_to_field

    # `view_name` is only valid for hyperlinked relationships.
    if not issubclass(field_class, HyperlinkedRelatedField):
        field_kwargs.pop('view_name', None)

    return field_class, field_kwargs

We get the queryset to be passed to the field from the get_relation_kwargs function. Looking into it:

def get_relation_kwargs(field_name, relation_info):
    """
    Creates a default instance of a flat relational field.
    """
    model_field, related_model, to_many, to_field, has_through_model, reverse = relation_info
    kwargs = {
        'queryset': related_model._default_manager,
        'view_name': get_detail_view_name(related_model)
    }

    if to_many:
        kwargs['many'] = True

    if to_field:
        kwargs['to_field'] = to_field

    limit_choices_to = model_field and model_field.get_limit_choices_to()
    if limit_choices_to:
        if not isinstance(limit_choices_to, models.Q):
            limit_choices_to = models.Q(**limit_choices_to)
        kwargs['queryset'] = kwargs['queryset'].filter(limit_choices_to)

    if has_through_model:
        kwargs['read_only'] = True
        kwargs.pop('queryset', None)

    if model_field:
        if model_field.verbose_name and needs_label(model_field, field_name):
            kwargs['label'] = capfirst(model_field.verbose_name)
        help_text = model_field.help_text
        if help_text:
            kwargs['help_text'] = help_text
        if not model_field.editable:
            kwargs['read_only'] = True
            kwargs.pop('queryset', None)
        if kwargs.get('read_only', False):
            # If this field is read-only, then return early.
            # No further keyword arguments are valid.
            return kwargs

        if model_field.has_default() or model_field.blank or model_field.null:
            kwargs['required'] = False
        if model_field.null:
            kwargs['allow_null'] = True
        if model_field.validators:
            kwargs['validators'] = model_field.validators
        if getattr(model_field, 'unique', False):
            validator = UniqueValidator(queryset=model_field.model._default_manager)
            kwargs['validators'] = kwargs.get('validators', []) + [validator]
        if to_many and not model_field.blank:
            kwargs['allow_empty'] = False

    return kwargs

The relevant section is this:

    kwargs = {
        'queryset': related_model._default_manager,
        'view_name': get_detail_view_name(related_model)
    }

    if to_many:
        kwargs['many'] = True

This is then passed to a subclass of RelatedField, which has the following get_queryset method:

def get_queryset(self):
    queryset = self.queryset
    if isinstance(queryset, (QuerySet, Manager))
        queryset = queryset.all()
    return queryset

You can see that at no point you can limit the ammount of related objects returned in drf, so this should be raised with them. There are some SO questions relating to this and the solution seems to be to use a SerializerMethodField.

https://stackoverflow.com/questions/52808869/how-to-limit-the-number-of-objects-given-from-nested-serialization https://stackoverflow.com/questions/63321163/specifying-number-of-objects-to-serialize-drf

rdylanwalker commented 1 year ago

Thank you for the detailed response and examples. Closing and will pursue other avenues.