encode / django-rest-framework

Web APIs for Django. 🎸
https://www.django-rest-framework.org
Other
28.51k stars 6.85k forks source link

ModelSerializer uniqueness check crashes with OverflowError and that crash can not be prevented with field-level validation #7134

Open jamescooke opened 4 years ago

jamescooke commented 4 years ago

Checklist

Steps to reproduce

First: check Django behaviour

Given a simple Model with a single IntegerField, when working with SQLite (I'm running version 3.27 which is officially supported by Django) it is possible to generate an overflow error.

Model:

class OverflowModel(models.Model):
    value = models.IntegerField(unique=True)

Test:

    def test_model(self):
        OverflowModel(value=9223372036854775808).full_clean()

Gives an OverflowException:

tests/test_model_serializer.py:1320: in test_model
    OverflowModel(value=9223372036854775808).full_clean()
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/models/base.py:1217: in full_clean
    self.validate_unique(exclude=exclude)
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/models/base.py:999: in validate_unique
    errors = self._perform_unique_checks(unique_checks)
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/models/base.py:1103: in _perform_unique_checks
    if qs.exists():
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/models/query.py:777: in exists
    return self.query.has_results(using=self.db)
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/models/sql/query.py:537: in has_results
    return compiler.has_results()
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/models/sql/compiler.py:1114: in has_results
    return bool(self.execute_sql(SINGLE))
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/models/sql/compiler.py:1144: in execute_sql
    cursor.execute(sql, params)
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/backends/utils.py:68: in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/backends/utils.py:77: in _execute_with_wrappers
    return executor(sql, params, many, context)
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/backends/utils.py:86: in _execute
    return self.cursor.execute(sql, params)
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/backends/sqlite3/base.py:396: in execute
    return Database.Cursor.execute(self, query, params)
E   OverflowError: Python int too large to convert to SQLite INTEGER

However, using MaxValueValidator allows us to protect SQLite from this overflow. So if we change the model and test, then Django raises a ValidationError complaining that value is too large:

Model:

class OverflowModel(models.Model):
    value = models.IntegerField(unique=True, validators=[
        MaxValueValidator(9223372036854775807),
    ])

Test:

    def test_model(self):
        with self.assertRaises(ValidationError):
            OverflowModel(value=9223372036854775808).full_clean()

In this way we can use Django to protect SQLite from the overflow - all is good at the Django level.

DRF behaviour

Now we make a ModelSerializer for this updated model with the MaxValueValidator in place and test it with the large int:

    def test(self):
        class TestSerializer(serializers.ModelSerializer):
            class Meta:
                model = OverflowModel
                fields = '__all__'

        serializer = TestSerializer(data={'value': 9223372036854775808})

        with self.assertRaises(serializers.ValidationError):
            serializer.is_valid()

Expected behaviour

This test should pass. The DRF model serializer should raise a ValidationError that complains that value is too large, something like:

rest_framework.exceptions.ValidationError: {'value': [ErrorDetail(string='Ensure this value is less than or equal to 9223372036854775807.', code='max_value')]}

Actual behaviour

However the test fails with an OverflowError which arises from the uniqueness check:

tests/test_model_serializer.py:1334: in test
    serializer.is_valid()
rest_framework/serializers.py:219: in is_valid
    self._validated_data = self.run_validation(self.initial_data)
rest_framework/serializers.py:418: in run_validation
    value = self.to_internal_value(data)
rest_framework/serializers.py:475: in to_internal_value
    validated_value = field.run_validation(primitive_value)
rest_framework/fields.py:567: in run_validation
    self.run_validators(value)
rest_framework/fields.py:589: in run_validators
    validator(value, self)
rest_framework/validators.py:73: in __call__
    if qs_exists(queryset):
rest_framework/validators.py:21: in qs_exists
    return queryset.exists()
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/models/query.py:777: in exists
    return self.query.has_results(using=self.db)
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/models/sql/query.py:537: in has_results
    return compiler.has_results()
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/models/sql/compiler.py:1114: in has_results
    return bool(self.execute_sql(SINGLE))
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/models/sql/compiler.py:1144: in execute_sql
    cursor.execute(sql, params)
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/backends/utils.py:68: in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/backends/utils.py:77: in _execute_with_wrappers
    return executor(sql, params, many, context)
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/backends/utils.py:86: in _execute
    return self.cursor.execute(sql, params)
.tox/venvs/py37-django30/lib/python3.7/site-packages/django/db/backends/sqlite3/base.py:396: in execute
    return Database.Cursor.execute(self, query, params)
E   OverflowError: Python int too large to convert to SQLite INTEGER

Thoughts

(Ignore as appropriate)

It looks to me like the check for uniqueness happens before the field level validation that happens as a result of the MaxValueValidator because when I dump the field's validators I get:

[
    <UniqueValidator(queryset=OverflowModel.objects.all())>,
    <django.core.validators.MaxValueValidator object at 0x7f6743eff7f0>
]

This means that round-tripping the database happens before checking the max value. This is unexpected behaviour and I was surprised when I found it - I would expect as much validation as possible to take place before the database hit.

I also experimented with adding a field-level validate_value() function to the serializer, but it appears that run_validators() happens before that is trigged, so it's no help in preventing the overflow.

kevin-brown commented 4 years ago

This has to do with how we append the MaxValueValidator to the end of the list of validators if the max_value argument is passed in.

https://github.com/encode/django-rest-framework/blob/373e521f3685c48eb22e6fbb1d7079f161a4a67b/rest_framework/fields.py#L978-L979

This is especially unintuitive because we strip out the existing MaxValueValidator when we map the model field to the serializer field.

https://github.com/encode/django-rest-framework/blob/373e521f3685c48eb22e6fbb1d7079f161a4a67b/rest_framework/utils/field_mapping.py#L136-L145

I am not aware of the consequences of changing the location of where that validator is injected into the list of validators.

jamescooke commented 4 years ago

BTW I found this bug using https://github.com/kiwicom/schemathesis - definitely worth checking out if you're building any APIs that are described by Swagger / OpenAPI :+1:

rpkilby commented 4 years ago

My gut response.. I'm inclined to change how we handle uniqueness checks, and somewhat follow Django's lead. Looking at full_clean, fields are validated first and date/uniqueness checks later. I think one of the important distinctions here is that field validation is typically Python-only, while date/uniqueness checks execute ORM calls.

I'm thinking we could leave the UniqueValidator as-is so we don't break usage for existing declarative cases, but the ModelSerializer could be changed to instead add a uniqueness validator to the serializer's Meta.validators, similar to how we handle unique together. So, the generated ModelSerializer would go from something like:

class MySerializer(ModelSerializer):
    my_field = CharField(validators=[UniqueValidator(queryset=MyModel.objects.all())])

to

class MySerializer(ModelSerializer):
    class Meta:
        validators = [
            UniqueFieldValidator(queryset=MyModel.objects.all(), field_name='my_field'),
        ]

By moving the uniqueness check to the serializer from the field, this would ensure that the max value validator is checked first. And in general, I guess a general guideline should be field validators shouldn't touch the DB - that should ideally be reserved for serializer validators.

rpkilby commented 4 years ago

This may also just be an issue with the sqlite backend. https://code.djangoproject.com/ticket/27397

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.