Issues with django.forms.ModelChoiceField and ModelMultipleChoiceField

Hello, I have been investigating the feasibility of publishing a set of Hypothesis strategies and helpers for testing Wagtail projects. Wagtail is a CMS Library for Python, built on Django. I've been pleased with the results I've had using Hypothesis to test some of my other projects, and think there would be some benefit to integrating property-based testing into Wagtail projects (I am eager to find ways to improve the quality of software delivered to clients).

While working on this, I've come across some issues with the strategies that are generated for Django's ModelChoiceField, which Wagtail uses frequently, and ModelMultipleChoiceField.

The following test cases illustrate the issues.

import hypothesis
import pytest

from django import forms
from django.contrib.auth.models import User
from hypothesis import strategies as st
from hypothesis.extra.django import TestCase, from_field, from_form

class UserSelectForm(forms.Form):
    user = forms.ModelChoiceField(User.objects.all())
    users = forms.ModelMultipleChoiceField(User.objects.all())

@pytest.mark.django_db()
class TestModelChoicesField(TestCase):
    @classmethod
    def setUpTestData(cls) -> None:
        User.objects.create_user(
            username="foo-user", email="foo@user.com", password="not-so-secure"
        )
        User.objects.create_user(
            username="bar-user", email="bar@user.com", password="not-so-secure"
        )

    @hypothesis.given(
        # Must use deferred or get error for database access
        user=st.deferred(lambda: from_field(forms.ModelChoiceField(User.objects.all())))
    )
    def test_user(self, user):
        assert isinstance(
            user, forms.models.ModelChoiceIteratorValue
        )  # Expect a User instance
        assert isinstance(user.value, int)  # Inner value is an int (primary key)

    @hypothesis.given(
        # Must use deferred or get error for database access
        users=st.deferred(
            lambda: from_field(forms.ModelMultipleChoiceField(User.objects.all()))
        )
    )
    def test_users(self, users):
        assert isinstance(
            users, forms.models.ModelChoiceIteratorValue
        )  # Expect QuerySet[User]
        assert isinstance(users.value, int)

    # Deferred not required
    @hypothesis.given(form=from_form(UserSelectForm))
    def test_form(self, form):
        assert isinstance(form.data["user"], forms.models.ModelChoiceIteratorValue)
        assert not form.is_valid()
        assert form.errors == {
            "user": [
                "Select a valid choice. That choice is not one of the available choices."
            ],
            "users": ["Enter a list of values."],
        }

1. `from_field` with `ModelChoiceField` requires using a `deferred` strategy

This makes sense, as it looks like from_field generates a sampled_from(field.choices) strategy, which causes database access to evaluate the choices.

$ pytest -k 'ModelChoices and test_user' --reuse-db
======================================================= test session starts =======================================================platform linux -- Python 3.11.9, pytest-8.1.1, pluggy-1.5.0
django: version: 5.0.6, settings: wagtail_hypothesis.test.settings (from ini)
rootdir: /home/jmunn/projects/wagtail-hypothesis
configfile: pyproject.toml
plugins: xdist-3.6.1, hypothesis-6.103.0, cov-5.0.0, django-4.8.0
collected 27 items / 1 error / 27 deselected / 0 selected                                                                         

============================================================= ERRORS ==============================================================_______________________________________ ERROR collecting tests/test_model_choices_field.py ________________________________________tests/test_model_choices_field.py:16: in <module>
    class TestModelChoicesField(TestCase):
tests/test_model_choices_field.py:28: in TestModelChoicesField
    user=from_field(forms.ModelChoiceField(User.objects.all()))
../../.virtualenvs/wagtail-hypothesis-dev/lib/python3.11/site-packages/hypothesis/extra/django/_fields.py:304: in from_field
    for value, name_or_optgroup in field.choices:
../../.virtualenvs/wagtail-hypothesis-dev/lib/python3.11/site-packages/django/forms/models.py:1422: in __iter__
    for obj in queryset:
../../.virtualenvs/wagtail-hypothesis-dev/lib/python3.11/site-packages/django/db/models/query.py:518: in _iterator
    yield from iterable
../../.virtualenvs/wagtail-hypothesis-dev/lib/python3.11/site-packages/django/db/models/query.py:91: in __iter__
    results = compiler.execute_sql(
../../.virtualenvs/wagtail-hypothesis-dev/lib/python3.11/site-packages/django/db/models/sql/compiler.py:1558: in execute_sql
    cursor = self.connection.chunked_cursor()
../../.virtualenvs/wagtail-hypothesis-dev/lib/python3.11/site-packages/django/db/backends/base/base.py:670: in chunked_cursor
    return self.cursor()
../../.virtualenvs/wagtail-hypothesis-dev/lib/python3.11/site-packages/django/utils/asyncio.py:26: in inner
    return func(*args, **kwargs)
../../.virtualenvs/wagtail-hypothesis-dev/lib/python3.11/site-packages/django/db/backends/base/base.py:316: in cursor
    return self._cursor()
../../.virtualenvs/wagtail-hypothesis-dev/lib/python3.11/site-packages/django/db/backends/base/base.py:292: in _cursor
    self.ensure_connection()
E   RuntimeError: Database access not allowed, use the "django_db" mark, or the "db" or "transactional_db" fixtures to enable it.
===================================================== short test summary info =====================================================ERROR tests/test_model_choices_field.py - RuntimeError: Database access not allowed, use the "django_db" mark, or the "db" or "transactional_db" fixtures to enable it.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!================================================= 27 deselected, 1 error in 0.37s =================================================

A similar error is reported when using django's manage.py test test runner instead of pytest. This is a little confusing as the behaviour differs from using from_form in a given decorator, where it seems the field strategies are not eagerly evaluated. From my perspective, it would be ideal if from_field handled delaying the database access to the time that the first example is evaluated, which seems to be in a context where the database connection is available. The consistency between different field types would make this more user friendly.

2. Generated strategies yield incorrect types

The strategies generated for both ModelChoiceField and ModelMultipleChoiceField yield instances of forms.models.ModelChoiceIteratorValue, which are not valid values for a form — see the final test. I would expect either a model instance or a queryset, respectively, to be generated (or a primary key or list of primary keys — I need to double-check the order of validation and upcasting in the form instantiation and validation pathways). My expectation is that from_form should generate values that are valid for the defined form fields — is that assumption correct?

I am happy to submit a patch for these issues.

HypothesisWorks / hypothesis