model-bakers / model_bakery

Object factory for Django
https://model-bakery.readthedocs.io/en/latest/
Other
838 stars 85 forks source link

Calling prepare_recipe before make_recipe on a Recipe with a seq gives both records the same seq value #209

Open abbottc opened 3 years ago

abbottc commented 3 years ago

For a Recipe with a seq, if a test calls prepare_recipe then later calls make_recipe, then the seq will give the same value in both calls, rather than incrementing values. So, if that prepared instance is saved later during the test, it would produce an unexpected database integrity error if it were a unique field, for example.

This issue is due to this line of code (https://github.com/model-bakers/model_bakery/blob/b90253b3f68b3739c2b90d77eab74cb35b092e45/model_bakery/recipe.py#L54) in which Recipe._mapping() uses m.objects.count() == 0 as a test to see whether to reset the seq. I believe the assumption made there is that the absence of records means seq is used for the first time in a new test and therefore should reset. However, an absence of records can occur in other ways in the middle of a test where seq should continue incrementing, as evidenced by this example.

This issue follows from the discussion in #132 in which @berinhard clarified the purpose for the count() query in relation to a more minor issue I ran into. Thinking about that led me to consider this situation related to the same code which I think is less minor.

I suspect a solution would require implementing an entirely different means to determine when seq should reset, and from the explanation in #132 I understand that several other means for that were already considered and ruled out. If this issue is deemed sufficiently problematic enough to fix via implementing an alternative reset determination, I'm happy to discuss and possibly follow with a PR, although I don't currently have a ready solution in mind.

Expected behavior

The seq method should provide incremental values rather than the same value when using both prepare_recipe and make_recipe in a test.

Actual behavior

The seq method provides the same value.

Reproduction Steps

# models.py

class MyModel(models.Model):
    # unique constraint not strictly needed except to demonstrate where this behavior can lead to errors
    my_field = models.IntegerField(unique=True)

# baker_recipes.py

my_model = Recipe(MyModel, my_field=seq(1))

# tests.py

class MyTest(TestCase):
    def test_my_model(self):
        model1 = baker.prepare_recipe('myapp.my_model')
        model2 = baker.make_recipe('myapp.my_model')
        # This would fail
        self.assertNotEqual(model1.my_field, model2.my_field)
        # ...or this would produce a database integrity error (due to the unique constraint)
        model1.save()

Versions

Python: 3.9.5 Django: 3.2.4 Model Bakery: 1.3.2

abbottc commented 3 years ago

Having thought over this some more, I keep coming to the conclusion that a custom test runner is the best solution, although I know that option had been considered already and this database query was decided on instead. I would guess the reasoning against a test runner was around the desire to keep a minimal user configuration, and a test runner does require one line to be added to a settings.py file. So, I'll present some arguments for switching to the test runner and see how it goes.

What I propose is the project include a lightweight setup method which handles triggering the seq reset on each test completion (or initialization), and a teardown method may be needed if the trigger is implemented in a way requiring removal, plus a DiscoverRunner subclass which just calls those methods, either directly resetting seq from TextTestResult.startTest/stopTest or via creating a trigger (such as a post_migrate listener - considering Django emits that between tests) in DiscoverRunner.setup_test_environment/teardown_test_environment.

This project's setup instructions would then say if you use seq in a Recipe and need values reset for each test, then add Baker's DiscoverRunner to your settings.py, or if you already use a custom runner then run Baker's setup (and possibly teardown) method in it.

That extra config step should be minimal, and specifically only applies to a subset of users, since users may not need seq to reset, or may not use the feature, whereas those that do have a simple path to follow.

The major advantages of the test runner are:

I'm basing my discussion here on what I believe to be the historical reason the query exists, so I hope I've understood that correctly or I might be completely whistling in the dark here. Hopefully this provides some applicable tinder to spark discussion on the issue. Thanks for the time and attention on this.

timjklein36 commented 3 years ago

First of all, thank you for putting so much thought and detail into this issue.

I am a recently new maintainer to this project so I do not have much in the way of historical knowledge, but here are some of my thoughts on your proposal:

I would be in favor of at least seeing what one (or more) of the approaches listed above looks like and then we can decide how to document it so that it leaves it up to the user whether they want to do extra configuration to get the accuracy/performance gains. If you have some time to put together a PR, that would be great, otherwise, I can play around with something in the next few weeks.

Ultimately, I defer to the judgment of @amureki, @anapaulagomes, and @berinhard and I would like to hear what they think about this issue.

abbottc commented 3 years ago

Hi Tim,

Thanks for the detailed response, and thanks for being an open source maintainer.

Yes, I can write a PR for this. I'll wait though until hearing from the others you mentioned, in order to have a confirmed and agreed on plan to try before doing the work on it.

For the idea of providing a TestCase subclass and mixin as an alternative setup (likely for developers who only have a small number of affected tests), that concept was the motivation behind where I mentioned providing a lightweight setup method. Developers could work with such a method in any way that fits their needs, either calling it in their own TestRunner, calling it in the setUp/tearDown of specific TestCases they have, creating their own TestCase subclass or mixin with it, etc, etc. That seems to me the most flexible approach. While this project could provide all those specific things, it can lead to a path of having to maintain a lot of extra code for all the possibilities -- for example, you'd need a subclass for SimpleTestCase, TransactionTestCase, TestCase, and LiveServerTestCase (not just TestCase). I can create a PR with these extras if that's nonetheless preferred.

I'd like to encourage the idea of figuring out a solution to fully replace the query instead of merely providing an alternative to it and keeping the query around as a fallback. Allow me to try and be a little more persuasive on this point:

That being said, I'll create a PR in the manner desired, if it's decided to maintain the query as fallback. I imagine that would be implemented as a flag set by the lightweight setup method which would then be checked first before calling the query.

berinhard commented 3 years ago

Hey everyone, thanks a lot for such a rich discussion! I'm in favor of the strategies raised by @timgates42.

Despite that, I honestly don't see a problem in releasing a major version of model bakery considering the whole discussion. I mean, it'll be a significant design change because we'll be introducing mechanisms to control the test execution. Given that, I think we can deprecate the current query fallback in the next versions and remove it once we have this new support using test runner/test case/decorators.

Since we'll start to have more execution control of the tests. I think there's something else increase a little bit the scope: files cleaning. Currently baker only create files for FileField or ImageFieldif the user calls it with_create_files=True. This was designed that way, initially, to avoid writing unnecessary files in the users' drives. But, if the user set the flag to true, bakery does create the file, **but don't delete them**. This type of control can also be achieved via this newreset` mechanism you're suggesting.

Let me know what you think about this!