Calling prepare_recipe before make_recipe on a Recipe with a seq gives both records the same seq value

For a Recipe with a seq, if a test calls prepare_recipe then later calls make_recipe, then the seq will give the same value in both calls, rather than incrementing values. So, if that prepared instance is saved later during the test, it would produce an unexpected database integrity error if it were a unique field, for example.

This issue is due to this line of code (https://github.com/model-bakers/model_bakery/blob/b90253b3f68b3739c2b90d77eab74cb35b092e45/model_bakery/recipe.py#L54) in which Recipe._mapping() uses m.objects.count() == 0 as a test to see whether to reset the seq. I believe the assumption made there is that the absence of records means seq is used for the first time in a new test and therefore should reset. However, an absence of records can occur in other ways in the middle of a test where seq should continue incrementing, as evidenced by this example.

This issue follows from the discussion in #132 in which @berinhard clarified the purpose for the count() query in relation to a more minor issue I ran into. Thinking about that led me to consider this situation related to the same code which I think is less minor.

I suspect a solution would require implementing an entirely different means to determine when seq should reset, and from the explanation in #132 I understand that several other means for that were already considered and ruled out. If this issue is deemed sufficiently problematic enough to fix via implementing an alternative reset determination, I'm happy to discuss and possibly follow with a PR, although I don't currently have a ready solution in mind.

Expected behavior

The seq method should provide incremental values rather than the same value when using both prepare_recipe and make_recipe in a test.

Actual behavior

The seq method provides the same value.

Reproduction Steps

# models.py

class MyModel(models.Model):
    # unique constraint not strictly needed except to demonstrate where this behavior can lead to errors
    my_field = models.IntegerField(unique=True)

# baker_recipes.py

my_model = Recipe(MyModel, my_field=seq(1))

# tests.py

class MyTest(TestCase):
    def test_my_model(self):
        model1 = baker.prepare_recipe('myapp.my_model')
        model2 = baker.make_recipe('myapp.my_model')
        # This would fail
        self.assertNotEqual(model1.my_field, model2.my_field)
        # ...or this would produce a database integrity error (due to the unique constraint)
        model1.save()

Versions

Python: 3.9.5 Django: 3.2.4 Model Bakery: 1.3.2

Having thought over this some more, I keep coming to the conclusion that a custom test runner is the best solution, although I know that option had been considered already and this database query was decided on instead. I would guess the reasoning against a test runner was around the desire to keep a minimal user configuration, and a test runner does require one line to be added to a settings.py file. So, I'll present some arguments for switching to the test runner and see how it goes.

What I propose is the project include a lightweight setup method which handles triggering the seq reset on each test completion (or initialization), and a teardown method may be needed if the trigger is implemented in a way requiring removal, plus a DiscoverRunner subclass which just calls those methods, either directly resetting seq from TextTestResult.startTest/stopTest or via creating a trigger (such as a post_migrate listener - considering Django emits that between tests) in DiscoverRunner.setup_test_environment/teardown_test_environment.

This project's setup instructions would then say if you use seq in a Recipe and need values reset for each test, then add Baker's DiscoverRunner to your settings.py, or if you already use a custom runner then run Baker's setup (and possibly teardown) method in it.

That extra config step should be minimal, and specifically only applies to a subset of users, since users may not need seq to reset, or may not use the feature, whereas those that do have a simple path to follow.

The major advantages of the test runner are:

Significantly improved robustness. The test runner will be able to explicitly and exactly do what's required to make this work, whereas the count query relies on equating a coincidental aspect that is only usually correlated with the correct trigger condition. I'd also emphasize that the issue described here is just one example of a set of conditions where the count query is an erroneous trigger. For instance, another issue would be any test that calls baker.make() before make_recipe() on the same model, where in that case the seq does not reset at all for that test.
Better adherence to testing principles. One of the larger concerns of a well-managed test suite is speed, particularly as projects get larger and total tests grow proportionally. And perhaps the largest contributing factor to test suite slowdowns for web applications is the accumulation of database hits. Any test touching the hard drive is automatically ~10-25x slower for SSD and ~100,000x slower for (God forbid) HDD. Carl Meyer, Gary Bernhardt, and many other prominent professionals have talked and written extensively about the importance of avoiding unnecessary queries in tests. A testing-related library may want to consider touching the database to be an inferior option when avoiding the database is possible.

I'm basing my discussion here on what I believe to be the historical reason the query exists, so I hope I've understood that correctly or I might be completely whistling in the dark here. Hopefully this provides some applicable tinder to spark discussion on the issue. Thanks for the time and attention on this.

First of all, thank you for putting so much thought and detail into this issue.

I am a recently new maintainer to this project so I do not have much in the way of historical knowledge, but here are some of my thoughts on your proposal:

I think that there could always be a "fallback" method that relies on a (admittedly non-ideal) database query so that the user is not forced to pre-configure a different test runner (in case they do not care or do not want to)
We could keep the code that determines when to "reset" the sequence as a top-level utility function that can be used in a number of different ways:
- The custom TestRunner, as you have described
- A custom django.test.TestCase subclass that uses the same methodology in setUp/tearDown (or otherwise)
- A mixin class with the same methodology in setUp/tearDown (for composition with user's potential existing TestCase subclass(es))

I would be in favor of at least seeing what one (or more) of the approaches listed above looks like and then we can decide how to document it so that it leaves it up to the user whether they want to do extra configuration to get the accuracy/performance gains. If you have some time to put together a PR, that would be great, otherwise, I can play around with something in the next few weeks.

Ultimately, I defer to the judgment of @amureki, @anapaulagomes, and @berinhard and I would like to hear what they think about this issue.

Hi Tim,

Thanks for the detailed response, and thanks for being an open source maintainer.

Yes, I can write a PR for this. I'll wait though until hearing from the others you mentioned, in order to have a confirmed and agreed on plan to try before doing the work on it.

For the idea of providing a TestCase subclass and mixin as an alternative setup (likely for developers who only have a small number of affected tests), that concept was the motivation behind where I mentioned providing a lightweight setup method. Developers could work with such a method in any way that fits their needs, either calling it in their own TestRunner, calling it in the setUp/tearDown of specific TestCases they have, creating their own TestCase subclass or mixin with it, etc, etc. That seems to me the most flexible approach. While this project could provide all those specific things, it can lead to a path of having to maintain a lot of extra code for all the possibilities -- for example, you'd need a subclass for SimpleTestCase, TransactionTestCase, TestCase, and LiveServerTestCase (not just TestCase). I can create a PR with these extras if that's nonetheless preferred.

I'd like to encourage the idea of figuring out a solution to fully replace the query instead of merely providing an alternative to it and keeping the query around as a fallback. Allow me to try and be a little more persuasive on this point:

Having the query exist guarantees buggy outcomes can exist. Bugs introduced by the query are especially pernicious since they're subtle, can result from a range of code situations that seem innocuous, and would be particularly hard to figure out. I would argue against leaving in code known to cause bugs, even if it allows for no config.
It's possible in certain situations for the query's effects to produce results which I think can reasonably be described as serious. For example, I described in my last comment one situation where the query would not reset seq at all in a new test. That means tests can exist which always produce false positives without the developer being aware. Imagine a test on a function in an accounting system that deletes invoices, where the test setup contained a condition causing the seq not to reset when creating invoice numbers, and then the test validated that an invoice number with the seq start value no longer exists after the function under test is called. Since the invoice values didn't get created from the starting seq value, the test would always pass and some change to the method could break an accounting system without warning. Allowing developers the option to fallback on code with such potential behavior seems to me unsafe and insecure.

That being said, I'll create a PR in the manner desired, if it's decided to maintain the query as fallback. I imagine that would be implemented as a flag set by the lightweight setup method which would then be checked first before calling the query.

Hey everyone, thanks a lot for such a rich discussion! I'm in favor of the strategies raised by @timgates42.

Despite that, I honestly don't see a problem in releasing a major version of model bakery considering the whole discussion. I mean, it'll be a significant design change because we'll be introducing mechanisms to control the test execution. Given that, I think we can deprecate the current query fallback in the next versions and remove it once we have this new support using test runner/test case/decorators.

Since we'll start to have more execution control of the tests. I think there's something else increase a little bit the scope: files cleaning. Currently baker only create files for FileField or ImageFieldif the user calls it with_create_files=True. This was designed that way, initially, to avoid writing unnecessary files in the users' drives. But, if the user set the flag to true, bakery does create the file, **but don't delete them**. This type of control can also be achieved via this newreset` mechanism you're suggesting.

Let me know what you think about this!

model-bakers / model_bakery