abbottc opened 3 years ago
Having thought this over some more, I keep coming back to the conclusion that a custom test runner is the best solution, although I know that option was considered already and the database query was chosen instead. I would guess the reasoning against a test runner was a desire to keep user configuration minimal, since a test runner does require adding one line to settings.py. So I'll present some arguments for switching to the test runner and see how it goes.
What I propose is that the project include a lightweight setup method that triggers the seq reset on each test's completion (or initialization), a teardown method if the trigger is implemented in a way that requires removal, and a DiscoverRunner subclass that simply calls those methods. The subclass could reset seq directly from TextTestResult.startTest/stopTest, or create a trigger (such as a post_migrate listener, considering Django emits that signal between tests) in DiscoverRunner.setup_test_environment/teardown_test_environment.
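As a rough illustration of the shape I have in mind (all names here are hypothetical, and the post_migrate wiring is just one of the trigger options mentioned above):

```python
# Hypothetical sketch only; reset_recipe_sequences and BakerDiscoverRunner
# are illustrative names, not existing model_bakery API.
from django.db.models.signals import post_migrate
from django.test.runner import DiscoverRunner


def reset_recipe_sequences(**kwargs):
    """Reset every seq() counter back to its start value."""
    # The real implementation would walk model_bakery's registered
    # sequence iterators; elided here.
    ...


def setup_seq_reset():
    # post_migrate fires when Django flushes databases between
    # TransactionTestCase runs, which approximates a per-test reset.
    post_migrate.connect(reset_recipe_sequences)


def teardown_seq_reset():
    post_migrate.disconnect(reset_recipe_sequences)


class BakerDiscoverRunner(DiscoverRunner):
    def setup_test_environment(self, **kwargs):
        super().setup_test_environment(**kwargs)
        setup_seq_reset()

    def teardown_test_environment(self, **kwargs):
        teardown_seq_reset()
        super().teardown_test_environment(**kwargs)
```

A user would then opt in with a single settings line, e.g. `TEST_RUNNER = "model_bakery.runner.BakerDiscoverRunner"` (module path invented for the example).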
This project's setup instructions would then say: if you use seq in a Recipe and need its values reset for each test, add Baker's DiscoverRunner to your settings.py; if you already use a custom runner, call Baker's setup (and possibly teardown) method from it.

That extra configuration step is minimal, and it only applies to a subset of users: anyone who doesn't use seq, or doesn't need it to reset, is unaffected, while those who do have a simple path to follow.
The major advantages of the test runner are:

- Correctness: seq resets at a well-defined point in each test, removing the subtle false positives that the count()-based heuristic can produce.
- Performance: the extra database query currently issued when a recipe with seq is mapped goes away.
I'm basing my discussion here on what I believe to be the historical reason the query exists, so I hope I've understood that correctly or I might be completely whistling in the dark here. Hopefully this provides some applicable tinder to spark discussion on the issue. Thanks for the time and attention on this.
First of all, thank you for putting so much thought and detail into this issue.
I am a fairly new maintainer on this project, so I don't have much in the way of historical knowledge, but here are some of my thoughts on your proposal:
- a `django.test.TestCase` subclass that uses the same methodology in `setUp`/`tearDown` (or otherwise)
- a `setUp`/`tearDown` mixin (for composition with a user's potential existing `TestCase` subclass(es))

I would be in favor of at least seeing what one (or more) of the approaches listed above looks like, and then we can decide how to document it so that it's left up to the user whether they want the extra configuration to get the accuracy/performance gains. If you have some time to put together a PR, that would be great; otherwise, I can play around with something in the next few weeks.
Ultimately, I defer to the judgment of @amureki, @anapaulagomes, and @berinhard and I would like to hear what they think about this issue.
Hi Tim,
Thanks for the detailed response, and thanks for being an open source maintainer.
Yes, I can write a PR for this. I'll wait until I hear from the others you mentioned, though, so there's a confirmed, agreed-on plan before I start the work.
On the idea of providing a TestCase subclass and mixin as an alternative setup (likely for developers who only have a small number of affected tests): that concept was the motivation behind my mention of a lightweight setup method. Developers could use such a method however it fits their needs: calling it from their own TestRunner, calling it in the setUp/tearDown of specific TestCases, building their own TestCase subclass or mixin around it, and so on. That seems to me the most flexible approach. This project could provide all those specific pieces, but that leads down a path of maintaining a lot of extra code for every possibility; for example, you'd need a subclass for each of SimpleTestCase, TransactionTestCase, TestCase, and LiveServerTestCase, not just TestCase. I can create a PR with these extras if that's nonetheless preferred.
I'd like to encourage figuring out a solution that fully replaces the query, rather than merely providing an alternative while keeping the query as a fallback. Allow me to try to be a little more persuasive on this point:
Keeping the query around guarantees that buggy outcomes remain possible. Bugs introduced by the query are especially pernicious: they're subtle, they can arise from a range of seemingly innocuous code situations, and they're particularly hard to track down. I would argue against leaving in code known to cause bugs, even if it allows zero configuration.
In certain situations the query's effects can produce results that I think can reasonably be described as serious. For example, in my last comment I described one situation where the query would not reset seq at all in a new test. That means tests can exist that always produce false positives without the developer being aware. Imagine a test of a function in an accounting system that deletes invoices: the test setup contains a condition that prevents seq from resetting when creating invoice numbers, and the test then validates that an invoice with the starting seq number no longer exists after the function under test runs. Since no invoice was ever created with the starting seq value, the test always passes, and a change to the function could break the accounting system without warning. Giving developers the option to fall back on code with that potential behavior seems unsafe to me.
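To make that concrete, here is a sketch of such a vacuously passing test (the invoices app, Invoice model, recipe name, and delete_old_invoices function are all invented for the example):

```python
# Illustration only; the invoices app, Invoice model, recipe, and
# delete_old_invoices are hypothetical.
from django.test import TestCase
from model_bakery import baker

from invoices.models import Invoice  # hypothetical
from invoices.services import delete_old_invoices  # hypothetical


class DeleteInvoicesTest(TestCase):
    def test_old_invoice_is_deleted(self):
        # The recipe numbers invoices with seq("INV-"), so the developer
        # expects "INV-1". If seq was NOT reset for this test, numbering
        # actually starts at, say, "INV-4".
        baker.make_recipe("invoices.invoice", _quantity=3)

        delete_old_invoices()

        # Vacuously true: "INV-1" was never created, so this passes even
        # if delete_old_invoices() silently stops deleting anything.
        self.assertFalse(Invoice.objects.filter(number="INV-1").exists())
```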
That said, if it's decided to keep the query as a fallback, I'll create the PR in that manner. I imagine it would be implemented as a flag, set by the lightweight setup method, that gets checked before falling back to the query.
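A minimal sketch of that flag idea, assuming an invented module-level flag and paraphrasing (not quoting) the current heuristic in Recipe._mapping():

```python
# Sketch of keeping the count() query as a fallback behind a flag.
# _reset_managed_externally is an invented name.
_reset_managed_externally = False


def setup_seq_reset():
    """Called by users who opt in to the runner/TestCase mechanism."""
    global _reset_managed_externally
    _reset_managed_externally = True
    # ...connect whatever trigger performs the actual resets...


def _should_reset_seq(model):
    # Only fall back to the database heuristic when no external reset
    # mechanism is active; paraphrase of the current check.
    if _reset_managed_externally:
        return False
    return model.objects.count() == 0
```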
Hey everyone, thanks a lot for such a rich discussion! I'm in favor of the strategies raised by @timgates42.
Despite that, I honestly don't see a problem in releasing a major version of model bakery given this whole discussion. It will be a significant design change, since we'll be introducing mechanisms to control test execution. Given that, I think we can deprecate the current query fallback over the next versions and remove it once we have the new support via test runner/test case/decorators.
Since we'll start to have more control over test execution, I think there's something else that could increase the scope a little: file cleanup. Currently baker only creates files for `FileField` or `ImageField` if the user calls it with `_create_files=True`. This was designed that way initially to avoid writing unnecessary files to users' drives. But if the user sets the flag to true, bakery does create the files, **but doesn't delete them**. This type of control could also be achieved via this new reset mechanism you're suggesting.
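As one possible shape for that, here is a sketch (not existing bakery behavior) using the common Django pattern of pointing MEDIA_ROOT at a temporary directory for the test run and deleting it afterwards:

```python
# Sketch only: isolate files created with _create_files=True in a
# throwaway MEDIA_ROOT that the runner removes at the end.
import shutil
import tempfile

from django.test import override_settings
from django.test.runner import DiscoverRunner


class MediaCleanupRunner(DiscoverRunner):
    def setup_test_environment(self, **kwargs):
        super().setup_test_environment(**kwargs)
        self._tmp_media = tempfile.mkdtemp(prefix="baker-test-media-")
        self._media_override = override_settings(MEDIA_ROOT=self._tmp_media)
        self._media_override.enable()

    def teardown_test_environment(self, **kwargs):
        self._media_override.disable()
        shutil.rmtree(self._tmp_media, ignore_errors=True)
        super().teardown_test_environment(**kwargs)
```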
Let me know what you think about this!
For a `Recipe` with a `seq`, if a test calls `prepare_recipe` and then later calls `make_recipe`, the `seq` will give the same value in both calls rather than incrementing. So if that prepared instance is saved later during the test, it would produce an unexpected database integrity error if the field were unique, for example.

This issue is due to this line of code (https://github.com/model-bakers/model_bakery/blob/b90253b3f68b3739c2b90d77eab74cb35b092e45/model_bakery/recipe.py#L54), in which `Recipe._mapping()` uses `m.objects.count() == 0` as a test for whether to reset the seq. I believe the assumption made there is that an absence of records means the seq is being used for the first time in a new test and therefore should reset. However, an absence of records can also occur in the middle of a test where the seq should continue incrementing, as this example shows.

This issue follows from the discussion in #132, in which @berinhard clarified the purpose of the `count()` query in relation to a more minor issue I ran into. Thinking about that led me to this situation in the same code, which I think is less minor.

I suspect a solution would require an entirely different means of determining when `seq` should reset, and from the explanation in #132 I understand that several other means were already considered and ruled out. If this issue is deemed problematic enough to fix by implementing an alternative reset determination, I'm happy to discuss it and possibly follow up with a PR, although I don't currently have a ready solution in mind.

Expected behavior
The seq method should provide incrementing values rather than the same value when using both `prepare_recipe` and `make_recipe` in a test.

Actual behavior
The seq method provides the same value.
Reproduction Steps
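(The original reproduction snippet did not survive here; the following is a reconstruction from the description above, with an invented Customer model and recipe name.)

```python
# baker_recipes.py -- hypothetical app "myapp" with a Customer model
from model_bakery.recipe import Recipe, seq

from myapp.models import Customer  # hypothetical

customer = Recipe(Customer, name=seq("Customer "))


# tests.py
from django.test import TestCase
from model_bakery import baker


class SeqResetTest(TestCase):
    def test_prepare_then_make(self):
        prepared = baker.prepare_recipe("myapp.customer")  # not saved
        made = baker.make_recipe("myapp.customer")
        # Expected: "Customer 1" then "Customer 2".
        # Actual: both are "Customer 1", because make_recipe sees an
        # empty table (the prepared instance was never saved) and the
        # count() == 0 heuristic resets the seq.
        self.assertNotEqual(prepared.name, made.name)  # fails
```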
Versions
Python: 3.9.5
Django: 3.2.4
Model Bakery: 1.3.2