FactoryBoy / factory_boy

A test fixtures replacement for Python
https://factoryboy.readthedocs.io/
MIT License
3.51k stars 396 forks source link

Allow SQLAlchemy factories to be committed on create #309

Closed zyskowsk closed 8 years ago

zyskowsk commented 8 years ago

It seems like the general consensus is that one should only flush the current session during a test case, and roll back on each tear down. Due to the nature of my testing setup it is more convenient to commit instead of flush on create. Have you considered adding a meta flag similar to force_flush that would allow the session to be committed?

for example:

class SQLAlchemyModelFactory(base.Factory):
    """Factory for SQLAlchemy models. """

    _options_class = SQLAlchemyOptions
    class Meta:
        abstract = True

    @classmethod
    def _create(cls, model_class, *args, **kwargs):
        """Create an instance of the model, and save it to the database."""
        session = cls._meta.sqlalchemy_session
        obj = model_class(*args, **kwargs)
        session.add(obj)
+       if cls._meta.force_commit:
+          session.commit()
        elif cls._meta.force_flush:
            session.flush()
        return obj

overriding SQLAlchemyModelFactory._create is my current solution. Any suggestions on how to properly have Factories be committed on creation is welcome! Thanks!

jeffwidman commented 8 years ago

I think this was solved by https://github.com/rbarrois/factory_boy/pull/262

I'm going to close this, but if for some reason you're looking for something different leave a comment and we can re-open.

zyskowsk commented 8 years ago

Hey @jeffwidman thanks for the quick response! enabling force_flush is not exactly what I am looking for. Let me try and better explain my situation, I may be missing something simple.

The situation I am running into is that I have server-side defaults on all of my models for created_at and updated_at. These defaults are set with db.func.now() in postgres, which is defined by the start time of the last transaction.

This means that if I only ever call session.flush(), even if I issue an INSERT followed by an UPDATE, on the same row, the updated_at value will remain the same until the current transaction is committed.

I am hoping to test that when I update an object through an api, the updated_at value changes. However, when I create a factory with force_flush enabled, and then issue an update on that row late, the updated_at column still reflects the start time of the open transaction.

For this reason I want Factory.create call to issue a commit on the session, not just a flush.

jeffwidman commented 8 years ago

Oh my bad and you even mentioned force_flush in your initial comment. Sorry about that.

I'm not opposed to adding a force_commit, and there may be times it's useful, but I'm not sure it's necessary for this particular situation situation. While the SQL standard says now() and current_timestamp() are the start of the transaction, PostgreSQL also supports the non-SQL-standard times clock_timestamp() and statement_timestamp() which do change during a transaction: https://www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT

So if you call one of those, as long as the statement is flushed to the DB, it should get the actual timestamp as of right now. IIRC, you can call them from SQLAlchemy via db.func.statement_timestamp() and SQLAlchemy should pass the literal function name straight through.

Lastly, this is an aside from the actual issue but I'm curious how you're setting server-side defaults for updated_at? Unlike created_at, SQLAlchemy's implementation of updated_at as a server-side default doesn't actually modify the database, it only tells SQLAlchemy to not set the updated_at on commit because you've setup your DB somehow to handle it. I'm curious what route you went there?

zyskowsk commented 8 years ago

Hey! No worries. Thanks for the tip on clock_timestamp() and statement_timestamp(). I think I am still going to commit on factory create in the spirit of keeping my testing env as close to prod as possible. I try and open up a PR today and we can move the conversation over there.

As for my updated_at setup, i misspoke (or spoke lazily) I am simply using the SQLAlchemy standard onupdate=db.func.now().

jeffwidman commented 8 years ago

Sounds fine, I'm certainly open to the PR.

For the implementation, I'd like to deprecate the force_flush() API and instead switch to a forcing param that accepts False/`flush/commit, as it makes no sense to force a flush and a commit. It should default to False. I'm not sure what a good name is for the param though... thoughts?

At the next major version bump we can drop the force_flush(), which shouldn't impact that many folks as we just added it recently. That'll also be a good time to drop the deprecated fuzzy fixtures that we're instead going to source from faker (#271). I added this to the 3.0 milestone as a reminder to remove deprecated functionality.

rbarrois commented 8 years ago

@jeffwidman indeed, if we want to support multiple post-creation behaviors, a unique parameters seems to be the way to go.

What about sqlalchemy_session_mode = 'donothing' | 'flush' | 'commit'? I'd rather avoid mixed types (False vs 'flush').

zyskowsk commented 8 years ago

Hey y'all!

I agree, moving to a single flag makes a lot of sense. I propose sqlalchemy_session_action = None | 'flush' | 'commit', very similar to what @rbarroios proposed, except that I don't feel terrible about mixing types if the default type is None. It feels strange to have a default string type that is never used.

Also definitely excited that you plan on deprecating fuzzy fixtures and to rely on Faker!

I just opened a pr at #310. I suggest we move this conversation over there.

jeffwidman commented 8 years ago

What about sqlalchemy_session_persistence = None | 'flush' | 'commit'?

Agree that it's generally weird to mix types, but None feels right for the default... my original suggestion of False is a bad idea.

_persistence seems more descriptive than _mode or _action, although I don't have a strong opinion either way.

FYI: Let's keep the general design discussion here in the issue. Comments on the specific code implementation can go in the PR.

Sometimes in the world of open source PRs get abandoned and since they can't be re-assigned to another user, we as maintainers have to close them eventually... it's annoying when another person comes along to fix later to say "please look at this issue, plus the other notes over in this rejected PR". Better thing is keep the general design discussion in the issue, then everything is in one place for someone else to take a stab at later if necessary.

zyskowsk commented 8 years ago

Good point on keeping the conversation isolated in one location, I like that idea a lot. I also agree that _persistance is more descriptive. My vote goes for

sqlalchemy_session_persistence = None | 'flush' | 'commit'.

@rbarrois do you have a strong opinion between _persistance, _mode, _action ?

antoine-lizee commented 8 years ago

I just want to confirm that this will end up changing the behavior of all the MyFactory(...) calls as it is by default a redirection to MyFactory.create(...).

jeffwidman commented 8 years ago

Yes, the default strategy for MyFactory(...) is create(), so whatever you set as the sqlalchemy_session_persistence (or whatever the arg gets named) will affect MyFactory(...) calls. However, the default for sqlalchemy_session_persistence will be to not persist, so effectively the default behavior for MyFactory(...) calls does not change.

zyskowsk commented 8 years ago

Exactly, the default persistence strategy will not be changing, so honestly this doesn't change the behavior of constructor calls. Of course anyone relying on 'MyFactory(...)' to flush the session via 'force_flush' will have to switch to 'sqlalchemy_session_persistence' (or what ever we settle on) but other than that this change shouldn't change any existing behavior

antoine-lizee commented 8 years ago

Just wanted to point it out - I'm totally fine with it. We're actually using build() as the default strategy so the behavior is closer to the real object constructor anyway.

rbarrois commented 8 years ago

@zyskowsk it seems that persistence provides the best description of this!

Regarding the deprecation path, what do you think about the following plan:

Beyond this, I'm unsure as to the impact on .build() / .create(): if the default is kept at None (i.e "do nothing"), then:

Final question: I'm not familiar with SQLAlchemy; would users want to set that persistence=commit flag globally instead of a per-factory basis?

jeffwidman commented 8 years ago

would users want to set that persistence=commit flag globally instead of a per-factory basis?

Yes, definitely. There might be some edge cases where you only want to set it on a specific factory, or override the global setting in a specific factory, but most common case you'll want to set it globally.

Interested in hearing other opinions on this.

zyskowsk commented 8 years ago

Hey y'all! Sorry i got side tracked for a bit there. @rbarrois as @jeffwidman mentioned, yes it is very useful to be able to set persistence=commit globally. Personally, I would concerned if i were working in a testing environnent that set session level persistance behavior on a per factory basis, that seems scary (but I agree with @jeffwidman that there may be some edge cases so we should support it). Currently we have a base factory class that has this behavior which all other factories inherit.

I plan on updating the PRq in the next few days so we can get this thing merged.

zyskowsk commented 8 years ago

Yo, I just pushed an update on #310 that uses force_flush if true and issues a DeprecationWarning. I looked around and it seemed to me that we don't have a well defined pattern for setting global state within an app using factory_boy, so I left out a way to set sqlalchemy_session_persistence globally until we have a conversation about global options setting in general.

Personally I actually don't mind creating a base class for my app and setting global options there, but if we were to start supporting a more standard way of setting global options I would be in favor of having a config file that factory_boy reads from. I am not sure exactly what this would look like though. Are there other option values that would make sense to set globally across multiple factories? I also may be missing something here!

If there are more options that would be helpful to set globally, and there is not a current way of doing this I vote for doing that all at once and making it a new PR.

rbarrois commented 8 years ago

@zyskowsk thanks for the update, it looks pretty well!

I don't mind handling the deprecation path myself, it's quite complex to get warnings to show up in the proper place ;)

As for the global option, it would be interesting to have such a mechanism — we already have a custom way to prime Fuzzy's random seed, and more configuration would be required for the auto_factory branch, so that's definitely something we need on some roadmap.

However, it's not strictly required for this feature, although a "nice to have" :)

So, I'm :+1: on merging this, and including it in the next release — let's plan for "in the next few weeks"?

zyskowsk commented 8 years ago

Awesome! Thanks @rbarrois, i just fixed a few small things on the branch. Are you going to commit the warning changes to the same branch? I would love to follow along

rbarrois commented 8 years ago

@zyskowsk I've added the warnings change; but messed up with github's UI so the changes arrived in https://github.com/zyskowsk/factory_boy/pull/1 ...

I also improved the mocking code, based on https://docs.python.org/3/library/unittest.mock.html#auto-speccing.

Let me know what you think of it :-)

zyskowsk commented 8 years ago

@rbarrios, looks great to me, thanks for the improvements, I merged your changes into my branch

aalvrz commented 3 years ago

@zyskowsk I've been following this thread because I am experiencing a similar issue:

class MyModel(Base):
    created_at = Column(DateTime, nullable=False, default=datetime.utcnow)

class MyFactory(factory.alchemy.SQLAlchemyModelFactory):
    class Meta:
        model = MyModel
        sqlalchemy_session_persistence = "commit"

def my_test():
    a = MyFactory()
    b = MyFactory()
    c = MyFactory()

    assert not a.created_at == b.created_at  # Fails

All those objects will have the same value for created_at. I am using commit for sqlalchemy_session_persistence and the assertion fails. However if I use flush then the datetimes will be different. Not realy sure what strategy I should use? From what I understand in this thread, your intention was to use commit strategy.