MongoEngine / mongoengine

A Python Object-Document-Mapper for working with MongoDB
http://mongoengine.org
MIT License
4.22k stars 1.23k forks source link

Could really use doc_instance.save() with a precondition #460

Closed michaelbartnett closed 10 years ago

michaelbartnett commented 11 years ago

In my current project I find myself needing to update a complex document. Calculating the update deltas myself would be a pain and I like to rely on MongoEngine's for knowing how/when to pass the shard key along, so the Document instance interface is proving to be helpful for this. But when I save this document instance I need to include my own selection query to ensure that a last_changed timestamp hasn't changed between the read and the update.

What do you think about adding a precondition kwarg to Document.save()? Or even a different save_with_precondition function. Or better yet, is there already a straightforward and easy way to do this sort of save/update?

I'm experimenting with my own conditional_save function that is a direct copy of the save function but with the following changes:

    # Line 180 in mongoengine/document.py
    def conditional_save(self, force_insert=False, validate=True, clean=True,
             write_concern=None,  cascade=None, cascade_kwargs=None,
             _refs=None, precondition=None, **kwargs):
    # ...
    # Line 234
        try:
            collection = self._get_collection()
            if created:
                if force_insert:
                    object_id = collection.insert(doc, **write_concern)
                else:
                    object_id = collection.save(doc, **write_concern)
            else:
                object_id = doc['_id']
                updates, removals = self._delta()
                select_dict = precondition or {}
                # Need to add shard key to query, or you get an error
                select_dict.update(_id=object_id)
    # ...
    # Line 288
                if updates or removals:
                    do_upsert = precondition is None
                    last_error = collection.update(select_dict, update_query,
                                                   upsert=do_upsert, **write_concern)
                    created = is_new_object(last_error)
                    if last_error.get('n', None) != 1:
                        raise self.__class__.DoesNotExist()

Usage would generally look like this:

class MyDocument(Document):
    value = IntField(default=0)
    last_changed = DateTimeField(default=utcnow)

doc_inst = MyDocument.objects.limit(1).get()
doc_inst.value += 1
check = {'last_changed': doc_inst.last_changed}
try:
    doc_inst.conditional_save(precondition=check)
except MyDocument.DoesNotExist:
    print('Failed the precondition')
else:
    print('Successfully saved')

I'm raising the DoesNotExist exception since that is technically what is happening and to avoid breaking the chaining behavior of save(). This doesn't feel quite right, API design-wise, but aside from adding another more specific exception I'm not sure how else to report the precondition failing. There's also the fact that the upsert kwarg now depends on precondition's value.

Does this make sense as a feature and as an approach for implementing it? I'm guessing it would make more sense as an extended Document.update() function, but that doesn't immediately have the delta calculation working without interference from me.

rozza commented 10 years ago

Is this more for versioning? Or a method to stop conflicts / race conditions when saving? Not sure how best to tackle this yet.

michaelbartnett commented 10 years ago

My immediate need is for stopping conflicts and race conditions.

michaelbartnett commented 10 years ago

I don't know if this is helpful or just noise, but here's a more thorough description:

The use case is a backend for a game that stores player data in a single document, including weapons that were purchased, and upgrades applied to them.

ie:

// Player document
{
    "_id": ObjectId,
    "last_update": DateTime,
    "money": 42,
    "gem_inventory": [
        { "dmg_bonus": 2, "lightning_resist": 9, name: "Gregarious Gahnospinel" }
    ],
    "weapons": {
        // Weapons are an object because weapons are unlocked by spending "money" and put
        // into this dict. A player can only have one of each weapon, identified by string name.
        "lightsaber": {
            // Gems don't exist as separate documents, they just get moved between
            // the gem_inventory and the gem sockets on weapons.
            "gems": [
                { "dmg_bonus": 4, "ice_resist": 2, name: "Midichlorian Malachite" },
                { "dmg_bonus": 1, "fire_resist": 3, name: "Pearl of Pythonic Punishment" },
            ],
            "max_gems": 2
        }
    }
}

The hope is to process weapon unlock/gem-socketing requests by manipulating a local document instance and then saving that back to the DB. It has to be able to support buying another weapons, say, bamboo_stick, buying another gem, and adding that new gem and the Gregarious Gahnospinel already in the inventory to the '.weapons.bamboo_stick.gems` array.

In order to make that work, I need to be able to save only on the condition that last_update hasn't changed since the document was read from mongo. If it has changed, the case is usually somebody trying to dupe gems or get around paying for weapons.

This example probably isn't too hard to work out with document.update(), but for the real code, constructing an update query dict would look an awful lot like I was reimplementing _delta() and save().

boostbob commented 10 years ago

Is this similar to pymongo#find_and_modify(mongodb#findAndModify)?

michaelbartnett commented 10 years ago

Similar, but in my use case the condition for modification is a little more complex than can be expressed by a mongo query, so I need to read the doc in and process it.

The big idea is letting MongoEngine handle generating the update dict, which would be pretty useful with a find_and_modify call too, especially if the mongoengine.Document instance could be updated from the result of that call.

boostbob commented 10 years ago

@michaelbartnett Do you mean that you will provide a function(maybe javascript function) to the update method?

michaelbartnett commented 10 years ago

@boostbob No, this is all in python land. The code generally looks like this:

request_args = get_request_args()

# Read doc from database
mydoc = MyDocumentType.objects.get(id=request_args.id)
last_changed_from_previous_read = mydoc.last_changed

# Heavily modify the mongoengine.Document instance
mydoc.this = that()
mydoc.things = [process(t) for t in mydoc.otherthings]
do_more_stuff_to_doc(mydoc)
mydoc.last_changed = datetime.datetime.utcnow()

# Try to save mydoc to database, but if it has been changed in the meantime, that request should fail
try:
    mydoc.save_with_condition(last_changed=last_changed_from_previous_read)
except UpdateConditionFailed:
    send_response("Error: simultaneous doc update")
else:
    send_response("Success")

Going off of my reading of MongoEngine's source, mydoc.save() would create a query dict based on the DBRef of mydoc and then calculate the update dict to pass to collection#update. I want to specify additional conditions in the query dict created in mydoc.save(), specifically, last_changed=last_changed_from_previous_read.

I'm not a MongoDB expert, so I apologize for the rambling descriptions while trying to communicate the issue.

boostbob commented 10 years ago

@michaelbartnett Thank you for your explain, i understand you, there is no transaction, no row lock. I am agree with you.

FrankBattaglia commented 10 years ago

I did a search but somehow missed this issue when I opened #511, but I think we're talking about the same thing.

michaelbartnett commented 10 years ago

Yup, we are talking about the same thing. Thanks for making a patch for this, @FrankSomething. Closing since you covered it in the other issue.