MongoEngine / mongoengine

A Python Object-Document-Mapper for working with MongoDB
http://mongoengine.org
MIT License
4.22k stars 1.23k forks source link

Feature request: Automatic optimistic concurrency validation #1563

Open mikeckennedy opened 7 years ago

mikeckennedy commented 7 years ago

Hi guys,

I see the Document.save method takes a save_condition dict. With a very minor change (basically a new subclass of Document and a convention) this could automatically enforce and verify that it is safe to save the document.

I put together an example, but it's not as a PR because I'm not sure the best way to fit it into your object hierarchy. Basically, Document does not allow subclassing so I'm going to attach the sample but you'll need to move it into the package if you like it.

I propose a new class derived from Document, called OptimisticDocument:

The idea is that in addition to

id = mongoengine.ObjectIdField()

It also has a field

_doc_ver = mongoengine.UUIDField(binary=False)

Then anytime you save an OptimisticDocument, it uses that and updates it only if there are pending changes. Here's a possible implementation (not save_raw is the current Document.save)

def save(self, force_insert=False, validate=True, clean=True,
         write_concern=None, cascade=None, cascade_kwargs=None,
         _refs=None, save_condition=None, signal_kwargs=None, **kwargs):

    if not hasattr(self, '_changed_fields') or not self._changed_fields:
        return self._save_raw(force_insert, validate, clean,
                              write_concern, cascade, cascade_kwargs,
                              _refs, save_condition, signal_kwargs, **kwargs)

    print(self._changed_fields)
    op_save_condition = {'_doc_ver': str(self._doc_ver)}
    self._doc_ver = str(uuid.uuid4())
    print("Changed doc ver from {} to {}".format(op_save_condition.get('_doc_ver'), self._doc_ver))

    if save_condition:
        if not isinstance(save_condition, dict):
            raise ValueError("save_condition must be empty or a dictionary")
        save_condition.update(op_save_condition)
        op_save_condition = save_condition

    self._save_raw(force_insert, validate, clean,
                   write_concern, cascade, cascade_kwargs,
                   _refs, op_save_condition, signal_kwargs, **kwargs)

To see it in action, you can run these two attached files. But they are highly hacked due to the fact that I couldn't subclass Document.

  1. Run the code straight through, and it works fine.
  2. Run the code, stop at the input, change the underlying _doc_ver directly in the db, continue the program, see the save is stopped.

optimistic_document_example.zip

Thanks, Michael

wojcikstefan commented 7 years ago

Hi @mikeckennedy, thank you for writing this up. Your idea for dealing with multiple concurrent updates is pretty much the same as MongoDB's (see the docs, they just call the helper field a nonce and you call it _doc_ver).

You can accomplish this behavior by subclassing the Document and overriding its save method:

class NoConflictDoc(Document):
    name = StringField()
    _nonce = ObjectIdField()

    def save(self, *args, **kwargs):
        if not kwargs.get('save_condition'):
            kwargs['save_condition'] = {'_nonce': self._nonce}
            self._nonce = ObjectId()
        return super(NoConflictDoc, self).save(*args, **kwargs)

Now SaveConditionError will be raised any time you call save on an outdated document:

In [24]: ncd = NoConflictDoc.objects.create(name='John')

In [25]: ncd.name = 'Jane'

In [26]: ncd.save()
Out[26]: <NoConflictDoc: NoConflictDoc object>

In [27]: # In another shell:

In [28]: # ncd = NoConflictDoc.objects.get(name='Jane')

In [29]: # ncd.name = 'Jill'

In [30]: # ncd.save()

In [31]: ncd.name = 'Fail'

In [32]: ncd.save()
(...skipped the traceback for brevity...)
SaveConditionError: Race condition preventing document update detected

Unfortunately, that doesn't make the document completely foolproof because you can still introduce conflicting updates via update and modify. Perhaps we should introduce the save_condition param on those methods as well...

I'm not 100% convinced we should make NoConflictDoc part of the codebase (especially given that it doesn't cover all the bases), but perhaps we could make a tutorial about this. What do you think?

(cc @tsx for a second opinion)

mikeckennedy commented 7 years ago

Hi @wojcikstefan, thanks for the consideration on this feature.

It turns out this is much harder than you indicate in your steps unless I'm doing something wrong (possible, it's early).

Consider these two classes:

class OptimisticDocument(mongoengine.Document):
    _doc_ver = mongoengine.StringField(default=lambda: str(uuid.uuid4()))

class Record(OptimisticDocument):
    name = mongoengine.StringField()

Seems totally straightforward. But running this code results in an error:

ValueError: Document OptimisticDocument may not be subclassed

That's why I did all those copy / paste / adapt things.

As far as whether this is something that should be adopted given you can't easily 100% guard via update, I think it should be part of the library. One, because the "can't be subclassed error" and two just that it needs a strong warning that you must go through the doc to get the benefits + add a save_condition option to update.

This would help a lot of people do the right thing, at least most of the time. They could still drop to pymongo and skip the safety but having a safe option in the ODM seems like a solid feature to me.

wojcikstefan commented 7 years ago

Seems totally straightforward. But running this code results in an error: (...)

Hi @mikeckennedy, you just need to add meta = {'allow_inheritance': True} to the OptimisticDocument and it will work. See this for more information: http://docs.mongoengine.org/guide/defining-documents.html#document-inheritance.

That said, I consider your confusion a bug on MongoEngine's side, because the error message could be more helpful, for instance ValueError: Document OptimisticDocument may not be subclassed. To enable inheritance, change its allow_inheritance meta attribute to True.. Does that sound good to you?

mikeckennedy commented 7 years ago

That's great. Thanks, it was totally non-obvious how to do this, exactly because the error seemed like a pure-python error.

I still think adding this class with overridden save method and added field would be worthwhile, especially if you added a save_condition to update.

wojcikstefan commented 7 years ago

Agreed that it's worth exploring. We just need to make sure there aren't any edge cases/inconsistencies that aren't covered or at least documented well in this new document class. If it can handle all the scenarios well and doesn't break/negatively affect any of the existing methods, then I'm fine adding it. But if it's not generic enough, it might actually be better to leave it to the developer to create a subclass that covers their specific requirements/constraints.

Definitely happy to review any PRs!