Closed dgelvin closed 10 years ago
This happens because our oplog entry is twice the size of the document. Right now, our oplog entries for updates store the entire pre-image of the document, which doubles the size of the oplog entry. Replacing a 9 MB document therefore produces an oplog entry of roughly 18 MB, which exceeds the 16 MB limit. That is a downside we are actively working on improving. In the meantime, do you know how the document is changing, and if so, can you use modifiers to change it? Either way, using modifiers should improve performance.
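To make the arithmetic concrete, here is a rough sketch of the size check implied above. It assumes the update oplog entry is about twice the document size, as described in this reply. The function names and the `overhead` parameter are hypothetical and are not TokuMX internals. The limit is the value printed in the error message in this issue.

```python
# Maximum BSON object size, taken from the error message in this issue.
MAX_BSON_OBJ_SIZE = 16793600


def estimated_update_oplog_size(doc_size, overhead=0):
    """Rough estimate in bytes: pre-image + new document + assumed overhead."""
    return 2 * doc_size + overhead


def full_replace_fits(doc_size, overhead=0):
    """Return True if a whole-document replace is estimated to fit in one oplog entry."""
    return estimated_update_oplog_size(doc_size, overhead) <= MAX_BSON_OBJ_SIZE


# Roughly half of the failing 18798961-byte entry reported in this issue.
print(full_replace_fits(18798961 // 2))    # False
# A 4 MiB document doubles to 8 MiB, which stays under the limit.
print(full_replace_fits(4 * 1024 * 1024))  # True
```

Under this estimate, any document larger than about 8 MB is at risk when replaced in full, even though the document itself is within the 16 MB limit.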
On Tue, Aug 5, 2014 at 1:30 PM, dgelvin notifications@github.com wrote:
I am running a cluster of two sharded replica sets with TokuMX 1.5 on Ubuntu 14.04 x86_64. I am sharding on the hashed _id of the collections.
I am using the pymongo 2.7.2 driver to access the mongos router.
When I attempt to overwrite a 9 MB document, TokuMX appears to be doubling the document size, resulting in an 18 MB document which fails to save:
    doc = db.collection.find_one({'_id': '53e0...'})
    db.collection.save(doc)
pymongo.errors.OperationFailure: BSONObj size: 18798961 (0x71D91E01) is invalid. Size must be between 0 and 16793600(16MB) First element: op: "u"
    doc = db.collection.find_one({'_id': '53e0...'})
    db.collection.update({'_id': doc['_id']}, doc)
pymongo.errors.OperationFailure: BSONObj size: 18798961 (0x71D91E01) is invalid. Size must be between 0 and 16793600(16MB) First element: op: "u"
However, if I simply retrieve the same 9 MB document and then remove it, I am able to save it successfully:
    doc = db.collection.find_one({'_id': '53e0...'})
    db.collection.remove({'_id': doc['_id']})
    db.collection.save(doc)
I am not actually modifying the document, simply reading it and attempting to replace it without modification. Obviously, for my real use case I would be making a modification, but this issue is reproducible without making any changes to the document.
I created a JIRA issue https://jira.mongodb.org/browse/PYTHON-745 regarding this problem, but they suggested this may be specific to TokuMX.
— Reply to this email directly or view it on GitHub https://github.com/Tokutek/mongo/issues/1190.
Thanks for the response; glad to understand what is happening. We will update our code to save only what is necessary.
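For anyone landing here later, a minimal sketch of what "save only what is necessary" could look like with modifiers, following the suggestion earlier in this thread. The helper `build_modifier` and the sample documents are hypothetical. It compares top-level fields only and emits a `$set`/`$unset` update instead of resending the whole document. Whether this keeps the oplog entry under the limit on a given TokuMX version is an assumption to verify.

```python
def build_modifier(old_doc, new_doc):
    """Build a $set/$unset update covering only the changed top-level fields."""
    to_set = {}
    for key, value in new_doc.items():
        if key == '_id':
            continue  # _id is immutable and is used in the query instead
        if key not in old_doc or old_doc[key] != value:
            to_set[key] = value
    # Fields present before but missing now are removed with $unset.
    to_unset = {key: "" for key in old_doc
                if key not in new_doc and key != '_id'}
    modifier = {}
    if to_set:
        modifier['$set'] = to_set
    if to_unset:
        modifier['$unset'] = to_unset
    return modifier


# Hypothetical documents: only 'name' changes and 'stale' is dropped.
old = {'_id': 'example-id', 'name': 'a', 'blob': 'x' * 10, 'stale': 1}
new = {'_id': 'example-id', 'name': 'b', 'blob': 'x' * 10}
modifier = build_modifier(old, new)

# With pymongo 2.7.x this could then be sent as (not executed here):
# if modifier:
#     db.collection.update({'_id': old['_id']}, modifier)
```

The `if modifier:` guard matters: passing an empty dict as the update document would be treated as a full replacement rather than a no-op.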