mmucklo / DtcQueueBundle

Symfony2/3/4/5 Queue Bundle (for background jobs) supporting Mongo (Doctrine ODM), Mysql (and any Doctrine ORM), RabbitMQ, Beanstalkd, Redis, and ... {write your own}
MIT License
120 stars 38 forks source link

MongoDuplicateKeyException E11000 duplicate key error collection: jobs.queue_run index: _id_ dup key: { : ObjectId('5b355cbc58004200011ed5a2') } #85

Closed kralos closed 6 years ago

kralos commented 6 years ago

mmucklo/DtcQueueBundle 4.7.1 symfony/symfony 3.4.12 doctrine/mongodb 1.6.2 doctrine/mongodb-odm 1.2.4 alcaeus/mongo-php-adapter 1.1.5 ext-mongodb 1.4.3 libbson 1.9.4 MongoDB 3.6.3 PHP 7.2.6

Not sure how this is happening yet but i've seen it in multiple environments / servers etc.

I'm starting my runs from crontab like this:

*/5 *   *   *   *       /my-project/bin/console -v dtc:queue:run -d 300 my-worker-1

Problem seems to be hard to re-produce.

Exception trace (happened at Jun 28, 2018 10:10:04 PM UTC):

MongoDB\Driver\Exception\BulkWriteException: E11000 duplicate key error collection: jobs.queue_run index: _id_ dup key: { : ObjectId('5b355cbc58004200011ed5a2') }
#21 my-project/vendor/mongodb/mongodb/src/Operation/InsertMany.php(142): executeBulkWrite
#20 my-project/vendor/mongodb/mongodb/src/Operation/InsertMany.php(142): execute
#19 my-project/vendor/mongodb/mongodb/src/Collection.php(776): insertMany
#18 my-project/vendor/alcaeus/mongo-php-adapter/lib/Mongo/MongoCollection.php(345): batchInsert
#17 my-project/vendor/doctrine/mongodb/lib/Doctrine/MongoDB/Collection.php(1005): doBatchInsert
#16 my-project/vendor/doctrine/mongodb/lib/Doctrine/MongoDB/Collection.php(141): batchInsert
#15 my-project/vendor/doctrine/mongodb-odm/lib/Doctrine/ODM/MongoDB/Persisters/DocumentPersister.php(257): executeInserts
#14 my-project/vendor/doctrine/mongodb-odm/lib/Doctrine/ODM/MongoDB/UnitOfWork.php(1143): executeInserts
#13 my-project/vendor/doctrine/mongodb-odm/lib/Doctrine/ODM/MongoDB/UnitOfWork.php(438): commit
#12 my-project/vendor/doctrine/mongodb-odm/lib/Doctrine/ODM/MongoDB/DocumentManager.php(542): flush
#11 my-project/vendor/mmucklo/queue-bundle/ODM/CommonTrait.php(46): persist
#10 my-project/vendor/mmucklo/queue-bundle/Doctrine/DoctrineRunManager.php(101): persistRun
#9 my-project/vendor/mmucklo/queue-bundle/Manager/RunManager.php(126): runStart
#8 my-project/vendor/mmucklo/queue-bundle/Run/Loop.php(124): runLoop
#7 my-project/vendor/mmucklo/queue-bundle/Command/RunCommand.php(132): execute
#6 my-project/vendor/symfony/symfony/src/Symfony/Component/Console/Command/Command.php(251): run
#5 my-project/vendor/symfony/symfony/src/Symfony/Component/Console/Application.php(964): doRunCommand
#4 my-project/vendor/symfony/symfony/src/Symfony/Bundle/FrameworkBundle/Console/Application.php(86): doRunCommand
#3 my-project/vendor/symfony/symfony/src/Symfony/Component/Console/Application.php(248): doRun
#2 my-project/vendor/symfony/symfony/src/Symfony/Bundle/FrameworkBundle/Console/Application.php(74): doRun
#1 my-project/vendor/symfony/symfony/src/Symfony/Component/Console/Application.php(148): run
#0 my-project/bin/console(27): null
MongoDuplicateKeyException: E11000 duplicate key error collection: jobs.queue_run index: _id_ dup key: { : ObjectId('5b355cbc58004200011ed5a2') }
#19 my-project/vendor/alcaeus/mongo-php-adapter/lib/Alcaeus/MongoDbAdapter/ExceptionConverter.php(83): toLegacy
#18 my-project/vendor/alcaeus/mongo-php-adapter/lib/Mongo/MongoCollection.php(348): batchInsert
#17 my-project/vendor/doctrine/mongodb/lib/Doctrine/MongoDB/Collection.php(1005): doBatchInsert
#16 my-project/vendor/doctrine/mongodb/lib/Doctrine/MongoDB/Collection.php(141): batchInsert
#15 my-project/vendor/doctrine/mongodb-odm/lib/Doctrine/ODM/MongoDB/Persisters/DocumentPersister.php(257): executeInserts
#14 my-project/vendor/doctrine/mongodb-odm/lib/Doctrine/ODM/MongoDB/UnitOfWork.php(1143): executeInserts
#13 my-project/vendor/doctrine/mongodb-odm/lib/Doctrine/ODM/MongoDB/UnitOfWork.php(438): commit
#12 my-project/vendor/doctrine/mongodb-odm/lib/Doctrine/ODM/MongoDB/DocumentManager.php(542): flush
#11 my-project/vendor/mmucklo/queue-bundle/ODM/CommonTrait.php(46): persist
#10 my-project/vendor/mmucklo/queue-bundle/Doctrine/DoctrineRunManager.php(101): persistRun
#9 my-project/vendor/mmucklo/queue-bundle/Manager/RunManager.php(126): runStart
#8 my-project/vendor/mmucklo/queue-bundle/Run/Loop.php(124): runLoop
#7 my-project/vendor/mmucklo/queue-bundle/Command/RunCommand.php(132): execute
#6 my-project/vendor/symfony/symfony/src/Symfony/Component/Console/Command/Command.php(251): run
#5 my-project/vendor/symfony/symfony/src/Symfony/Component/Console/Application.php(964): doRunCommand
#4 my-project/vendor/symfony/symfony/src/Symfony/Bundle/FrameworkBundle/Console/Application.php(86): doRunCommand
#3 my-project/vendor/symfony/symfony/src/Symfony/Component/Console/Application.php(248): doRun
#2 my-project/vendor/symfony/symfony/src/Symfony/Bundle/FrameworkBundle/Console/Application.php(74): doRun
#1 my-project/vendor/symfony/symfony/src/Symfony/Component/Console/Application.php(148): run
#0 my-project/bin/console(27): null

After the exception happened I looked at the Live Runs page:

Id Started At Duration Last Heartbeat At Max Count Processed Hostname Pid Process Timeout Current Job Id
5b355cbc58004200011ed5a2 2018-06-28T22:10:03+0000 300 2018-06-28T22:13:42+0000 5 ommited 1

Then the run got archived:

Id Ended At Elapsed Processed Hostname Pid
5b355cbc58004200011ed5a2 2018-06-28T22:15:03+0000 300.02460622787476 7 ommited 1
kralos commented 6 years ago

Okay I just found something very interesting, I located the jobs from run 5b355cbc58004200011ed5a2 and all of them have a different worker name to this exception. I suspect 2 runs fired with exactly the same id and 1 failed to start but another one worked. I'm actually running this in cron:

*/5 *   *   *   *       /my-project/bin/console -v dtc:queue:run -d 300 my-worker-1
*/5 *   *   *   *       /my-project/bin/console -v dtc:queue:run -d 300 my-worker-2
*/5 *   *   *   *       /my-project/bin/console -v dtc:queue:run -d 300 my-worker-3
*/5 *   *   *   *       /my-project/bin/console -v dtc:queue:run -d 300 my-worker-4

project / worker names obfuscated for privacy.

The above exception happened on my-worker-1 but ALL jobs were processed by my-worker-2. No other jobs were processed in the 300 second time window.

kralos commented 6 years ago

I suspect this issue is caused by http://php.net/manual/en/class.mongoid.php using purely time based Id generation.

kralos commented 6 years ago

See https://jira.mongodb.org/browse/PHP-166

Actually I just realised we're using alcaeus/mongo-php-adapter 1.1.5 so MongoId generation is handled by alcaeus/mongo-php-adapter, not the legacy driver...

DtcQueueBundle uses:

/**
 * @Grid\Column(sortable=true, order=1)
 * @ODM\Id
 */
protected $id;

Which in DoctrineODM means use AUTO - Uses the native generated MongoId.

DoctrineODM calls:

public function generate(DocumentManager $dm, $document)
{
    return new \MongoId();
}

Which alcaeus/mongo-php-adapter provides:

class MongoId implements Serializable, TypeInterface, JsonSerializable
{
    public function __construct($id = null)
    {
        $this->createObjectID($id);
    }
    /********/
    private function createObjectID($id)
    {
        try {
            if (is_string($id)) {
                $this->objectID = new ObjectID($id);
            } elseif ($id instanceof self || $id instanceof ObjectID) {
                $this->objectID = new ObjectID((string) $id);
            } else {
                $this->objectID = new ObjectID();
            }
        } catch (\Exception $e) {
            throw new MongoException('Invalid object ID', 19);
        }
    }
}

Since MongoId is constructed without a parameter, $this->objectID = new ObjectID(); is called.

Meaning the Id collision is happening inside ext-mongodb

ObjectId.c

static PHP_METHOD(ObjectId, __construct)
{
    php_phongo_objectid_t* intern;
    zend_error_handling    error_handling;
    char*                  id = NULL;
    phongo_zpp_char_len    id_len;

    zend_replace_error_handling(EH_THROW, phongo_exception_from_phongo_domain(PHONGO_ERROR_INVALID_ARGUMENT), &error_handling TSRMLS_CC);
    intern = Z_OBJECTID_OBJ_P(getThis());

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "|s!", &id, &id_len) == FAILURE) {
        zend_restore_error_handling(&error_handling TSRMLS_CC);
        return;
    }
    zend_restore_error_handling(&error_handling TSRMLS_CC);

    if (id) {
        php_phongo_objectid_init_from_hex_string(intern, id, id_len TSRMLS_CC);
    } else {
        php_phongo_objectid_init(intern);
    }
}

php_phongo_objectid_init(intern); is called when no params are passed.

static bool php_phongo_objectid_init(php_phongo_objectid_t* intern)
{
    bson_oid_t oid;

    intern->initialized = true;

    bson_oid_init(&oid, NULL);
    bson_oid_to_string(&oid, intern->oid);

    return true;
}

Id generation appears to happen in bson_oid_init in libbson

kralos commented 6 years ago

Proof of bug in or below mongodb/mongodb layer

test-lib-mongodb.php

#!/usr/bin/env php
<?php

// mongodb/mongodb
require __DIR__.'/vendor/autoload.php';

$client = new MongoDB\Client('mongodb://127.0.0.1');
$collection = $client->collisions->test;
$objectId = new MongoDB\BSON\ObjectId();
$collection->insertOne([
    '_id' => $objectId,
    'name' => 'mongodb/mongodb',
]);

crontab

*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php
*  *    *   *   *       /test/test-lib-mongodb.php

(There are 20 entries)

wait for the next minute

mongo cli

> use collisions
switched to db collisions
> db.test.find().count();
0
> db.test.find().count();
17

I have 3 emails from cron errors like this:

Fatal error: Uncaught MongoDB\Driver\Exception\BulkWriteException: E11000 duplicate key error collection: collisions.test index: _id_ dup key: { : ObjectId('5b3590c2cb6caa00012c2fa2') } in /test/vendor/mongodb/mongodb/src/Operation/InsertOne.php on line 114

MongoDB\Driver\Exception\BulkWriteException: E11000 duplicate key error collection: collisions.test index: _id_ dup key: { : ObjectId('5b3590c2cb6caa00012c2fa2') } in /test/vendor/mongodb/mongodb/src/Operation/InsertOne.php on line 114

Call Stack:
0.0001 389728 1. {main}() /test/test-lib-mongodb.php:0
0.1888 879936 2. MongoDB\Collection->insertOne() /test/test-lib-mongodb.php:12
0.2000 896712 3. MongoDB\Operation\InsertOne->execute() /test/vendor/mongodb/mongodb/src/Collection.php:799
0.2001 897392 4. MongoDB\Driver\Server->executeBulkWrite() /test/vendor/mongodb/mongodb/src/Operation/InsertOne.php:114

PHP Fatal error: Uncaught MongoDB\Driver\Exception\BulkWriteException: E11000 duplicate key error collection: collisions.test index: _id_ dup key: { : ObjectId('5b3590c2cb6caa00012c2fa2') } in /test/vendor/mongodb/mongodb/src/Operation/InsertOne.php:114
Stack trace:
#0 /test/vendor/mongodb/mongodb/src/Operation/InsertOne.php(114): MongoDB\Driver\Server->executeBulkWrite('collisions.test', Object(MongoDB\Driver\BulkWrite), Array)
#1 /test/vendor/mongodb/mongodb/src/Collection.php(799): MongoDB\Operation\InsertOne->execute(Object(MongoDB\Driver\Server))
#2 /test/test-lib-mongodb.php(12): MongoDB\Collection->insertOne(Array)
#3 {main}
thrown in /test/vendor/mongodb/mongodb/src/Operation/InsertOne.php on line 114
kralos commented 6 years ago

I also tried this at the ext-mongodb layer but i cannot reproduce it.

#!/usr/bin/env php
<?php

// ext-mongodb
$manager = new MongoDB\Driver\Manager('mongodb://127.0.0.1');
$bulk = new MongoDB\Driver\BulkWrite();
$objectId = new MongoDB\BSON\ObjectId();
$bulk->insert([
    '_id' => $objectId,
    'name' => 'ext-mongodb',
]);
$manager->executeBulkWrite('collisions.test', $bulk);
kralos commented 6 years ago

Since I've proven this is not an issue in DtcQueueBundle I'm closing this ticket and re-opening on mongodb/mongodb

kralos commented 6 years ago

https://github.com/mongodb/mongo-php-library/issues/556