alibaba / MongoShake

MongoShake is a universal data replication platform based on MongoDB's oplog. Redundant replication and active-active replication are its two most important functions. It is a cluster replication tool based on MongoDB's oplog that covers migration and synchronization needs and further enables disaster recovery and active-active deployments.
GNU General Public License v3.0

Keep looping on error #258

Closed: kinesra75 closed this issue 4 years ago

kinesra75 commented 4 years ago

Hi,

MongoShake reports the error below and keeps looping on it, but the documents on both sides (master/slave) are identical for those particular keys. Because of this error the sync doesn't move forward; it gets stuck there.

[14:28:16 CET 2019/11/06] [CRIT] (mongoshake/executor.(*Executor).execute:97) Replayer-1, executor-1, oplog for namespace[ARIANELAB.user_history] op[u] failed. error type[*errors.errorString] error[doUpdate run upsert/update[true] failed[Updating the path 'b.0.mhc' would create a conflict at 'b.0.mhc']], logs number[256], firstLog: [{ts 6755765458168709606} {op u} {g } {ns ARIANELAB.user_history} {o [{$set [{b.0.mhc 3} {b.0.mjcdt 2019-11-05 10:20:04 +0000 UTC} {gmhc 3} {gmjcdt 2019-11-05 10:20:04 +0000 UTC}]} {$set [{b.0.mhc 3} {b.0.mjcdt 2019-11-05 10:20:04 +0000 UTC} {gmhc 3} {gmjcdt 2019-11-05 10:20:04 +0000 UTC}]}]} {o2 map[_id:ObjectIdHex("5bf2881fa4efa2c20838b369")]} {uk map[]} {lsid <nil>} {fromMigrate false}]

The master is 3.6.11 and "slave" 4.0.9 and mongoshake is 2.0.8

Do you know where this could come from?

Thanks, Seb.

vinllen commented 4 years ago

What's your source and target MongoDB version?

kinesra75 commented 4 years ago

The master is 3.6.11 and "slave" 4.0.9 and mongoshake is 2.0.8

vinllen commented 4 years ago

Could you show me the raw oplog in local.oplog.rs?

use local
db.oplog.rs.find({"o2": {"_id": ObjectId("5bf2881fa4efa2c20838b369")}})
kinesra75 commented 4 years ago

This is the oplog:

{ 
    "ts" : Timestamp(1572949220, 486), 
    "t" : NumberLong(3), 
    "h" : NumberLong(9151351386668080369), 
    "v" : NumberInt(2), 
    "op" : "u", 
    "ns" : "ARIANELAB.user_history", 
    "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), 
    "o2" : {
        "_id" : ObjectId("5bf2881fa4efa2c20838b369")
    }, 
    "wall" : ISODate("2019-11-05T11:20:20.941+0100"), 
    "o" : {
        "$v" : NumberInt(1), 
        "$set" : {
            "b.0.mhc" : NumberInt(3), 
            "b.0.mjcdt" : ISODate("2019-11-05T11:20:04.000+0100"), 
            "gmhc" : NumberInt(3), 
            "gmjcdt" : ISODate("2019-11-05T11:20:04.000+0100")
        }
    }
}
vinllen commented 4 years ago

Is there only one oplog entry? What you pasted doesn't look like the same oplog as the one in the previous error log.

vinllen commented 4 years ago

In your previous oplog, there are two $set entries in the array in the "o" field, which looks like an aggregation pipeline. However, that feature was only released in MongoDB 4.2. Update: I've tried several ways but can't produce an oplog with two $set entries; could you tell me how you produced this oplog?
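
(For reference: an update whose second argument is an array of stages is an aggregation-pipeline update, accepted only by MongoDB 4.2+, which is why an "o" field holding two $set entries is surprising on a 3.6 source. A minimal sketch of such an update follows; the collection name and values are hypothetical.)

// MongoDB 4.2+ only: the array form makes this an aggregation-pipeline update,
// the shape that would legitimately put multiple $set stages into one oplog entry.
db.test.updateOne(
    { _id: 1 },
    [
        { $set: { gmhc: 3 } },
        { $set: { gmjcdt: "$$NOW" } }
    ]
)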

kinesra75 commented 4 years ago

Hi Chen, this is the complete oplog for _id 5bf2881fa4efa2c20838b369:

{ "ts" : Timestamp(1572949220, 418), "t" : NumberLong(3), "h" : NumberLong(-8594416531214545863), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5bf2881fa4efa2c20838b369") }, "wall" : ISODate("2019-11-05T11:20:20.931+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "b.0.mhc" : NumberInt(2), "b.0.mjcdt" : ISODate("2019-11-05T11:20:08.000+0100"), "gmhc" : NumberInt(2), "gmjcdt" : ISODate("2019-11-05T11:20:08.000+0100") } } } { "ts" : Timestamp(1572949220, 486), "t" : NumberLong(3), "h" : NumberLong(9151351386668080369), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5bf2881fa4efa2c20838b369") }, "wall" : ISODate("2019-11-05T11:20:20.941+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "b.0.mhc" : NumberInt(3), "b.0.mjcdt" : ISODate("2019-11-05T11:20:04.000+0100"), "gmhc" : NumberInt(3), "gmjcdt" : ISODate("2019-11-05T11:20:04.000+0100") } } } { "ts" : Timestamp(1572972598, 127), "t" : NumberLong(3), "h" : NumberLong(1459695621046784463), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5bf2881fa4efa2c20838b369") }, "wall" : ISODate("2019-11-05T17:49:58.467+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.788" : { "csid" : NumberInt(619410), "cid" : NumberInt(111018), "prgid" : NumberInt(600), "rtg" : NumberInt(0), "rdt" : ISODate("2019-11-05T17:49:50.882+0100"), "edt" : ISODate("2019-11-08T17:49:50.882+0100") } } } } { "ts" : Timestamp(1572976889, 643), "t" : NumberLong(3), "h" : NumberLong(2688667894715830063), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5bf2881fa4efa2c20838b369") }, "wall" : ISODate("2019-11-05T19:01:29.602+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.789" : { "csid" : NumberInt(619409), "cid" : NumberInt(110998), "prgid" : NumberInt(1295), "rtg" : NumberInt(0), "rdt" : ISODate("2019-11-05T19:01:22.151+0100"), "edt" : ISODate("2019-11-08T19:01:22.151+0100") } } } } { "ts" : Timestamp(1573061923, 104), "t" : NumberLong(3), "h" : NumberLong(-4679696690908483765), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5bf2881fa4efa2c20838b369") }, "wall" : ISODate("2019-11-06T18:38:43.339+0100"), "o" : { "$v" : NumberInt(1), "$unset" : { "b.0.custom.vapid" : true }, "$set" : { "b.0.custom.vapid_bkp" : "BJd3ZjJadsnCuS84cVYn-eh8fATuA4HC-rb82ZNIiwqteHtIh2EHwayBv_ddH1AaMet_tU5p-k8de4QycxsNH14", "h.790" : { "csid" : NumberInt(619709), "cid" : NumberInt(110972), "prgid" : NumberInt(1211), "rtg" : NumberInt(0), "rdt" : ISODate("2019-11-06T18:38:23.323+0100"), "edt" : ISODate("2019-11-09T18:38:23.323+0100") } } } } { "ts" : Timestamp(1573076962, 22), "t" : NumberLong(3), "h" : NumberLong(-174267231994253759), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5bf2881fa4efa2c20838b369") }, "wall" : ISODate("2019-11-06T22:49:22.254+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "b.0.fhc" : NumberInt(2), "b.0.fjcdt" : ISODate("2019-11-06T22:49:14.000+0100"), "gfhc" : NumberInt(2), "gfjcdt" : ISODate("2019-11-06T22:49:14.000+0100") } } } { "ts" : Timestamp(1573115960, 320), "t" : NumberLong(3), "h" : 
NumberLong(7811364913536296306), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5bf2881fa4efa2c20838b369") }, "wall" : ISODate("2019-11-07T09:39:20.753+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.791" : { "csid" : NumberInt(619710), "cid" : NumberInt(110975), "prgid" : NumberInt(723), "rtg" : NumberInt(0), "rdt" : ISODate("2019-11-07T09:39:09.083+0100"), "edt" : ISODate("2019-11-10T09:39:09.083+0100") } } } } { "ts" : Timestamp(1573119238, 376), "t" : NumberLong(3), "h" : NumberLong(7480786926605835507), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5bf2881fa4efa2c20838b369") }, "wall" : ISODate("2019-11-07T10:33:58.944+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.792" : { "csid" : NumberInt(619795), "cid" : NumberInt(111072), "prgid" : NumberInt(682), "rtg" : NumberInt(0), "rdt" : ISODate("2019-11-07T10:33:50.231+0100"), "edt" : ISODate("2019-11-07T22:33:50.231+0100") } } } }

Thanks, Seb

kinesra75 commented 4 years ago

I see that the gap between the first two updates is only 10ms, and the error is on that update. Maybe a clue? :)

Regards, Seb

vinllen commented 4 years ago

The time (2019-11-05 10:20:04) in the error oplog in the MongoShake log:

[14:28:16 CET 2019/11/06] [CRIT] (mongoshake/executor.(*Executor).execute:97) Replayer-1, executor-1, oplog for namespace[ARIANELAB.user_history] op[u] failed. error type[*errors.errorString] error[doUpdate run upsert/update[true] failed[Updating the path 'b.0.mhc' would create a conflict at 'b.0.mhc']], logs number[256], firstLog: [{ts 6755765458168709606} {op u} {g } {ns ARIANELAB.user_history} {o [{$set [{b.0.mhc 3} {b.0.mjcdt 2019-11-05 10:20:04 +0000 UTC} {gmhc 3} {gmjcdt 2019-11-05 10:20:04 +0000 UTC}]} {$set [{b.0.mhc 3} {b.0.mjcdt 2019-11-05 10:20:04 +0000 UTC} {gmhc 3} {gmjcdt 2019-11-05 10:20:04 +0000 UTC}]}]} {o2 map[_id:ObjectIdHex("5bf2881fa4efa2c20838b369")]} {uk map[]} {lsid <nil>} {fromMigrate false}]

does not match what you gave later (11-05T11:20:08).

vinllen commented 4 years ago

Did you mix the $set and $setOnInsert operations? https://stackoverflow.com/questions/50947772/updating-the-path-x-would-create-a-conflict-at-x
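
(The linked answer boils down to two update operators targeting the same path in a single command. A minimal repro in the mongo shell follows; the collection name and values are hypothetical.)

// both operators touch 'b.0.mhc', so the server rejects the command with:
// "Updating the path 'b.0.mhc' would create a conflict at 'b.0.mhc'"
db.test.update(
    { _id: 1 },
    {
        $set:         { "b.0.mhc": 3 },
        $setOnInsert: { "b.0.mhc": 0 }
    },
    { upsert: true }
)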

kinesra75 commented 4 years ago

No, we don't mix those operations.

kinesra75 commented 4 years ago

Hi,

This is another log from MongoShake, plus the matching oplog, if it helps :+1:

[08:30:39 CET 2019/11/15] [CRIT] (mongoshake/executor.(*Executor).execute:97) Replayer-1, executor-1, oplog for namespace[ARIANELAB.user_history] op[u] failed. error type[*errors.errorString] error[doUpdate run upsert/update[true] failed[Updating the path 'h.400.oct' would create a conflict at 'h.400.oct']], logs number[19], firstLog: [{ts 6759213719906943404} {op u} {g } {ns ARIANELAB.user_history} {o [{$set [{h.400.oct 1} {h.400.odt 2019-10-15 08:34:22 +0000 UTC}]} {$set [{h.400.oct 1} {h.400.odt 2019-10-15 08:34:22 +0000 UTC}]}]} {o2 map[_id:ObjectIdHex("5cd75a47a4efa211b45de0d3")]} {uk map[]} {lsid <nil>} {fromMigrate false}]

and the associated oplog:

{ "ts" : Timestamp(1573713015, 107), "t" : NumberLong(3), "h" : NumberLong(-440790866935711729), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5cd75a47a4efa211b45de0d3") }, "wall" : ISODate("2019-11-14T07:30:15.083+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.398.oct" : NumberInt(1), "h.398.odt" : ISODate("2019-10-14T15:58:23.000+0200") } } } { "ts" : Timestamp(1573728020, 223), "t" : NumberLong(3), "h" : NumberLong(-2240993578532278145), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5cd75a47a4efa211b45de0d3") }, "wall" : ISODate("2019-11-14T11:40:20.565+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "b.1.mhc" : NumberInt(5), "b.1.mjc" : NumberInt(1), "b.1.mjcdt" : ISODate("2019-11-14T11:40:14.000+0100"), "gmhc" : NumberInt(8), "gmjc" : NumberInt(2) } } } { "ts" : Timestamp(1573752081, 428), "t" : NumberLong(3), "h" : NumberLong(-3382665026472535050), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5cd75a47a4efa211b45de0d3") }, "wall" : ISODate("2019-11-14T18:21:21.939+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.400.oct" : NumberInt(1), "h.400.odt" : ISODate("2019-10-15T10:34:22.000+0200") } } } { "ts" : Timestamp(1573755605, 103), "t" : NumberLong(3), "h" : NumberLong(70599509655983247), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5cd75a47a4efa211b45de0d3") }, "wall" : ISODate("2019-11-14T19:20:05.399+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.401.oct" : NumberInt(1), "h.401.odt" : ISODate("2019-10-15T11:30:55.000+0200") } } } { "ts" : Timestamp(1573756379, 320), "t" : NumberLong(3), "h" : NumberLong(-1034654860945191738), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5cd75a47a4efa211b45de0d3") }, "wall" : ISODate("2019-11-14T19:32:59.715+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "b.1.fhc" : NumberInt(3), "b.1.fjcdt" : ISODate("2019-11-14T19:32:38.000+0100"), "gfhc" : NumberInt(5), "gfjcdt" : ISODate("2019-11-14T19:32:38.000+0100") } } } { "ts" : Timestamp(1573764924, 637), "t" : NumberLong(3), "h" : NumberLong(2227487652791395282), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5cd75a47a4efa211b45de0d3") }, "wall" : ISODate("2019-11-14T21:55:24.897+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.402.oct" : NumberInt(1), "h.402.odt" : ISODate("2019-10-15T14:34:29.000+0200") } } } { "ts" : Timestamp(1573771347, 472), "t" : NumberLong(3), "h" : NumberLong(-6927610633294193173), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5cd75a47a4efa211b45de0d3") }, "wall" : ISODate("2019-11-14T23:42:27.665+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.403.oct" : NumberInt(1), "h.403.odt" : ISODate("2019-10-15T18:21:23.000+0200") } } } { "ts" : Timestamp(1573772514, 139), "t" : NumberLong(3), "h" : NumberLong(7126840334412444867), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : 
ObjectId("5cd75a47a4efa211b45de0d3") }, "wall" : ISODate("2019-11-15T00:01:54.944+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.506" : { "csid" : NumberInt(620441), "cid" : NumberInt(111156), "prgid" : NumberInt(682), "rtg" : NumberInt(0), "rdt" : ISODate("2019-11-15T00:01:43.620+0100"), "edt" : ISODate("2019-11-15T12:01:43.620+0100") } } } } { "ts" : Timestamp(1573789179, 52), "t" : NumberLong(3), "h" : NumberLong(2414235587838556238), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5cd75a47a4efa211b45de0d3") }, "wall" : ISODate("2019-11-15T04:39:39.095+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.406.oct" : NumberInt(1), "h.406.odt" : ISODate("2019-10-15T23:46:52.000+0200") } } } { "ts" : Timestamp(1573789188, 590), "t" : NumberLong(3), "h" : NumberLong(3677033406152972852), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5cd75a47a4efa211b45de0d3") }, "wall" : ISODate("2019-11-15T04:39:48.420+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.405.oct" : NumberInt(1), "h.405.odt" : ISODate("2019-10-15T23:46:52.000+0200") } } } { "ts" : Timestamp(1573808498, 316), "t" : NumberLong(3), "h" : NumberLong(-7898441827584094993), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5cd75a47a4efa211b45de0d3") }, "wall" : ISODate("2019-11-15T10:01:38.353+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "b.1.mhc" : NumberInt(6), "b.1.mjcdt" : ISODate("2019-11-15T10:00:53.000+0100"), "gmhc" : NumberInt(9), "gmjc" : NumberInt(1), "gmjcdt" : ISODate("2019-11-15T10:00:53.000+0100") } } } { "ts" : Timestamp(1573809999, 344), "t" : NumberLong(3), "h" : NumberLong(6598691881961740604), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5cd75a47a4efa211b45de0d3") }, "wall" : ISODate("2019-11-15T10:26:39.826+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.507" : { "csid" : NumberInt(620351), "cid" : NumberInt(111143), "prgid" : NumberInt(1173), "rtg" : NumberInt(0), "rdt" : ISODate("2019-11-15T10:26:18.896+0100"), "edt" : ISODate("2019-11-18T10:26:18.896+0100") } } } } { "ts" : Timestamp(1573814770, 479), "t" : NumberLong(3), "h" : NumberLong(-61922982790590641), "v" : NumberInt(2), "op" : "u", "ns" : "ARIANELAB.user_history", "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), "o2" : { "_id" : ObjectId("5cd75a47a4efa211b45de0d3") }, "wall" : ISODate("2019-11-15T11:46:10.878+0100"), "o" : { "$v" : NumberInt(1), "$set" : { "h.407.oct" : NumberInt(1), "h.407.odt" : ISODate("2019-10-16T04:49:14.000+0200") } } }

Thanks for your help!

Regards, Sébastien Boucard

vinllen commented 4 years ago

There are two $set operations in the oplog in the mongoshake log:

{ts 6759213719906943404}
{op u}
{g }
{ns ARIANELAB.user_history}
{o
    [
        {$set [
            {h.400.oct 1}
            {h.400.odt 2019-10-15 08:34:22 +0000 UTC}
        ]}
        {$set [
            {h.400.oct 1}
            {h.400.odt 2019-10-15 08:34:22 +0000 UTC}
        ]}
    ]
}
{o2 map[_id:ObjectIdHex("5cd75a47a4efa211b45de0d3")]}
{uk map[]}
{lsid <nil>}
{fromMigrate false}

But in the raw oplog, there is only one $set:

{
    "ts" : Timestamp(1573752081, 428), 
    "t" : NumberLong(3), 
    "h" : NumberLong(-3382665026472535050), 
    "v" : NumberInt(2), 
    "op" : "u", 
    "ns" : "ARIANELAB.user_history", 
    "ui" : UUID("5d081004-1cce-454b-b782-c8dfbc3c296d"), 
    "o2" : { "_id" : ObjectId("5cd75a47a4efa211b45de0d3") }, 
    "wall" : ISODate("2019-11-14T18:21:21.939+0100"), 
    "o" : {
        "$v" : NumberInt(1), 
        "$set" : {
            "h.400.oct" : NumberInt(1), 
            "h.400.odt" : ISODate("2019-10-15T10:34:22.000+0200")
        }
    }
} 

It looks like something is wrong.

vinllen commented 4 years ago

@Ars3nik Are you sure the mongoshake version is v2.0.8? Do you enable filter or transform in the configuration? Could you paste your configuration (collector.conf)?

kinesra75 commented 4 years ago

Hi @vinllen,

This is my configuration file, and yes, the version is 2.0.8. I use the filter white list, and I don't transform namespaces.

# this is the configuration of mongo-shake.
# if this is your first time using mongo-shake, you only need to configure the source
# mongodb address ('mongo_urls') and the target mongodb address ('tunnel.address').
# if you have any problem, please check the FAQ and wiki first: https://github.com/alibaba/MongoShake/wiki/FAQ

# ----------------------splitter----------------------

# connect source mongodb, set username and password if authentication is enabled.
# split by comma(,) if using multiple instances in one replica-set. E.g., mongodb://username1:password1@primaryA,secondaryB,secondaryC
# split by semicolon(;) if sharding is enabled. E.g., mongodb://username1:password1@primaryA,secondaryB,secondaryC;mongodb://username2:password2@primaryX,secondaryY,secondaryZ
# in password-free mode, the "username:password@" part can be omitted.
mongo_urls = mongodb://10.x.x.x:27017

# connect mode:
# primary: fetch data from the primary.
# secondaryPreferred: fetch data from a secondary if available, otherwise from the primary. (default)
# standalone: fetch data from the given single node, whether primary, secondary
# or hidden. only supported when the tunnel type is direct.
mongo_connect_mode = secondaryPreferred

# collector name; the id is used for the pid file and similar output.
collector.id = mongoshake

# sync mode: all/document/oplog. default is oplog.
# all means full synchronization + incremental synchronization.
# document means full synchronization only.
# oplog means incremental synchronization only.
sync_mode = oplog

# http api interface, used to monitor mongoshake.
# We also provide a restful tool named "mongoshake-stat" to
# print ack, lsn, checkpoint and qps information based on this api.
# usage: './mongoshake-stat --port=9100'
http_profile = 19100
# profiling port (net/http/pprof), used to inspect internal go stacks.
system_profile = 19200

# global log level: debug, info, warning, error. lower-level messages are filtered out.
log.level = info
# log directory. log and pid files are stored here.
# if not set, the default is "./logs/".
log.dir =
# log file name.
log.file = collector.log
# buffered or unbuffered logging. if set to true, logs are buffered for performance,
# but some logs may not be flushed on exit; if set to false, every log line is
# flushed immediately at a heavy performance cost (recommended when debugging).
log.buffer = true

# filter db or collection namespace. at most one of these two parameters can be given.
# if filter.namespace.black is not empty, the given namespaces are
# filtered out while all other namespaces pass.
# if filter.namespace.white is not empty, the given namespaces
# pass while all others are filtered out.
# all namespaces pass if neither condition is given.
# db and collection are joined by a dot(.).
# different namespaces are split by a semicolon(;).
# e.g., filterDbName1.filterCollectionName1;filterDbName2
# regular expressions are not supported yet.
filter.namespace.black =
filter.namespace.white = COMPANY.user_history;COMPANY.user_event_open;COMPANY.user_event;COMPANY.user_newsletter;COMPANY_2.user_history;COMPANY_2.user_event_open;COMPANY_2.user_event;COMPANY_2.user_newsletter
#filter.namespace.white = COMPANY.user_history;COMPANY_2.user_history

# some databases like "admin", "local", "mongoshake", "config", "system.views" are
# filtered by default; users can enable syncing of these databases for special needs.
# different databases are split by a semicolon(;).
# e.g., admin;mongoshake.
# pay attention: collections like "admin.xxx" aren't supported, except "system.views".
# under normal circumstances, configuring this parameter is not recommended.
filter.pass.special.db =

# this parameter is not supported in the current open-source version.
# oplog namespace and global id. other oplogs in the
# mongo cluster that have a distinct global id will
# be discarded. queries run without gid (fetching all
# oplogs) if no oplog.gid is set.
# gid is used in active-active replication to prevent circular replication; it is currently
# only available on Alibaba Cloud MongoDB. to enable gid for syncing between Alibaba Cloud
# instances, contact Alibaba Cloud after-sales or vinllen; for sharding, separate multiple
# gids with semicolons(;).
oplog.gids =

# [auto]       decide by whether the collections have a unique index:
#              use 'collection' if there is a unique index, otherwise use 'id'.
# [id]         shard by ObjectId. handle oplogs in sequence by unique _id.
# [collection] shard by ns. handle oplogs in sequence by unique ns.
# if there is no unique index, 'id' is recommended for very high sync performance;
# otherwise choose 'collection'.
shard_key = collection

# number of concurrent oplog transmit workers.
# if the source is sharding, the worker number must equal the number of shards.
worker = 64
# memory queue configuration, please see the FAQ document for more details.
# do not modify these variables if the current performance and resource usage
# meet your needs.
worker.batch_queue_size = 64
adaptive.batching_max_size = 2048
fetcher.buffer_capacity = 512
# batched oplogs carry a block-level checksum computed with the crc32
# algorithm, plus a compressor for the content of each oplog entry.
# supported compressors are: gzip, zlib, deflate.
# do not enable this option when the tunnel type is "direct".
worker.oplog_compressor = none

# tunnel pipeline type. currently supported: rpc, file, kafka, mock, direct.
tunnel = direct
# tunnel target resource url
# for rpc: the remote receiver socket address.
# for tcp: the remote receiver socket address.
# for file: the file path, for instance "data".
# for kafka: the topic and broker addresses split by comma, for
# instance: topic@brokers1,brokers2; the default topic is "mongoshake".
# for mock: this is useless.
# for direct: the target mongodb address, in the same format as 'mongo_urls'. If
# the target is sharding, this should be the mongos address.
# direct mode writes straight into MongoDB; the other modes are for analysis or
# long-distance transport and require a receiver to parse the data (see the FAQ document).
tunnel.address = mongodb://10.x.x.x:27017

# collector context storage, mainly for storing the checkpoint.
# the checkpoint itself is a 64-bit timestamp marking where fetching starts.
# type: database or api.
# for api storage, the address is an http url.
# for database storage, the address is a collection name, with db name "mongoshake" by default.
context.storage = database
# context.storage.url is only used to mark the checkpoint storage database.
# If the source mongodb type is sharding, the address should be the config server when MongoShake's
# version >= 1.5; otherwise, it is the replicaSet address.
# When the source mongodb type is replicaSet, the checkpoint is written into the source mongodb
# (db=mongoshake) by default if 'context.storage.url' is not set; otherwise, the checkpoint is
# written into this mongodb. E.g., mongodb://127.0.0.1:20070
# for a sharded source, it is written into the config-server (db=admin).
context.storage.url =
# checkpoint collection's name. change this collection name (or the database name above)
# if several mongoshake instances pull from the same source.
context.address = ckpt_default
# real checkpoint: the position to start fetching oplogs from.
# pay attention: this is UTC time, 8 hours behind CST (China Standard Time). this
# variable is only used when no checkpoint exists yet (see the storage location above);
# to force fetching from this position, delete the existing checkpoint first, see the FAQ.
context.start_position = 2000-01-01T00:00:01Z

# high availability option.
# if set to true, master election is enabled: only one mongoshake instance can become
# master and sync; the others wait, and at most one of them becomes master once the
# previous master dies. The master information is stored in the 'mongoshake' db in the
# source database by default.
# This option is useless when only one mongoshake instance is running.
master_quorum = false

# transform the source db or collection namespace into a dest db or collection namespace.
# at most one of these two parameters can be given.
# transform: fromDbName1.fromCollectionName1:toDbName1.toCollectionName1;fromDbName2:toDbName2
# enable with caution: it costs performance.
transform.namespace =
# if documents use dbref and namespaces are renamed, this must be set to true,
# but it decreases replication performance; enable with caution.
dbref = false

# ----------------------splitter----------------------
# if the tunnel type is direct, all the variables below should be set
# (parameters for writing into MongoDB).

# only transfer oplog commands for syncing, i.e. oplog.op in "i", "d", "u".
# if disabled, DDL is also transferred, e.g. create index, drop database, and
# transactions in mongodb 4.0.
# false enables DDL sync; this is not yet supported when the source is sharding.
# if the target is sharding, the applyOps command (including transactions) is not supported.
replayer.dml_only = true

# change an Update oplog into an Insert when the Update finds the target
# document does not exist (by _id or unique index).
replayer.executor.upsert = true
# change an Insert oplog into an Update when the Insert hits a duplicated key (_id or unique index).
replayer.executor.insert_on_dup_update = true
# where to record conflicting documents on write conflicts:
# db: write duplicated logs to the mongoshake_conflict db.
# sdk: write duplicated logs to the sdk.
replayer.conflict_write_to = none

# replayer durability mode. if set to false, oplogs are dropped without any
# action (only for debugging environments); otherwise they are written to
# the ${mongo_url} instance.
replayer.durable = true

# ----------------------splitter----------------------
# full synchronization configuration; see the wiki for details. if the target
# MongoDB is under too much pressure, lower the parameters below.

# the number of collections synced concurrently.
replayer.collection_parallel = 6

# the number of concurrent document writers per collection.
replayer.document_parallel = 16

# the number of documents per batch insert per document writer.
replayer.document_batch_size = 1024

# whether to drop a collection of the same name on the dest mongodb before full synchronization.
replayer.collection_drop = false

Regards, Seb.
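
(Aside: given context.storage = database and context.address = ckpt_default in the configuration above, the checkpoint can be inspected, and deleted to force a re-fetch from context.start_position, with a sketch like the following; the db and collection names come from the config comments, not from the thread itself.)

use mongoshake
db.ckpt_default.find()       // inspect the stored checkpoint timestamp
// db.ckpt_default.drop()    // remove it to restart from context.start_position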

Jonnter commented 4 years ago

Has this problem been solved? I've run into the same issue.

vinllen commented 4 years ago

@Jonnter Please answer the following questions.

vinllen commented 4 years ago

The problem will be solved in v2.4.4, see #345.