yinqiwen / ardb

A Redis-protocol-compatible NoSQL store. It supports multiple storage engines as backends, such as Google's LevelDB, Facebook's RocksDB, OpenLDAP's LMDB, PerconaFT, WiredTiger, and ForestDB.
BSD 3-Clause "New" or "Revised" License

Replication doesn't behave like Redis #42

Closed: pedigree closed this issue 10 years ago

pedigree commented 10 years ago

I've just done some testing with blank databases (rm of the ./data and ./repl folders) and slaveof 10.0.1.120:6379 in ardb.conf. When I start ardb-server, I would expect to see a full sync with the master (in my case, the master is a read-only slave of an upstream master), but ardb issues a PSYNC and gets only a very small part of the master's data, which is 2 GB.

The ardb console shows:

[97718] 04-27 10:38:23,706 INFO Init storage engine success.
[97718] 04-27 10:38:23,707 WARN No zookeeper servers specified, zookeeper agent would not start.
[97718] 04-27 10:38:23,707 INFO Server started, Ardb version 0.7.1
[97718] 04-27 10:38:23,707 INFO The server is now ready to accept connections on port 2222
[97718] 04-27 10:38:23,709 INFO [Slave]Remote master is a Redis 2.8.7 instance, support partial sync:1
[97718] 04-27 10:38:23,710 INFO Send PSYNC 8e7595dbe06494799fe6b38fd875e718426bb9c6 1 \n
[97718] 04-27 10:38:23,710 INFO Recv psync reply:FULLRESYNC 57b1d385ee78fd25afa5eba471834d4e30c7f6f3 33246672 
[97718] 04-27 10:38:41,071 INFO [Slave]Remote master is a Redis 2.8.7 instance, support partial sync:1 
[97718] 04-27 10:38:41,071 INFO Send PSYNC 57b1d385ee78fd25afa5eba471834d4e30c7f6f3 33246672 
[97718] 04-27 10:38:41,072 INFO Recv psync reply:CONTINUE 
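
For readers following the log above, here is a minimal Python sketch of how a slave might classify the two PSYNC replies seen there. The names `PsyncReply`, `parse_psync_reply`, and `needs_rdb_transfer` are hypothetical and are not taken from the ardb source. The point it illustrates: a FULLRESYNC reply means the whole dataset follows as an RDB payload, so reconnecting with the new run id and offset before that payload has been received turns the full sync into a CONTINUE that only carries recent commands.

```python
from typing import NamedTuple, Optional


class PsyncReply(NamedTuple):
    kind: str               # "FULLRESYNC" or "CONTINUE"
    run_id: Optional[str]   # master run id (FULLRESYNC only)
    offset: Optional[int]   # replication offset (FULLRESYNC only)


def parse_psync_reply(line: str) -> PsyncReply:
    """Parse the status line a Redis 2.8 master sends in reply to PSYNC."""
    text = line.strip()
    if text.startswith("+"):
        text = text[1:]
    parts = text.split()
    if not parts:
        raise ValueError("empty PSYNC reply")
    if parts[0] == "FULLRESYNC":
        if len(parts) != 3:
            raise ValueError(f"malformed FULLRESYNC reply: {line!r}")
        return PsyncReply("FULLRESYNC", parts[1], int(parts[2]))
    if parts[0] == "CONTINUE":
        return PsyncReply("CONTINUE", None, None)
    raise ValueError(f"unexpected PSYNC reply: {line!r}")


def needs_rdb_transfer(reply: PsyncReply) -> bool:
    # After FULLRESYNC the master sends the whole dataset as an RDB payload;
    # after CONTINUE it only streams the commands the slave missed.
    return reply.kind == "FULLRESYNC"
```

Applied to the log above, the first reply requires an RDB transfer and the second one does not, which matches the small amount of data that actually arrived.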

A network capture shows the following (10.0.1.120 is the read-only master, 10.0.1.50 is the ardb server):

###
T 10.0.1.50:32067 -> 10.0.1.120:6379 [A]
  ......
#
T 10.0.1.50:32067 -> 10.0.1.120:6379 [AP]
  info Server..
##
T 10.0.1.120:6379 -> 10.0.1.50:32067 [AP]
  $412..# Server..redis_version:2.8.7..redis_git_sha1:00000000..redis_git_dirty:0..redis_build_id:4e504c85df90d66d
  ..redis_mode:standalone..os:Linux 3.14.0-031400-generic x86_64..arch_bits:64..multiplexing_api:epoll..gcc_versio
  n:4.8.1..process_id:1040..run_id:57b1d385ee78fd25afa5eba471834d4e30c7f6f3..tcp_port:6379..uptime_in_seconds:3906
  55..uptime_in_days:4..hz:10..lru_clock:1447474..config_file:/etc/redis/6379.conf....
#
T 10.0.1.50:32067 -> 10.0.1.120:6379 [A]
  ......
#
T 10.0.1.50:32067 -> 10.0.1.120:6379 [AP]
  replconf listening-port 2222..
#
T 10.0.1.120:6379 -> 10.0.1.50:32067 [AP]
  +OK..
#
T 10.0.1.50:32067 -> 10.0.1.120:6379 [AP]
  psync 8e7595dbe06494799fe6b38fd875e718426bb9c6 1..
#
T 10.0.1.120:6379 -> 10.0.1.50:32067 [AP]
  +FULLRESYNC 57b1d385ee78fd25afa5eba471834d4e30c7f6f3 33246672..
#
T 10.0.1.50:32067 -> 10.0.1.120:6379 [A]
  ......
##
T 10.0.1.50:32067 -> 10.0.1.120:6379 [AF]
  ......
####
T 10.0.1.50:32151 -> 10.0.1.120:6379 [A]
  ......
#
T 10.0.1.50:32151 -> 10.0.1.120:6379 [AP]
  info Server..
##
T 10.0.1.120:6379 -> 10.0.1.50:32151 [AP]
  $412..# Server..redis_version:2.8.7..redis_git_sha1:00000000..redis_git_dirty:0..redis_build_id:4e504c85df90d66d
  ..redis_mode:standalone..os:Linux 3.14.0-031400-generic x86_64..arch_bits:64..multiplexing_api:epoll..gcc_versio
  n:4.8.1..process_id:1040..run_id:57b1d385ee78fd25afa5eba471834d4e30c7f6f3..tcp_port:6379..uptime_in_seconds:3906
  72..uptime_in_days:4..hz:10..lru_clock:1447476..config_file:/etc/redis/6379.conf....
#
T 10.0.1.50:32151 -> 10.0.1.120:6379 [A]
  ......
#
T 10.0.1.50:32151 -> 10.0.1.120:6379 [AP]
  replconf listening-port 2222..
#
T 10.0.1.120:6379 -> 10.0.1.50:32151 [AP]
  +OK..
#
T 10.0.1.50:32151 -> 10.0.1.120:6379 [AP]
  psync 57b1d385ee78fd25afa5eba471834d4e30c7f6f3 33246672..
#
T 10.0.1.120:6379 -> 10.0.1.50:32151 [AP]
  +CONTINUE..
#
T 10.0.1.120:6379 -> 10.0.1.50:32151 [A]
  .*2..$6..SELECT..$1..0..*4..$4..HSET..$2......$14....{..R....J.....$6....\S.@..*4..$4..HSET..$2......$14..N....5
  oR..$i5...$6....\S....*4..$4..HSET..$2......$4..nY....$6....\SN...*4..$4..HSET..$2......$14..B.Wg..7.YI..$h..$6.
  ...\SG...*4..$4..HSET..$2..=...$14.............@x,..$6....\S_...*4..$4..HSET..$2..l...$14..I.l8.......).s..$6...
  .\S....*4..$4..HSET..$2......$14....VW/./..f......$6....\S.2..*4..$4..HSET..$2..L...$4....'...$6....\S....*4..$4
  ..HSET..$2...&..$4.....0..$6....\SN...*4..$4..HSET..$2..V^..$14..v...$;.*.u C....$6....\SC...*4..$4..HSET..$2...
  ...$14....?...H8.....7..$6....\SI...*4..$4..HSET..$2......$14..A..._.......l(..$6....\S....*4..$4..HSET..$2.._..
  .$4..x+....$6....\S....*4..$4..HSET..$2......$14..a5...T...l...a..$6....\S....*4..$4..HSET..$2...|..$14..|H...9]
  /;..^....$6....\S....*4..$4..HSET..$2..W\..$14...@u......Wt..X..$6....\S....*4..$4..HSET..$2...7..$4..;:qz..$6..
  ..\S....*4..$4..HSET..$2......$14....GF.M..62......$6....\S....*4..$4..HSET..$2...=..$4........$6....\Ss...*4..$
  4..HSET..$2..;K..$14..<...s.N..4.. 7..$6....\S....*4..$4..HSET..$2......$4.....:..$6....\S....*4..$4..HSET..$2..
  ....$14...pcy^.6.........$6....\S3...*4..$4..HSET..$2......$14..x.s. ....C.R.i..$6....\S....*4..$4..HSET..$2..PO
  ..$4..._....$6....\SM...*4..$4..HSET..$2......$14......]6..*...D...$6....\S....*4..$4..HSET..$2..=...$14..S&....
  ..}r......$6....\S(...*4..$4..HSET..$2......$14..p.{.nnm....rH...$6....\S....*4..$4..HSET..$2.._...$4..nY5...$6.
  ...\S&...*4..$4..HSET..$2......$14..D_..3Q.....;....$6....\S....*4..$4..HSET..$2..Wq..$4..<..j..$6....\SY...*4..
  $4..HSET..$2..>t..$14.....T............$6....\S....*4..$4..HSET..$2...j..$14..P.n...Ms...Ji...$6....\S....*4..$4
  ..HSET..$2..X...$4........$6....\S....*4..$4..HSET..$2..*1..$14....'....\.q\..t..$6....\S?...*4..$4..HSET..$2...
  ...$14..(aT."TJ.....oz..$6....\S....*4..$4..HSET..$2...!..$14..fx].kP..?L.4I...$6....\S....*4..$4..HSET..$2.....
  .$14....'.G..\ .U..<..$6....\Sz...*4..$4..HSET..$2..:^..$14............L{.k..$6....\S....*4..$4..HSET..$2......$
  4....$]..$6....\S....*4..$4..HSET..$2......$4..nR....$6....\S....*4..$4..HSET..$2..<|..$14..$G..............$6..
  ..\S.f..*4..$4..HSET..$2..9...$14....B7U7.r5.<.....$6....\S....*4..$4..HSET..$2../...$4........$6....\S/...*4..$
  4..HSET..$2...S..$14...3...M..&..0....$6....\S....*4..$4..HSET..$2...x..$4..x+....$6....\SY/..*4..$4..HSET..$2..
  ....$14..1.l.0"..........$6....\S....*4..$4..HSET..$2......$14......).....j?....$6....\S....*4..$4..HSET..$2....
  ..$4...&>...$6....\S....*4..$4..HSET..$2..D...$14..C}t.F#..8."..c..$6....\S....*4..$4..HSET..$2..Q...$14......pD
  .jM2.;....$6....\S....*4..$4..HSET..$2......$14....Gs...y........$6....\S*...*4..$4..HSET..$2...K..$14..x.pzm.9r
  0.......$6....\S....*4..$4..HSET..$2..J_..$14...`aB....92.+....$6....\S....*4..$4..HSET..$2..!Y..$14..q..c.2....
  .J!...$6....\S....*4..$4..HSET..$2.../..$4..[..v..$6....\S....*4..$4..HSET..$2...4..$14....J....Sd....c..$6....\
  Sy...*4..$4..HSET..$2...c..$4..u.g...$6....\S....*4..$4..HSET..$2......$14..-..prk..3..x.y..$6....\S.7..*4..$4..
  HSET..$2..+...$4..x%.-..$6....\Sb...*4..$4..HSET..$2...$..$14....V...!Ij.......$6....\S....*4..$4..HSET..$2..;].
  .$14..?..J..2.... X...$6....\S....*4..$4..HSET..$2......$4...&>...$6....\S....*4..$4..HSET..$2..EC..$4...,.g..$6
  ....\S.i..*4..$4..HSET..$2..%...$14..A(.5Yq..^.}.....$6....\S....*4..$4..HSET..$2...'..$14.."Y.Q.Q....q.|b..$6..
  ..\S....*4..$4..HSET..$2..j...$14.....zn..;....}~..$6....\S....*4..$4..HSET..$2......$14......c..d........$6....
  \S....*4..$4..HSET..$2......$14...=..".?.2..C.H..$6....\S....*4..$4..HSET..$2......$4.....c..$6....\Sz...*4..$4.
  .HSET..$2..FJ..$14....'...."U!^nD...$6....\S....*4..$4..HSET..$2......$14...7......[.@]....$6....\S....*4..$4..H
  SET..$2......$4...M....$6....\S....*4..$4..HSET..$2../O..$4.....P..$6....\S....*4..$4..HSET..$2......$14....IL&.
  .',.2..=..$6....\S....*4..$4..HSET..$2...H..$14........k.........$6....\Sh...*4..$4..HSET..$2...Z..$14......&...
  ..|h....$6....\S....*4..$4..HSET..$2..\C..$4...)....$6....\S.C..*4..$4..HSET..$2...g..$14.......E...Iw.....$6...
  .\S.c..*4..$4..HSET..$2......$14......`.?.5..l.#..$6....\S....*4..$4..HSET..$2...r..$14...|.KTQ8.gH.>k;..$6....\
  S....*4..$4..HSET..$2......$14..<,I..z$I.....U..$6....\S....*4..$4..HSET..$2......$4...&>...$6....\S....*4..$4..
  HSET..$2...|..$4........$6....\S."..*4..$4..HSET..$2......$4...&>...$6....\S....*4..$4..HSET..$2..5...$14..{.^L.
  +;A0s.R.3..$6....\So...*4..$4..HSET..$2......$14...m.~...%.'......$6....\S!...*4..$4..HSET..$2......$4...,....$6
  ....\S....*4..$4..HSET..$2......$14..'...*.ais~H..r..$6....\S<...*4..$4..HSET..$2..{...$14..CnN....*D.......$6..
  ..\S....*4..$4..HSET..$2......$4..x+....$6....\S."..*4..$4..HSET..$2..FJ..$14....'...."U!^nD...$6....\S....*4..$
  4..HSET..$2..e...$14...7[K.|Z....{....$6....\S....*4..$4..HSET..$2../O..$4.....P..$6....\S....*4..$4..HSET..$2..
  .X..$14...(F....}...7.T..$6....\S. ..*4..$4..HSET..$2......$14..'...*.ais~H..r..$6....\S=...*4..$4..HSET..$2..#.
  ..$14.......E%D.....T..$6....\S....*4..$4..HSET..$2......$4..x+....$6....\S."..*4..$4..HSET..$2..@...$14...a.R#r
  ...n..Vg..$6....\S....*4..$4..HSET..$2...v..$4....%...$6....\Sq...*4..$4..HSET..$2......$14..a5...T...l...a..$6.
  ...\S....*4..$4..HSET..$2......$14..Kn.e4...F.".....$6....\Sw...*4..$4..HSET..$2......$14..9.6...s`...).}..$6...
  .\S....*4..$4..HSET..$2...{..$14...%.|.....'..L...$6....\S....*4..$4..HSET..$2......$4..po....$6....\S....*4..$4
  ..HSET..$2......$14..@.6.......;L.#..$6....\S....*4..$4..HSET..$2......$4.....:..$6....\S....*4..$4..HSET..$2...
  H..$14.......".IM.i..B..$6....\S!...*4..$4..HSET..$2......$14..(V...n.....J.[..$6....\S....*4..$4..HSET..$2..V^.
  .$14..v...$;.*.u C....$6....\SD...*4..$4..HSET..$2......$14..@g..u.C.Ie1.....$6....\S"...*4..$4..HSET..$2...=..$
  4........$6....\St...*4..$4..HSET..$2...1..$14.....1..y..r7}....$6....\S*...*4..$4..HSET..$2......$14......%....
  .5:.)..$6....\S.`..*4..$4..HSET..$2..U...$14..fW.Mo.c.....!$..$6....\S ...*4..$4..HSET..$2..)}..$14..+....[.h_}.
  ..Z..$6....\S....*4..$4..HSET..$2..!...$4.....*..$6....\S....*4..$4..HSET..$2...h..$4.....V..$6....\S. ..*4..$4.
  .HSET..$2......$14...n..a@....}..N..$6....\S....*4..$4..HSET..$2...b..$14..j..u.vWs..[.e%..$6....\S....*4..$4..H
  SET..$2..D;..$14..A*zAu*.I'.......$6....\S....*4..$4..HSET..$2..a$..$14......O:z5.H5x90..$6....\S....*4..$4..HSE
  T..$2...@..$4........$6....\S....*4..$4..HSET..$2......$4........$6....\S....*4..$4..HSET..$2......$14........5.
  ...`....$6....\S....*4..$4..HSET..$2......$14..,u,[........,...$6....\S....*4..$4..HSET..$2..E/..$14..0...e....;
  X.....$6....\S....*4..$4..HSET..$2..<i..$4..e.(...$6....\S....*4..$4..HSET..$2..m...$4...,....$6....\S~Z..*4..$4
  ..HSET..$2..l...$4.....j..$6....\S.n..*4..$4..HSET..$2..s...$14...>,..B..:w9l....$6....\S....*4..$4..HSET..$2...
  !..$14....v.._..cro5....$6....\S....*4..$4..HSET..$2..O...$4..nR.G..$6....\S....*4..$4..HSET..$2...[..$14....1[.
  .|..O!Fo...$6....\SQ...*4..$4..HSET..$2..j...$14.....zn..;....}~..$6....\S....*4..$4..HSET..$2..j...$14.....zn..
  ;....}~..$6....\S....*4..$4..HSET..$2...L..$14...T...m.&.g..{...$6....\S....*4..$4..HSET..$2..<...$14..@.....}`T
  .ieA<..$6....\S....*4..$4..HSET..$2......$4...M....$6....\S....*4..$4..HSET..$2......$4...M....$6....\S....*4..$
  4..HSET..$2......$14....2a.....#......$6....\S....*4..$4..HSET..$2......$14.....7Fc.J%.F..K..$6....\S....*4..$4.
  .HSET..$2..,L..$14..Q.W~P}a..t.~.I..$6....\S....*4..$4..HSET..$2......$14........4o$...D...$6....\S....*4..$4..H
  SET..$2...c..$14......bei.........$6....\S....*4..$4..HSET..$2......$14...>..W....8/.....$6....\S ...*4..$4..HSE
  T..$2.._...$4....$I..$6....\Sz...*4..$4..HSET..$2......$4..jx....$6....\SW...*4..$4..HSET..$2...v..$4........$6.
  ...\S....*4..$4..HSET..$2..Z...$14...w...02../+../..$6....\Sl...*4..$4..HSET..$2......$4...X.e..$6....\Sh...*4..
  $4..HSET..$2..O...$14.. ...|.....f.....$6....\SU...*4..$4..HSET..$2...}..$14...E..............$6....\S,...*4..$4
  ..HSET..$2......$4..po....$6....\S....*4..$4..HSET..$2...>..$14..h..0...;........$6....\S....*4..$4..HSET..$2..Q
  _..$14......d".E...z....$6....\S....*4..$4..HSET..$2......$4.....:..$6....\S'...*4..$4..HSET..$2..@...$14...W.7.
  ...._*.`<..$6....\Sd...*4..$4..HSET..$2..VL..$14..`Y........j.....$6....\S....*4..$4..HSET..$2..UW..$4..x+.Q..$6
  ....\S....*4..$4..HSET..$2..f...$14......3.8p.J..t...$6....\Sy...*4..$4..HSET..$2..E...$14.............:_?..$6..
  ..\S....*4..$4..HSET..$2...'..$14.."Y.Q.Q....q.|b..$6....\S....*4..$4..HSET..$2......$14..(aT."TJ.....oz..$6....
  \S....*4..$4..HSET..$2......$4..po....$6....\S.,..*4..$4..HSET..$2...0..$14.......4.2...._4..$6....\S0`..*4..$4.
  .HSET..$2......$14.....dj.J.V..f4...$6....\S....*4..$4..HSET..$2..;...$14..#*..p.M..G .....$6....\S....*4..$4..H
  SET..$2......$4.....c..$6....\S{...*4..$4..HSET..$2..\...$4..u.....$6....\S....*4..$4..HSET..$2......$14..vd....
  ..Y...\...$6....\S....*4..$4..HSET..$2../...$4........$6....\S0...*4..$4..HSET..$2......$14...L.c.. ..._.mt..$6.
  ...\S....*4..$4..HSET..$2..v...$14....m.......H.....$6....\S....*4..$4..HSET..$2..]F..$4...X.b..$6....\S#...*4..
#
T 10.0.1.120:6379 -> 10.0.1.50:32151 [AP]
  $4..HSET..$2......$14.. ....p%4.....1..$6....\S....*4..$4..HSET..$2......$14..".....).v...p=..$6....\S....*4..$4
  ..HSET..$2...V..$14...I@`oOC.?....9..$6....\Sb...*4..$4..HSET..$2......$4...c w..$6....\S....*4..$4..HSET..$2...
  ~..$14..]...S.....M.!K..$6....\S....*4..$4..HSET..$2......$4........$6....\S....*4..$4..HSET..$2...X..$14...(F..
  ..}...7.T..$6....\S. ..*4..$4..HSET..$2......$14...%....^dHZk..(..$6....\S(...*4..$4..HSET..$2...v..$4....%...$6
  ....\Ss...*4..$4..HSET..$2......$14...MB6,V..V[Db.6..$6....\S....*4..$4..HSET..$2......$14...MPL..-...ab.s..$6..
  ..\S....*4..$4..HSET..$2...f..$4..x%....$6....\S....*4..$4..HSET..$2......$14......{G...A.Y.I..$6....\S1...*4..$
  4..HSET..$2......$14.....?....a.......$6....\S....*4..$4..HSET..$2...K..$4..nR.%..$6....\S....*4..$4..HSET..$2..
  .X..$14...(F....}...7.T..$6....\S. ..*4..$4..HSET..$2......$14...`..E.......|)..$6....\S)...*4..$4..HSET..$2...v
  ..$4....%...$6....\Ss...*4..$4..HSET..$2......$14..n.X...4....f:K..$6....\SM...*4..$4..HSET..$2..%...$14..+r....
  =.Z.......$6....\S<...*4..$4..HSET..$2..+...$14..00L.G....h......$6....\S....*4..$4..HSET..$2..b...$4..x!....$6.
  ...\S....*4..$4..HSET..$2..'...$14..........j....9..$6....\Sf...*4..$4..HSET..$2......$14..i.....&.q..`....$6...
  .\S!...*4..$4..HSET..$2......$4..nZ>S..$6....\Sq...*4..$4..HSET..$2...p..$14...I..$.......w...$6....\S....*4..$4
  ..HSET..$2......$4....}...$6....\S....*4..$4..HSET..$2...&..$14..D..V3pLR.z].@...$6....\S$...*4..$4..HSET..$2...
  ...$14..R....P..Z.=k.-..$6....\S....*4..$4..HSET..$2......$14..a5...T...l...a..$6....\S....*4..$4..HSET..$2.....
  .$14..cW....^~...E....$6....\So...*4..$4..HSET..$2......$14...A....%......H..$6....\S....*4..$4..HSET..$2..i...$
  14....:.fs.....<.S..$6....\S....*4..$4..HSET..$2......$4.....:..$6....\S....*4..$4..HSET..$2..1c..$4..x!.[..$6..
  ..\S....*4..$4..HSET..$2...t..$14..R..H}.C.J....%..$6....\S.c..*4..$4..HSET..$2......$14..'...*.ais~H..r..$6....
  \S>...*4..$4..HSET..$2......$14..s...x...8U.m....$6....\S....*4..$4..HSET..$2..S...$4........$6....\S....*4..$4.
  .HSET..$2..U...$14...+... [..w..6...$6....\S....*4..$4..HSET..$2..D...$4..x+.1..$6....\S. ..*4..$4..HSET..$2..}.
  ..$14...$.W..,"(gYzh]..$6....\S....*4..$4..HSET..$2......$14....B...E..[.k}e..$6....\S....*4..$4..HSET..$2..xG..
  $4...).z..$6....\S.y..*4..$4..HSET..$2..E`..$14...P@.'.A..U.t.Y..$6....\SH...*4..$4..HSET..$2...}..$4..po....$6.
  ...\S....*4..$4..HSET..$2......$14...)v........9....$6....\Sr...*4..$4..HSET..$2...~..$4..^..x..$6....\S....*4..
  $4..HSET..$2..<...$14.....h.%....-.....$6....\S....*4..$4..HSET..$2..b'..$14......5Lz\M...z...$6....\S....*4..$4
  ..HSET..$2......$14......%.....5:.)..$6....\S `..*4..$4..HSET..$2..S*..$4..q.....$6....\S....*4..$4..HSET..$2...
  ...$14...Q.l...9..[R....$6....\S....*4..$4..HSET..$2...h..$4.....V..$6....\S. ..*4..$4..HSET..$2...B..$14..{.N.P
  W....,.O...$6....\S....*4..$4..HSET..$2......$14..cW....^~...E....$6....\Sp...*4..$4..HSET..$2...X..$14...(F....
  }...7.T..$6....\S. ..*4..$4..HSET..$2...'..$14......b.W...k.2O..$6....\S....*4..$4..HSET..$2......$4..u.. ..$6..
  ..\S....*4..$4..HSET..$2..-...$14....#.L2...'......$6....\S ...*4..$4..HSET..$2..b ..$14...u..7..@...\....$6....
  \S....*4..$4..HSET..$2..1c..$4..x!.[..$6....\S....*4..$4..HSET..$2...v..$4....%...$6....\St...*4..$4..HSET..$2..
  ....$14...k~......~U..9..$6....\S....*4..$4..HSET..$2..\6..$14.......cTv...=rJ..$6....\S....*4..$4..HSET..$2....
  ..$4...&>...$6....\S....
#
T 10.0.1.50:32151 -> 10.0.1.120:6379 [A]
  ......
#
T 10.0.1.120:6379 -> 10.0.1.50:32151 [AP]
  *4..$4..HSET..$2...:..$14...#......9.g.....$6....\SS...
#
T 10.0.1.50:32151 -> 10.0.1.120:6379 [A]
  ......
#
T 10.0.1.120:6379 -> 10.0.1.50:32151 [AP]
  *4..$4..HSET..$2......$14....'.G..\ .U..<..$6....\S{...
#
T 10.0.1.120:6379 -> 10.0.1.50:32151 [AP]
  *4..$4..HSET..$2..C...$14..T.v..u.#.8...?..$6....\S....*4..$4..HSET..$2......$14..+.l...W....INV..$6....\S....*4
  ..$4..HSET..$2......$4..x%....$6....\SC...*4..$4..HSET..$2...x..$14....e.....e..h3z..$6....\Sf;..
#
T 10.0.1.50:32151 -> 10.0.1.120:6379 [A]
  ......
#
T 10.0.1.120:6379 -> 10.0.1.50:32151 [AP]
  *4..$4..HSET..$2..>...$4..po....$6....\S....

pedigree commented 10 years ago

Leaving it running for a while results in incomplete datasets and further timeouts that I don't see on redis-server slaves:

[97718] 04-27 11:03:14,279 WARN Master connection timeout.
[97718] 04-27 11:03:15,281 INFO [Slave]Remote master is a Redis 2.8.7 instance, support partial sync:1
[97718] 04-27 11:03:15,281 INFO Send PSYNC 57b1d385ee78fd25afa5eba471834d4e30c7f6f3 35180800
[97718] 04-27 11:03:15,281 INFO Recv psync reply:CONTINUE
[97718] 04-27 11:05:14,328 INFO now = 1398596714, ping_recved_time=1398596654
[97718] 04-27 11:05:14,328 WARN Master connection timeout.
[97718] 04-27 11:05:15,330 INFO [Slave]Remote master is a Redis 2.8.7 instance, support partial sync:1
[97718] 04-27 11:05:15,330 INFO Send PSYNC 57b1d385ee78fd25afa5eba471834d4e30c7f6f3 35337105
[97718] 04-27 11:05:15,330 INFO Recv psync reply:CONTINUE
[97718] 04-27 11:07:14,368 INFO now = 1398596834, ping_recved_time=1398596774
[97718] 04-27 11:07:14,368 WARN Master connection timeout.
[97718] 04-27 11:07:15,369 INFO [Slave]Remote master is a Redis 2.8.7 instance, support partial sync:1
[97718] 04-27 11:07:15,370 INFO Send PSYNC 57b1d385ee78fd25afa5eba471834d4e30c7f6f3 35830301
[97718] 04-27 11:07:15,370 INFO Recv psync reply:CONTINUE

yinqiwen commented 10 years ago

It seems there is a bug when ardb's slave connection is closed while doing a full resync with a Redis server.

yinqiwen commented 10 years ago

As for the timeout error, what are the values of 'repl-ping-slave-period' and 'repl-timeout' in the Redis instance's config?

yinqiwen commented 10 years ago

I've committed a fix for the 'incomplete sync' bug.

pedigree commented 10 years ago

Replication isn't working yet, but the values are repl-ping-slave-period 60 and repl-timeout 120.

ardb-server logs this when starting the process (with slaveof configured)

[135915] 04-27 16:04:47,951 INFO Init storage engine success.
[135915] 04-27 16:04:47,952 WARN No zookeeper servers specified, zookeeper agent would not start.
[135915] 04-27 16:04:47,952 INFO Server started, Ardb version 0.7.1
[135915] 04-27 16:04:47,952 INFO The server is now ready to accept connections on port 2222
[135915] 04-27 16:04:47,955 INFO [Slave]Remote master is a Redis 2.8.7 instance, support partial sync:1
[135915] 04-27 16:04:47,955 INFO Send PSYNC 7a28f80b06f98ae619676a9c7ffee5ae02d5b2fa 1
[135915] 04-27 16:04:47,956 INFO Recv psync reply:FULLRESYNC 57b1d385ee78fd25afa5eba471834d4e30c7f6f3 56336513

The Redis master log shows this for the connection:

[1040] 27 Apr 16:09:47.804 * Slave asks for synchronization
[1040] 27 Apr 16:09:47.804 * Partial resynchronization not accepted: Runid mismatch (Client asked for '7a28f80b06f98ae619676a9c7ffee5ae02d5b2fa', I'm '57b1d385ee78fd25afa5eba471834d4e30c7f6f3')
[1040] 27 Apr 16:09:47.804 * Starting BGSAVE for SYNC
[1040] 27 Apr 16:09:47.826 * Background saving started by pid 220447
[220447] 27 Apr 16:10:07.820 * DB saved on disk
[220447] 27 Apr 16:10:07.838 * RDB: 31 MB of memory used by copy-on-write
[1040] 27 Apr 16:10:07.911 * Background saving terminated with success

ngrep shows the following. Only a small amount of replication happens, and only after the process has segfaulted. I see about 120 KB of compressed data before replication stops. The ardb-server process segfaults and dumps core before it sees the REDIS0006 TCP stream.

# ngrep -d eth1 host 10.0.1.50 and port 6379
interface: eth1 (10.0.1.0/255.255.255.0)
filter: (ip or ip6) and ( host 10.0.1.50 and port 6379 )
#
T 10.0.1.50:53245 -> 10.0.1.120:6379 [AR]
  ......
###
T 10.0.1.50:53737 -> 10.0.1.120:6379 [A]
  ......
#
T 10.0.1.50:53737 -> 10.0.1.120:6379 [AP]
  info Server..
##
T 10.0.1.120:6379 -> 10.0.1.50:53737 [AP]
  $412..# Server..redis_version:2.8.7..redis_git_sha1:00000000..redis_git_dirty:0..redis_build_id:4e504c
  85df90d66d..redis_mode:standalone..os:Linux 3.14.0-031400-generic x86_64..arch_bits:64..multiplexing_a
  pi:epoll..gcc_version:4.8.1..process_id:1040..run_id:57b1d385ee78fd25afa5eba471834d4e30c7f6f3..tcp_por
  t:6379..uptime_in_seconds:410579..uptime_in_days:4..hz:10..lru_clock:1449466..config_file:/etc/redis/6
  379.conf....
#
T 10.0.1.50:53737 -> 10.0.1.120:6379 [A]
  ......
#
T 10.0.1.50:53737 -> 10.0.1.120:6379 [AP]
  replconf listening-port 2222..
#
T 10.0.1.120:6379 -> 10.0.1.50:53737 [AP]
  +OK..
#
T 10.0.1.50:53737 -> 10.0.1.120:6379 [AP]
  psync 7a28f80b06f98ae619676a9c7ffee5ae02d5b2fa 1..
#
T 10.0.1.120:6379 -> 10.0.1.50:53737 [AP]
  +FULLRESYNC 57b1d385ee78fd25afa5eba471834d4e30c7f6f3 56987551..
#
T 10.0.1.50:53737 -> 10.0.1.120:6379 [A]
  ......
#
T 10.0.1.120:6379 -> 10.0.1.50:53737 [AP]
  .
#
T 10.0.1.50:53737 -> 10.0.1.120:6379 [A]
  ......
#
T 10.0.1.120:6379 -> 10.0.1.50:53737 [AP]
  $1175338732..
#
T 10.0.1.120:6379 -> 10.0.1.50:53737 [A]
  REDIS0006..........D....K...K... ..d........v..Fx6y.E..ZG9S...........K.;]..c&..o.[S..........h......H
  ....SNS. /.|r.4Wj.tcz..\%...T.R. ..?y........... .....R. ..B*d.M.`...f.dz../.fR. ......%"d.l...^...E..
  R. ...).JZa...0lPl...](.R. ..L,...s^....&U*..S..R. ...a..6#.....8.....f.R. ...#Cbc....Y..W......`.....
  K0...W.^.d....4.Q@..I......U..R......uz@......K....b@m....-.....1R. c...4..VO...Duw...m..@3..y.4.....R
  .ua.......Q@..O1.f}.G~....[F..:.MS@.....\.qo.2.d... ..G.`G..,.....x..)..-....?a;..GMA.A.5.O_........`_
  ..........bB.......w`G..K'.....Ty. ......x`....r$.".5........Vfy`...Ss.@.~>.6d.l........p9./k.Nt...(.:
  ....a#....).>!A.F.0..]...#{`G..A........./.0...Y...I..'....&.|(t /...|`/.a...*{.....ka...%.}`..2C.+W(.
  .H..:k........I..#.=..........$.}a..^..(.`...}.&.......Q. .......|..f..&....e..QB[....g.....T.C.......
  `w.`.T"t(..F..Ae....6.`......m....3.#%b..........L*...O.>.....r....#..'a....Pj.l....l.`G...9{jH.%..r .
  ....a.Q. .........~6.............x.Cf...H.d}. ..O....4\...`....1.x G.e.......a.W.\.Ca..U.../.`w...$.M.
  @....*.#K..E...4.*...&.q..-.. ....`/..8M...r[.?.........`......;F..a..N.$#....QD#.....h......e:#...s.`
  /..Y5l^>..................XV...f....i........N...G3.D16..?"....`G...|)...].X2..$...n...Ug...S..eZ.%X..
  ..f.`/...i..{.S....j............;....~'.^......`/. S~...... `../...;.`.....7.F....\.@K /.T...pA0r. U.+
  ....M...}...T....[x..%k.,......a..0...I._.#...O...8..`_./.G4...E.....1"......zI...Mb...@..J.......B'.'
  ..b.y.9..9#....`G.....#.....O.......`...u........

This continues for 13 x 9 KB packets, then stops.
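
To put the capture above in perspective, here is a small Python sketch that reads the two headers visible in it: the bulk length line (`$1175338732`) and the RDB magic (`REDIS0006`). The function names are hypothetical and this is only an illustration of the wire format, not ardb's parser. The bulk header announces about 1.1 GiB of RDB data, so 13 packets of roughly 9 KB amount to about 0.01% of the transfer.

```python
def parse_bulk_length(header: bytes) -> int:
    # A bulk header is "$<length>" followed by CRLF; return the payload length.
    if not header.startswith(b"$") or not header.endswith(b"\r\n"):
        raise ValueError(f"not a bulk header: {header!r}")
    return int(header[1:-2])


def parse_rdb_version(payload_start: bytes) -> int:
    # An RDB file starts with the 5-byte magic "REDIS" and a 4-digit version.
    if len(payload_start) < 9 or payload_start[:5] != b"REDIS":
        raise ValueError("payload does not start with an RDB header")
    return int(payload_start[5:9])


def transfer_progress(received: int, total: int) -> float:
    # Fraction of the announced payload that has arrived so far.
    return received / total


total = parse_bulk_length(b"$1175338732\r\n")
received = 13 * 9 * 1024  # the 13 x 9 KB packets seen before the stream stops
print(f"RDB version {parse_rdb_version(b'REDIS0006')}, "
      f"{transfer_progress(received, total):.4%} of {total} bytes received")
```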

The segfault happens after the process starts

warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[137412] 04-27 16:10:28,119 INFO Start init storage engine.
[New Thread 0x7fff373ff700 (LWP 137413)]
[137412] 04-27 16:10:28,123 INFO Init storage engine success.
[New Thread 0x7fff353fe700 (LWP 137414)]
[137412] 04-27 16:10:28,125 WARN No zookeeper servers specified, zookeeper agent would not start.
[137412] 04-27 16:10:28,125 INFO Server started, Ardb version 0.7.1
[137412] 04-27 16:10:28,125 INFO The server is now ready to accept connections on port 2222
[New Thread 0x7fff343ff700 (LWP 137415)]
[New Thread 0x7fff33bfe700 (LWP 137416)]
[137412] 04-27 16:10:28,128 INFO [Slave]Remote master is a Redis 2.8.7 instance, support partial sync:1
[137412] 04-27 16:10:28,128 INFO Send PSYNC 7a28f80b06f98ae619676a9c7ffee5ae02d5b2fa 1
[137412] 04-27 16:10:28,128 INFO Recv psync reply:FULLRESYNC 57b1d385ee78fd25afa5eba471834d4e30c7f6f3 56987551
[New Thread 0x7fff327ff700 (LWP 137417)]

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff791464d in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
(gdb) bt
#0  0x00007ffff791464d in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x00007ffff79741f9 in std::string::assign(std::string const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00000000004cff3d in operator= (__str=..., this=0x7fffffffe418) at /usr/include/c++/4.8/bits/basic_string.h:547
#3  ardb::Slave::HandleRedisReply (this=0x7fffffffe2c0, ch=0x7ffff67c8dc0, reply=...) at replication/slave.cpp:302
#4  0x00000000004a5c73 in HandleStreamEvent (e=..., ctx=..., this=<optimised out>)
    at ./channel/channel_upstream_handler.hpp:133
#5  ardb::ChannelPipeline::SendUpstream<ardb::codec::RedisMessage> (ctx=<optimised out>, e=..., this=<optimised out>)
    at channel/all_includes.hpp:89
#6  0x00000000004b2f3d in SendUpstream<ardb::MessageEvent<ardb::codec::RedisMessage> > (this=<optimised out>,
    this=<optimised out>, e=...) at channel/all_includes.hpp:164
#7  fire_message_received<ardb::codec::RedisMessage> (destructor=0x0, message=0x7fffffffd7e0, ctx=...)
    at ./channel/channel_helper.hpp:91
#8  ardb::codec::StackFrameDecoder<ardb::codec::RedisMessage>::CallDecode (this=0x7fffffffe300, context=...,
    channel=0x7ffff67c8dc0, cumulation=...) at ./channel/codec/stack_frame_decoder.hpp:102
#9  0x00000000004b353f in ardb::codec::StackFrameDecoder<ardb::codec::RedisMessage>::MessageReceived (this=0x7fffffffe300,
    ctx=..., e=...) at ./channel/codec/stack_frame_decoder.hpp:157
#10 0x0000000000505103 in HandleStreamEvent (e=..., ctx=..., this=<optimised out>)
    at ./channel/channel_upstream_handler.hpp:133
#11 ardb::ChannelPipeline::SendUpstream<ardb::Buffer> (ctx=<optimised out>, e=..., this=<optimised out>)
    at ./channel/all_includes.hpp:89
#12 0x0000000000506550 in SendUpstream<ardb::MessageEvent<ardb::Buffer> > (event=..., this=0x7ffff67c8df8)
    at ./channel/all_includes.hpp:128
#13 fire_message_received<ardb::Buffer> (destructor=0x0, message=0x7ffff67c8e28, channel=0x7ffff67c8dc0)
    at ./channel/channel_helper.hpp:83
#14 ardb::Channel::OnRead (this=0x7ffff67c8dc0) at channel/channel.cpp:500
#15 0x0000000000504f2d in ardb::Channel::IOEventCallback (eventLoop=<optimised out>, fd=<optimised out>,
    clientData=0x7ffff67c8dc0, mask=1) at channel/channel.cpp:50
#16 0x0000000000521870 in aeProcessEvents (eventLoop=eventLoop@entry=0x7ffff6439158, flags=flags@entry=3)
    at channel/redis/ae.c:429
#17 0x0000000000521b5b in aeMain (eventLoop=0x7ffff6439158) at channel/redis/ae.c:485
#18 0x00000000004a93de in ardb::ArdbServer::Start (this=this@entry=0x7fffffffde20, props=...) at ardb_server.cpp:843
#19 0x0000000000407fd8 in main (argc=<optimised out>, argv=<optimised out>) at main.cpp:110

yinqiwen commented 10 years ago

Did you run 'make clean' after updating the source from GitHub?

yinqiwen commented 10 years ago

For the timeout error, you should also change the 'repl-timeout' value to 120 in ardb.conf. Its default value is 60, which generates an error if no 'ping' is received from the master within 60 seconds.
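
The arithmetic behind this can be sketched in a few lines of Python. The function names are hypothetical, and the `>=` comparison is an assumption about how the slave checks for silence, not something confirmed from the ardb source. With the master pinging every 60 seconds (repl-ping-slave-period 60) and the slave timing out after 60 seconds, the values from the log line `now = 1398596714, ping_recved_time=1398596654` land exactly on the limit.

```python
def master_timed_out(now: int, ping_recved_time: int, repl_timeout: int) -> bool:
    # Assumption: the slave gives up once the silence reaches repl_timeout seconds.
    return now - ping_recved_time >= repl_timeout


def timeout_is_safe(ping_period: int, repl_timeout: int) -> bool:
    # The slave-side timeout should be strictly larger than the master's ping
    # period, otherwise a single on-time ping can already arrive too late.
    return repl_timeout > ping_period


# Values taken from the log in this thread.
now, last_ping = 1398596714, 1398596654
print(master_timed_out(now, last_ping, repl_timeout=60))   # default in ardb.conf
print(master_timed_out(now, last_ping, repl_timeout=120))  # suggested value
```

Under that assumption, the default of 60 trips the timeout while 120 leaves room for one missed ping.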

pedigree commented 10 years ago

It looks like an rm of the ardb folder and a complete pull from git has fixed the segfault. I'm testing the replication now. Is there a way to prioritize the loading of the RDB file into the storage engine, or does that not matter?

yinqiwen commented 10 years ago

Do you mean a Redis command to load an RDB file? There is an 'import' command for that:

redis> import /tmp/data/dump.ardb
OK

pedigree commented 10 years ago

The RDB (1.1 GB) is taking a very long time to load, almost an hour, and I was just wondering about key availability during that time. If there is no problem and replication works as well as testing is showing, then it looks like I'll be putting it into production on my website :)

pedigree commented 10 years ago

Replication has been working well, but I'm concerned about the time it takes to commit keys to file storage. My preferred engine, LMDB, can take hours to commit after the initial replication. Is there a way to speed up the commit to disk, and are keys available immediately after the RDB has started its disk commit?

yinqiwen commented 10 years ago

It could be made faster, but only by 10% or 20%, which is still very slow for a huge data set. The speed depends on the random write performance of the storage engine. The total number of write operations executed during replication is "number of keys + number of elements in all sets/lists/hashes/zsets".
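
As a rough illustration of that formula, here is a minimal Python sketch. The function names and the writes-per-second figure are hypothetical and only serve to show how the count grows with collection sizes. It is an estimate under those assumptions, not a measurement of any storage engine.

```python
from typing import Iterable


def estimate_write_ops(key_count: int, collection_sizes: Iterable[int]) -> int:
    # One write per key, plus one write per element stored in a
    # set, list, hash, or zset.
    return key_count + sum(collection_sizes)


def estimate_load_seconds(write_ops: int, writes_per_second: float) -> float:
    # writes_per_second is an assumed random-write rate of the storage engine.
    if writes_per_second <= 0:
        raise ValueError("writes_per_second must be positive")
    return write_ops / writes_per_second


# Example: 3 keys, two of which are hashes holding 2 and 5 fields.
ops = estimate_write_ops(3, [2, 5])
print(ops, estimate_load_seconds(ops, writes_per_second=5.0))
```

A dataset made mostly of large hashes therefore costs far more writes than its key count suggests, which is why the initial load can be slow even when the RDB file itself is modest in size.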

pedigree commented 10 years ago

While it's dumping to LMDB/RocksDB (etc.), are the keys from the RDB available to query?

On 11/08/2014 04:51, yinqiwen wrote:

It could be made faster, but only by 10% or 20%, which is still very slow for a huge data set. The speed depends on the random write performance of the storage engine. The total number of write operations executed during replication is "number of keys + number of elements in all sets/lists/hashes/zsets".

yinqiwen commented 10 years ago

Each key is available to query as soon as it has been loaded from the RDB, not only after the whole RDB has been loaded.