OpenAtomFoundation / pika

Pika is a Redis-Compatible database developed by Qihoo's infrastructure team.
BSD 3-Clause "New" or "Revised" License

rsync response error when upgrading from 3.3.6 to 3.5.3 #2740

Closed epubreader closed 2 weeks ago

epubreader commented 2 weeks ago

Is this a regression?

Yes

Description

I20240617 02:46:09.066272 297 pika_repl_client.cc:138] Try Send Meta Sync Request to Master (pika:9221)
I20240617 02:46:09.068120 250 pika_server.cc:442] Mark try connect finish
I20240617 02:46:09.068161 250 pika_repl_client_conn.cc:151] Finish to handle meta sync response
I20240617 02:46:09.239224 251 pika_repl_client_conn.cc:182] DB: db0 Need Wait To Sync
W20240617 02:46:09.266966 297 pika_rm.cc:510] ActivateRsync ...
W20240617 02:46:09.312813 303 rsync_client.h:224] rsync response error
W20240617 02:46:10.358121 303 rsync_client.h:224] rsync response error
W20240617 02:46:11.402184 303 rsync_client.h:224] rsync response error
W20240617 02:46:12.446231 303 rsync_client.h:224] rsync response error
W20240617 02:46:13.490192 303 rsync_client.h:224] rsync response error
W20240617 02:46:14.534222 303 rsync_client.h:224] rsync response error
W20240617 02:46:15.578155 303 rsync_client.h:224] rsync response error
W20240617 02:46:16.622195 303 rsync_client.h:224] rsync response error
W20240617 02:46:17.666244 303 rsync_client.h:224] rsync response error
W20240617 02:46:18.710255 303 rsync_client.h:224] rsync response error
W20240617 02:46:19.710453 297 rsync_client.cc:276] copy remote meta failed! error:IO error: kRsyncMeta request failed! db is not exist or doing bgsave
W20240617 02:46:19.710505 297 rsync_client.cc:67] RsyncClient recover failed
W20240617 02:46:21.272990 297 pika_rm.cc:969] Slave DB: db0 rsync failed! full synchronization will be retried later

I20240617 02:47:38.205807 247 client_thread.cc:222] Do cron task del fd 1553
I20240617 02:47:38.205933 247 pika_repl_client_thread.cc:36] ReplClient Timeout conn, fd=1553, ip_port=pika:11221
W20240617 02:47:38.205967 247 pika_repl_client_thread.cc:47] Master conn timeout : pika:11221 try reconnect
I20240617 02:47:48.043236 297 pika_repl_client.cc:138] Try Send Meta Sync Request to Master (pika:9221)
I20240617 02:47:48.045104 252 pika_server.cc:442] Mark try connect finish
I20240617 02:47:48.045133 252 pika_repl_client_conn.cc:151] Finish to handle meta sync response
I20240617 02:47:48.393524 253 pika_repl_client_conn.cc:182] DB: db0 Need Wait To Sync
W20240617 02:47:48.444372 297 pika_rm.cc:510] ActivateRsync ...
W20240617 02:47:48.486294 304 rsync_client.h:224] rsync response error
W20240617 02:47:49.530292 304 rsync_client.h:224] rsync response error
W20240617 02:47:50.574182 304 rsync_client.h:224] rsync response error
W20240617 02:47:51.618232 304 rsync_client.h:224] rsync response error
W20240617 02:47:52.662254 304 rsync_client.h:224] rsync response error
W20240617 02:47:53.706166 304 rsync_client.h:224] rsync response error
W20240617 02:47:54.750165 304 rsync_client.h:224] rsync response error
W20240617 02:47:55.794257 304 rsync_client.h:224] rsync response error
W20240617 02:47:56.838239 304 rsync_client.h:224] rsync response error
W20240617 02:47:57.882184 304 rsync_client.h:224] rsync response error
W20240617 02:47:58.882324 297 rsync_client.cc:276] copy remote meta failed! error:IO error: kRsyncMeta request failed! db is not exist or doing bgsave
W20240617 02:47:58.882375 297 rsync_client.cc:67] RsyncClient recover failed
W20240617 02:48:00.450968 297 pika_rm.cc:969] Slave DB: db0 rsync failed! full synchronization will be retried later

I20240617 02:49:20.265908 247 client_thread.cc:222] Do cron task del fd 1754
I20240617 02:49:20.266433 247 pika_repl_client_thread.cc:36] ReplClient Timeout conn, fd=1754, ip_port=pika:11221
W20240617 02:49:20.266463 247 pika_repl_client_thread.cc:47] Master conn timeout : pika:11221 try reconnect
I20240617 02:49:30.029565 297 pika_repl_client.cc:138] Try Send Meta Sync Request to Master (pika:9221)
I20240617 02:49:30.031545 254 pika_server.cc:442] Mark try connect finish
I20240617 02:49:30.031589 254 pika_repl_client_conn.cc:151] Finish to handle meta sync response
I20240617 02:49:30.174350 255 pika_repl_client_conn.cc:182] DB: db0 Need Wait To Sync
W20240617 02:49:30.230262 297 pika_rm.cc:510] ActivateRsync ...
W20240617 02:49:30.274298 305 rsync_client.h:224] rsync response error
W20240617 02:49:31.318192 305 rsync_client.h:224] rsync response error
W20240617 02:49:32.362076 305 rsync_client.h:224] rsync response error
W20240617 02:49:33.406157 305 rsync_client.h:224] rsync response error
W20240617 02:49:34.450140 305 rsync_client.h:224] rsync response error
W20240617 02:49:35.494212 305 rsync_client.h:224] rsync response error
W20240617 02:49:36.538182 305 rsync_client.h:224] rsync response error
W20240617 02:49:37.582218 305 rsync_client.h:224] rsync response error
W20240617 02:49:38.626216 305 rsync_client.h:224] rsync response error
W20240617 02:49:39.670192 305 rsync_client.h:224] rsync response error
W20240617 02:49:40.670375 297 rsync_client.cc:276] copy remote meta failed! error:IO error: kRsyncMeta request failed! db is not exist or doing bgsave
W20240617 02:49:40.670440 297 rsync_client.cc:67] RsyncClient recover failed
W20240617 02:49:42.236953 297 pika_rm.cc:969] Slave DB: db0 rsync failed! full synchronization will be retried later

Please provide a link to a minimal reproduction of the bug

No response

Screenshots or videos

No response

Please provide the version you discovered this bug in (check about page for version information)

No response

Anything else?

No response

epubreader commented 2 weeks ago

docker service create \
  --name pika-slave \
  --publish 9223:9221 \
  --mount type=bind,source=/opt/pika/conf/pika.slave.conf,target=/pika/conf/pika.conf \
  --mount type=bind,source=/opt/pika/db,target=/pika/db \
  --mount type=bind,source=/opt/pika/dump,target=/pika/dump \
  --mount type=bind,source=/opt/pika/log,target=/pika/log \
  --mount type=bind,source=/opt/pika/dbsync,target=/pika/dbsync \
  --network back \
  --constraint 'node.labels.name==mongodb2' \
  --mode global \
  --restart-condition on-failure \
  --with-registry-auth \
  pikadb/pika:3.5.3

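I am using Docker Swarm. When a new slave node is added, it performs a full synchronization, and the step that renames dbsync to db reports an error. Earlier versions did not have this problem.

--mount type=bind,source=/opt/epub/pika/log,target=/pika/log \
--mount type=bind,source=/opt/epub/pika/dbsync,target=/pika/dbsync \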

Issues-translate-bot commented 2 weeks ago

Bot detected the issue body's language is not English, translate it automatically.


docker service create \
  --name pika-slave \
  --publish 9223:9221 \
  --mount type=bind,source=/opt/epub/pika/conf/pika.slave.conf,target=/pika/conf/pika.conf \
  --mount type=bind,source=/opt/epub/pika/db,target=/pika/db \
  --mount type=bind,source=/opt/epub/pika/dump,target=/pika/dump \
  --mount type=bind,source=/opt/epub/pika/log,target=/pika/log \
  --mount type=bind,source=/opt/epub/pika/dbsync,target=/pika/dbsync \
  --network back \
  --constraint 'node.labels.name==mongodb2' \
  --mode global \
  --restart-condition on-failure \
  --with-registry-auth \
  pikadb/pika:3.5.3

I use Docker Swarm. When a slave node is added, it runs a full synchronization, and the step that renames dbsync to db reports an error. There was no problem with the previous version.

--mount type=bind,source=/opt/epub/pika/log,target=/pika/log \
--mount type=bind,source=/opt/epub/pika/dbsync,target=/pika/dbsync \

cheniujh commented 2 weeks ago

Adding back some content that was deleted by mistake: image

cheniujh commented 2 weeks ago


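Here each of Pika's main directories is mounted separately, which puts them on different file systems. This causes a number of problems. The first is your case: rename cannot cross file systems, so the full synchronization fails. The second is that when bgsave generates a dump, it previously only needed to create hard links within a single file system; across file systems, an actual copy takes place.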
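
To check whether a given deployment is affected, one option is to probe the rename directly. The sketch below is only an illustration and is not part of Pika: `can_rename_between` is a hypothetical helper name, and the paths in the usage note are the mount targets from the `docker service create` command earlier in this thread. It attempts a real rename of a temporary file rather than comparing device IDs, because on Linux `rename` fails with `EXDEV` across separate mount points even when both are backed by the same underlying file system.

```python
import errno
import os
import tempfile


def can_rename_between(src_dir: str, dst_dir: str) -> bool:
    """Return True if a file in src_dir can be renamed into dst_dir.

    Hypothetical helper for illustration; it creates a small probe file
    in src_dir and removes it again before returning.
    """
    fd, src_path = tempfile.mkstemp(prefix=".rename_probe_", dir=src_dir)
    os.close(fd)
    dst_path = os.path.join(dst_dir, os.path.basename(src_path))
    try:
        os.rename(src_path, dst_path)
    except OSError as exc:
        # The rename did not happen, so the probe is still in src_dir.
        os.unlink(src_path)
        if exc.errno == errno.EXDEV:
            # "Invalid cross-device link": the two directories sit on
            # different mount points, so rename() cannot move between them.
            return False
        raise
    # The rename succeeded; clean up the probe in its new location.
    os.unlink(dst_path)
    return True
```

Assuming Python is available in an environment with the same mounts, `can_rename_between("/pika/dbsync", "/pika/db")` would be expected to return `False` for the layout described above. The same `EXDEV` restriction applies to hard links (`os.link`), which is consistent with bgsave falling back to a copy.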