qibinghua opened this issue 9 years ago
Please paste your config file, plus everything else you can think of: what you did, what you saw, what you suspect. Don't hold anything back.
The config file is as follows:
# cat ssdb.conf
# ssdb-server config
# MUST indent by TAB!

# relative to path of this file, directory must exists
work_dir = ./var
pidfile = ./var/ssdb.pid

server:
	ip: 1.1.1.1
	port: 8888
	# bind to public ip
	#ip: 0.0.0.0
	# format: allow|deny: all|ip_prefix
	# multiple allows or denys is supported
	#deny: all
	#allow: 127.0.0.1
	#allow: 192.168

replication:
	slaveof:
		# to identify a master even if it moved(ip, port changed)
		# if set to empty or not defined, ip:port will be used.
		id: svc_2
		# sync|mirror, default is sync
		type: mirror
		ip: 2.2.2.2
		port: 8888

logger:
	level: info
	output: log.txt
	rotate:
		size: 1000000000

leveldb:
	# in MB
	cache_size: 500
	# in KB
	block_size: 32
	# in MB
	write_buffer_size: 64
	# in MB
	compaction_speed: 1000
	# yes|no
	compression: no
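The header comment warns that ssdb.conf MUST be indented by TAB; a space-indented key is silently misparsed. A small hypothetical lint sketch for that rule (not part of SSDB itself):

```python
def find_space_indented(text):
    """Return 1-based line numbers whose indentation uses spaces.

    ssdb.conf requires TAB-only indentation, so after stripping
    leading tabs a line must not begin with a space character.
    """
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.lstrip("\t")
        if stripped.startswith(" "):
            bad.append(lineno)
    return bad

conf = "server:\n\tip: 1.1.1.1\n    port: 8888\n"
print(find_space_indented(conf))  # → [3], the space-indented 'port' line
```

Running something like this before restarting the server catches the classic copy-paste mistake where an editor converts tabs to spaces.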
The two nodes are masters of each other, in mirror mode.
That's not enough information, please keep providing more.
Please provide the following information:
The whole DB runs on Aliyun servers, fairly high spec: 8 cores, 8 GB RAM, ephemeral disks. The Aliyun network occasionally jitters during peak hours.
Originally this was a dual-master setup, A1 and A2. Because A1 and A2 were low-spec machines with poor cloud-disk IO, we brought up two new machines, M1 and S1.
First we synced A1's data to M1 and decommissioned A1. Then we made M1 and A2 dual masters, with S1 as a slave. Later we decided A2 was unnecessary and decommissioned it too. Since then M1 keeps logging errors about failing to connect to A2, but the service has been running normally.
M1 and S1 have also been running normally, and S1 syncs data correctly.
Then, on a third machine, a scheduled script takes a dump backup from this slave every day starting at 3 a.m.
In between, SSDB was upgraded from 1.6.6 to 1.6.8.
WEB ----write----> M1 ----sync----> S1
WEB <-----------------read-------- S1
The main data stored: chat message history (hset), contact lists (zset), and "seen" records for posts (zset). The master sees roughly 1M-2M writes per day in total.
Because S1 had no monitoring, we don't know exactly when its disk filled up. From the logs, nothing more was written after around 4 a.m. on 2014-12-03, and S1 began refusing connections. Writes to M1 were still normal at that point.
After the earlier upgrade to 1.6.8 we hadn't changed SSDB itself or its configuration at all; the service ran as steadily as before.
We switched S1's reads over to M1 via a local hosts entry, copied out S1's log files, deleted the entire data directory, changed compaction_speed in S1's config, and restarted S1. After the restart, replication resumed normally.
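Since S1's disk filled up without anyone noticing, a minimal disk-usage check run from cron could have caught it before connections were refused. A sketch using only the Python standard library (the path and threshold are hypothetical, not from this setup):

```python
import shutil

def disk_alert(path, max_used_pct):
    """Return True when the filesystem holding `path` is more than
    max_used_pct percent full. Intended to run from cron and alert."""
    usage = shutil.disk_usage(path)
    used_pct = usage.used / usage.total * 100
    return used_pct > max_used_pct

# e.g. watch the SSDB data directory (path here is a placeholder)
if disk_alert(".", 90):
    print("disk almost full: check the slave before it starts refusing connections")
```

A check like this on S1 would have flagged the runaway disk usage days before the refused-connection symptom appeared.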
version
1.6.8.8
links
1
total_calls
300148
key_range.kv
"test" - "test20140722"
key_range.hash
"bl:u:1000030" - "wbsm:u"
key_range.zset
"bl:m:1000007" - "sv:u:999"
key_range.list
"" - ""
leveldb.stats
Compactions
Level  Files  Size(MB)  Time(sec)  Read(MB)  Write(MB)
------------------------------------------------------
    0      0         0        720         0      40269
    1      5       136       2159     93786      92960
    2     51      1578      10483    400010     391177
    3    491     15994       1309     50801      45835
    4    251      7937          0         0          0
17 result(s) (0.001 sec)
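The leveldb.stats table above can be summed to gauge how much data compaction has rewritten, which hints at why disk usage balloons. A small parsing sketch (the column layout is assumed from the output above):

```python
def total_compaction_write_mb(stats_lines):
    """Sum the Write(MB) column of an SSDB leveldb.stats table.

    Each data row has six numeric fields:
    level, files, size_mb, time_sec, read_mb, write_mb.
    """
    total = 0
    for line in stats_lines:
        fields = line.split()
        if len(fields) == 6 and all(f.isdigit() for f in fields):
            total += int(fields[-1])  # last column is Write(MB)
    return total

stats = """\
0 0 0 720 0 40269
1 5 136 2159 93786 92960
2 51 1578 10483 400010 391177
3 491 15994 1309 50801 45835
4 251 7937 0 0 0"""
print(total_compaction_write_mb(stats.splitlines()))  # → 570241
```

For this node that works out to roughly 557 GB of compaction writes against about 25 GB of live data, i.e. substantial write amplification.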
# ssdb-server config
# MUST indent by TAB!

# relative to path of this file, directory must exists
work_dir = /data1/ssdb
pidfile = /data1/ssdb/ssdb.pid

server:
	ip: 0.0.0.0
	port: 8001
	# bind to public ip
	#ip: 0.0.0.0
	# format: allow|deny: all|ip_prefix
	# multiple allows or denys is supported
	deny: all
	allow: 127.0.0.1
	allow: 10.

replication:
	binlog: yes
	# Limit sync speed to *MB/s, -1: no limit
	sync_speed: -1
	slaveof:
		# to identify a master even if it moved(ip, port changed)
		# if set to empty or not defined, ip:port will be used.
		#id: svc_2
		# sync|mirror, default is sync
		type: sync
		ip: 10.161.245.161
		port: 8001

logger:
	level: info
	output: /data1/logs/log.txt
	rotate:
		size: 1000000000

leveldb:
	# in MB
	cache_size: 2048
	# in KB
	block_size: 64
	# in MB
	write_buffer_size: 64
	# in MB
	compaction_speed: 1000
	# (I have since changed this value to 4096.)
	# yes|no
	compression: no
version
1.6.8.8
links
131
total_calls
9030140
key_range.kv
"test" - "test20140722"
key_range.hash
"bl:u:1000030" - "wbsm:u"
key_range.zset
"bl:m:1000007" - "sv:u:999"
key_range.list
"" - ""
leveldb.stats
Compactions
Level  Files  Size(MB)  Time(sec)  Read(MB)  Write(MB)
------------------------------------------------------
    0      0         0          2         0         35
    1      3       106         29        37        577
    2     62      1572         72      1060        996
    3    620     15984       4220     45044      44988
    4   1199     38110       8703     96820      96699
17 result(s) (0.001 sec)
# ssdb-server config
# MUST indent by TAB!

# relative to path of this file, directory must exists
work_dir = /data1/ssdb
pidfile = /data1/ssdb/ssdb.pid

server:
	ip: 0.0.0.0
	port: 8001
	# bind to public ip
	#ip: 0.0.0.0
	# format: allow|deny: all|ip_prefix
	# multiple allows or denys is supported
	deny: all
	allow: 127.0.0.1
	allow: 10.

replication:
	binlog: yes
	# Limit sync speed to *MB/s, -1: no limit
	sync_speed: -1
	slaveof:
		# to identify a master even if it moved(ip, port changed)
		# if set to empty or not defined, ip:port will be used.
		#id: svc_2
		# sync|mirror, default is sync
		#type: mirror
		#ip: 10.161.217.246
		#port: 8001

logger:
	level: info
	output: /data1/logs/log.txt
	rotate:
		size: 1000000000

leveldb:
	# in MB
	cache_size: 2048
	# in KB
	block_size: 64
	# in MB
	write_buffer_size: 64
	# in MB
	compaction_speed: 4196
	# yes|no
	compression: no
The config files are as follows. Master:
work_dir = ./var
pidfile = ./var/ssdb.pid

server:
	ip: 10.100.100.228
	port: 8888
	#ip: 0.0.0.0
	# format: allow|deny: all|ip_prefix
	# multiple allows or denys is supported
	#deny: all
	#allow: 127.0.0.1
	#allow: 192.168

replication:
	slaveof:
		# if set to empty or not defined, ip:port will be used.
		id: svc_2
		# sync|mirror, default is sync
		type: mirror
		ip: 10.100.100.229
		port: 8889

logger:
	level: info
	output: log.txt
	rotate:
		size: 1000000000

leveldb:
	cache_size: 500
	# in KB
	block_size: 32
	# in MB
	write_buffer_size: 64
	# in MB
	compaction_speed: 1000
	# yes|no
	compression: no
Slave:
work_dir = ./var
pidfile = ./var/ssdb.pid

server:
	ip: 10.100.100.229
	port: 8889
	#ip: 0.0.0.0
	# format: allow|deny: all|ip_prefix
	# multiple allows or denys is supported
	# deny: all
	# allow: 127.0.0.1
	# allow: 192.168

replication:
	slaveof:
		# if set to empty or not defined, ip:port will be used.
		id: svc_1
		# sync|mirror, default is sync
		type: mirror
		ip: 10.100.100.228
		port: 8888

logger:
	level: info
	output: log.txt
	rotate:
		size: 1000000000

leveldb:
	cache_size: 500
	# in KB
	block_size: 32
	# in MB
	write_buffer_size: 64
	# in MB
	compaction_speed: 1000
	# yes|no
	compression: no
The application writes through a single node. Both reads and writes hit the 100.228 machine (the master). In this architecture 100.229 only does real-time replication; it serves no writes.
There are no errors of any kind in the logs.
Two DB servers connected over the internal network: the master uses 55 GB of disk, the slave 500 GB. The slave has started refusing connections.
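To compare the two nodes without shell access, the `info` command can be issued over SSDB's wire protocol, which frames each request as length-prefixed blocks terminated by a blank line. A minimal encoder sketch, assuming the protocol as documented for the 1.6.x series:

```python
def encode_request(args):
    """Encode an SSDB request.

    Each argument becomes one block: the data length in ASCII, a
    newline, the data, a newline. The request ends with an empty line.
    """
    out = bytearray()
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        out += str(len(data)).encode() + b"\n" + data + b"\n"
    out += b"\n"
    return bytes(out)

print(encode_request(["info"]))            # b'4\ninfo\n\n'
print(encode_request(["hget", "h", "k"]))  # b'4\nhget\n1\nh\n1\nk\n\n'
```

Sending the encoded `info` request over a plain TCP socket to each node and diffing the responses is a quick way to spot a slave whose binlogs or levels have grown far beyond the master's.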
The logs look completely normal; since logging is at the info level, the last entries are as follows:
Another user hit the same situation and also found no ERROR in the logs.