microsoftarchive / redis

Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes.
http://redis.io

Redis Sentinel failing when used with multiple Sentinel groups under a single port #324

Open tellan55 opened 8 years ago

tellan55 commented 8 years ago

Hi,

I want to use a single Sentinel service to monitor many Sentinel configurations, so I am defining a single port with multiple Sentinel configurations (multiple Sentinel groups). It runs just fine; however, when we query the Sentinel API on this witness Sentinel server, it fails with the following bug report:

=== REDIS BUG REPORT START: Cut & paste starting from here ===
[2576] 28 Sep 15:06:56.059 # Out Of Memory allocating 32777 bytes.
[2576] 28 Sep 15:06:56.059 # --- ABORT
[2576] 28 Sep 15:06:56.059 # --- STACK TRACE
redis-server.exe!LogStackTrace(c:\release\redis\src\win32_interop\win32_stacktrace.cpp:95)(0x00000016, 0x000003D4, 0x00000000, 0x00000001)
redis-server.exe!AbortHandler(c:\release\redis\src\win32_interop\win32_stacktrace.cpp:206)(0x00000016, 0x7746DAA6, 0x40150880, 0x7754912A)
redis-server.exe!raise(f:\dd\vctools\crt\crtw32\misc\winsig.c:587)(0x00000001, 0x00000000, 0x00008009, 0x00000000)
redis-server.exe!abort(f:\dd\vctools\crt\crtw32\misc\abort.c:82)(0x00008009, 0x4013F888, 0x00008009, 0x0000001E)
redis-server.exe!redisOutOfMemoryHandler(c:\release\redis\src\redis.c:3397)(0xFE5E5410, 0x00000001, 0x00000018, 0x000003D4)
redis-server.exe!zrealloc(c:\release\redis\src\zmalloc.c:183)(0x00008000, 0x00000000, 0x00000000, 0x00000001)
redis-server.exe!sdsMakeRoomFor(c:\release\redis\src\sds.c:146)(0x00004000, 0x00000008, 0x00000000, 0x00000001)
redis-server.exe!readQueryFromClient(c:\release\redis\src\networking.c:1306)(0xFFF37A90, 0x00000001, 0xFE204920, 0x000000D4)
redis-server.exe!aeMain(c:\release\redis\src\ae.c:487)(0x4013B14C, 0x4013B14C, 0x00000000, 0x00000001)
redis-server.exe!redis_main(c:\release\redis\src\redis.c:3524)(0x00224960, 0x00000005, 0x56098B2A, 0x00000000)
redis-server.exe!main(c:\release\redis\src\win32_interop\win32_qfork.cpp:1369)(0x00000008, 0xFFFFFFFF, 0x00000008, 0x00000000)
redis-server.exe!ServiceWorkerThread(c:\release\redis\src\win32_interop\win32_service.cpp:485)(0x00000000, 0x00000000, 0x00000000, 0x00000000)
kernel32.dll!BaseThreadInitThunk(c:\release\redis\src\win32_interop\win32_service.cpp:485)(0x00000000, 0x00000000, 0x00000000, 0x00000000)
ntdll.dll!RtlUserThreadStart(c:\release\redis\src\win32_interop\win32_service.cpp:485)(0x00000000, 0x00000000, 0x00000000, 0x00000000)
ntdll.dll!RtlUserThreadStart(c:\release\redis\src\win32_interop\win32_service.cpp:485)(0x00000000, 0x00000000, 0x00000000, 0x00000000)
[2576] 28 Sep 15:06:56.262 # === REDIS BUG REPORT END. Make sure to include from START to END. ===

I also have the mini memory dump (*.dmp) available if it is needed. This witness server is not running Redis services, only Redis Sentinel, with Redis version 2.8.21.03. Any recommendation will be appreciated.

Thanks

enricogior commented 8 years ago

Hi @tellan55, the crash is caused by a memory allocation failure. When running as a sentinel, Redis uses a fixed amount of preallocated memory; we should definitely change that. If you want, I can send you a private build with a fix for it.
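
The size in the report ("Out Of Memory allocating 32777 bytes") is consistent with the first growth of a client's query buffer. A minimal sketch of that arithmetic follows. The constants are assumptions recalled from the Redis 2.8 sources and are not stated in this thread: an 8-byte sds header, a 16 KB read size in readQueryFromClient (matching the 0x4000 visible in the sdsMakeRoomFor frame), and a 1 MB SDS_MAX_PREALLOC threshold. The function name `sds_realloc_size` is hypothetical.

```python
# Assumed constants (not stated in the thread; recalled from Redis 2.8 sources).
SDS_HEADER_BYTES = 8            # two 4-byte fields: len and free
SDS_MAX_PREALLOC = 1024 * 1024  # below this size, sds doubles the requested length
READ_CHUNK_BYTES = 16 * 1024    # bytes requested per client read (0x4000)


def sds_realloc_size(current_len: int, add_len: int) -> int:
    """Return the byte count sdsMakeRoomFor would ask zrealloc for."""
    new_len = current_len + add_len
    if new_len < SDS_MAX_PREALLOC:
        new_len *= 2                 # small strings: over-allocate by doubling
    else:
        new_len += SDS_MAX_PREALLOC  # large strings: grow by a fixed 1 MB
    return SDS_HEADER_BYTES + new_len + 1  # +1 for the trailing NUL byte


if __name__ == "__main__":
    # An empty query buffer growing to hold one 16 KB read.
    print(sds_realloc_size(0, READ_CHUNK_BYTES))  # prints 32777
```

If these assumptions hold, the failing request was a routine buffer of about 32 KB, which fits the explanation that the fixed preallocated heap, rather than an oversized query, was the limit.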

tellan55 commented 8 years ago

Hi @Enricogior

That would be great. Please let me know how I can get the private build, or send it to this account if the size of the attachment permits it.

Thank you

(Sent by email in reply to Enrico Giordani's comment of Sep 28, 2015, 4:49 PM.)

enricogior commented 8 years ago

@tellan55, the MSI and the stand-alone binaries are available here: http://1drv.ms/1P3Jbyo Let me know if the fix solves the problem. Thank you.

tellan55 commented 8 years ago

@Enrico,

Thanks, I finally downloaded it and will be testing it tomorrow.

Regards

(Sent by email in reply to Enrico Giordani's comment of Sep 29, 2015, 9:59 AM.)

tellan55 commented 8 years ago

@Enricogior

The private build did not work. The error is the same, and it is monitoring 3 Sentinel groups. One more detail: this happens when the Sentinel API on these witness servers is used to determine which Master is in use.

Please let me know if you need the error output.

Thank you

(Sent by email in reply to Enrico Giordani's comment of Sep 29, 2015, 9:59 AM.)

enricogior commented 8 years ago

Hi @tellan55, that is strange, since I completely changed the memory management for the sentinel. Can you please verify from the first line of the log that the new build '2.8.2103-sentinel-fix' is the one running? Thank you.

tellan55 commented 8 years ago

Hi @enricogior

Yes, it is that build. I will do more testing and let you know.

I will keep you informed.

(Sent by email in reply to Enrico Giordani's comment of Oct 1, 2015, 4:34 AM.)

enricogior commented 8 years ago

@tellan55 can you please post the sentinel config file? Thank you.

tellan55 commented 8 years ago

Hi @Enricogior

Sorry for the delay,

Let me explain our configuration:

For business reasons we deploy Redis instances per application. Each deployment is basically a Master and a Slave, both also running Redis Sentinel.

Each Redis instance also has "bind 0.0.0.0" set, as per another case reported here.

Both witness Sentinel servers (2 different servers) run with this same sentinel.conf file:

port 10099

########xxx SENTINEL####################
sentinel monitor XXX 192.168.22.73 6379 2
sentinel down-after-milliseconds XXX 15000
sentinel failover-timeout XXX 60000
sentinel parallel-syncs XXX 1
sentinel auth-pass XXX d$0r=G#l.d?M_XXXCFG

########yyy SENTINEL####################
sentinel monitor YYY 192.168.22.85 6379 2
sentinel down-after-milliseconds YYY 15000
sentinel failover-timeout YYY 60000
sentinel parallel-syncs YYY 1
sentinel auth-pass YYY ;D9Dz;i6?8d|_YYY

################zzz SENTINEL####################
sentinel monitor zzz 192.168.6.40 6379 2
sentinel down-after-milliseconds zzz 15000
sentinel failover-timeout zzz 60000
sentinel parallel-syncs zzz 1
sentinel auth-pass zzz ZZZ,zk-,M8M?==:/

################# GENERAL #################
dir "C:\HERE\SENTINEL"
logfile "C:\HERE\SENTINEL\redis-sentinel.log"
###########################################

These 2 witness Sentinels run just fine. However, when we query the Sentinel API on either of these 2 servers to identify where the Master is for any of the Sentinel groups, the Windows Sentinel service stops and writes that memory error to the Sentinel log.
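
For readers who want to check which groups a shared sentinel.conf defines, here is a minimal sketch that lists every `sentinel monitor` entry together with its per-group options. The names `MonitoredMaster`, `parse_sentinel_conf` and `SAMPLE_CONF` are hypothetical, and the sample uses a placeholder password rather than the real values. It assumes each group's `sentinel monitor` line comes before that group's other options, as in the file above.

```python
from dataclasses import dataclass, field


@dataclass
class MonitoredMaster:
    name: str
    host: str
    port: int
    quorum: int
    options: dict = field(default_factory=dict)


def parse_sentinel_conf(text: str) -> dict:
    """Return {master name: MonitoredMaster} for each 'sentinel monitor' line."""
    masters = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        parts = line.split(None, 3)
        if len(parts) < 4 or parts[0] != "sentinel":
            continue  # general directives such as port, dir, logfile
        _, directive, name, rest = parts
        if directive == "monitor":
            host, port, quorum = rest.split()
            masters[name] = MonitoredMaster(name, host, int(port), int(quorum))
        elif name in masters:
            # Options are attached only to groups already declared above them.
            masters[name].options[directive] = rest
    return masters


# Illustrative input with a placeholder password.
SAMPLE_CONF = """
port 10099
sentinel monitor XXX 192.168.22.73 6379 2
sentinel down-after-milliseconds XXX 15000
sentinel auth-pass XXX placeholder-password
sentinel monitor YYY 192.168.22.85 6379 2
sentinel failover-timeout YYY 60000
"""

if __name__ == "__main__":
    for master in parse_sentinel_conf(SAMPLE_CONF).values():
        print(master.name, master.host, master.port, master.quorum, master.options)
```

Master names are matched case-sensitively here, so `zzz` and `ZZZ` would be treated as two different groups.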

Thanks for looking into this and please let me know if you need any further information.

(Sent by email in reply to Enrico Giordani's comment of Thu, Oct 1, 2015, 5:27 PM: https://github.com/MSOpenTech/redis/issues/324#issuecomment-144855231)

enricogior commented 8 years ago

@tellan55 I haven't been able to reproduce the problem yet. Can you please run the INFO command against the sentinel service and post the output here? Which exact command causes the crash? Thank you.

tellan55 commented 8 years ago

Hi @enricogior

Here is the INFO output from one of the witness Sentinels:

D:\Redis>redis-cli -p 10099 -h 127.0.0.1
127.0.0.1:10099> info
# Server
redis_version:2.8.2103-sentinel-fix
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:255a120e16cf03e8
redis_mode:sentinel
os:Windows
arch_bits:64
multiplexing_api:winsock_IOCP
process_id:5656
run_id:5ba3fa1944a337404fe929de0cef58171d07abdd
tcp_port:10099
uptime_in_seconds:38572
uptime_in_days:0
hz:11
lru_clock:1200212
config_file:D:\Redis\sentinel.conf

# Sentinel
sentinel_masters:3
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=XXX,status=ok,address=192.168.22.73:6379,slaves=1,sentinels=4
master1:name=YYY,status=ok,address=192.168.22.85:6379,slaves=1,sentinels=4
master2:name=ZZZ,status=ok,address=192.168.6.40:6379,slaves=1,sentinels=4
127.0.0.1:10099>
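
The `masterN:` lines in the INFO output above carry the per-group state in a compact `key=value` form. A minimal sketch of reading them programmatically follows; `parse_master_lines` and `SAMPLE_INFO` are hypothetical names, and the sketch assumes master names contain neither commas nor equals signs.

```python
def parse_master_lines(info_text: str) -> list:
    """Extract the masterN entries from Sentinel INFO output as dicts."""
    masters = []
    for raw in info_text.splitlines():
        # Split at the first colon only; the address value contains another one.
        key, sep, value = raw.strip().partition(":")
        if not sep or not key.startswith("master"):
            continue
        if not key[len("master"):].isdigit():
            continue  # skip keys such as "master_repl_offset"
        entry = dict(item.split("=", 1) for item in value.split(","))
        entry["slaves"] = int(entry["slaves"])
        entry["sentinels"] = int(entry["sentinels"])
        masters.append(entry)
    return masters


# Lines copied from the Sentinel section of the INFO output in this thread.
SAMPLE_INFO = """\
sentinel_masters:3
sentinel_tilt:0
master0:name=XXX,status=ok,address=192.168.22.73:6379,slaves=1,sentinels=4
master1:name=YYY,status=ok,address=192.168.22.85:6379,slaves=1,sentinels=4
master2:name=ZZZ,status=ok,address=192.168.6.40:6379,slaves=1,sentinels=4
"""

if __name__ == "__main__":
    for entry in parse_master_lines(SAMPLE_INFO):
        print(entry["name"], entry["status"], entry["address"], entry["sentinels"])
```

A check like this could flag any group whose status is not `ok` or whose Sentinel count differs from the expected four.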

We are using the client from https://github.com/StackExchange/StackExchange.Redis/tree/89d3c3b34fca6be15a6d0245fdb5e8ff9fea77de to query the witness servers for the Master.

The same client pointed at a single Sentinel works as expected.

Thanks

enricogior commented 8 years ago

@tellan55 have you tried using redis-cli.exe to send the command? Does it cause the crash as well? Thank you.

enricogior commented 8 years ago

@tellan55 do you still have this issue? I've tried to reproduce the problem by issuing the SentinelGetMasterAddressByName() command; it returns successfully and doesn't cause the sentinel to crash while monitoring three servers.
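
To rule out the client library when reproducing this, the same lookup can be sent without StackExchange.Redis. The sketch below assumes that SentinelGetMasterAddressByName() corresponds to the Sentinel command `SENTINEL get-master-addr-by-name <name>`, which can also be typed into redis-cli. It encodes that command in the Redis wire protocol (RESP) and parses the reply. The function names are hypothetical, and the network helper assumes the short reply arrives in a single read.

```python
import socket


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def parse_master_addr(reply: bytes):
    """Parse a two-element bulk-string array into (host, port), or None."""
    lines = reply.split(b"\r\n")
    if lines[0] == b"*-1":
        return None  # null array: this Sentinel does not know the master name
    if lines[0] != b"*2":
        raise ValueError("unexpected reply: %r" % reply[:64])
    # Layout: [b"*2", b"$<n>", host, b"$<m>", port, b""]
    return lines[2].decode("ascii"), int(lines[4])


def get_master_addr(host: str, port: int, master_name: str, timeout: float = 5.0):
    """Ask one Sentinel for the current master address of a monitored group."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(encode_command("SENTINEL", "get-master-addr-by-name", master_name))
        reply = conn.recv(4096)  # sketch: assumes the whole reply fits in one read
    return parse_master_addr(reply)


if __name__ == "__main__":
    # Port and group name taken from the configuration posted in this thread.
    print(get_master_addr("127.0.0.1", 10099, "XXX"))
```

Running this in a loop against the witness Sentinel, once per monitored group, would show whether the plain command alone triggers the crash or whether something specific to the client's connection pattern is involved.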

gorda81 commented 8 years ago

Hi, I have the same problem, but I can't reproduce it on demand. Sentinel fails with an out-of-memory error at the same call stack as in the first post, at random, after 2-3 days of working correctly. Could it be caused by some kind of memory fragmentation? How can I get the binaries with the changed memory configuration for Sentinel?

tellan55 commented 8 years ago

Hi

(Sent by email in reply to Enrico Giordani's comment of Nov 19, 2015, 4:03 PM.)

tellan55 commented 8 years ago

Hi @enricogior

For Sentinel, I am still using the private build you posted here earlier.

If you would like, I can switch to RC2 when it is available and report back; I am holding off on RC1 because I have problems with Master/Slave connections when "authorization" is enabled.

Please let me know if you would like me to try this configuration.

Thanks

enricogior commented 8 years ago

@tellan55 @gorda81 just checking whether the problem is still present in the recent builds. If it is, can you please report which Redis version you are currently using? Thank you.

tellan55 commented 8 years ago

@enricogior

Hi Enricogior

It is working very well with the latest release. I have up to 5 Active-Passive clusters sharing a single pair of witness Sentinel servers. Validation worked great.

Hope this helps.

Thanks

enricogior commented 8 years ago

Hi @tellan55 thank you very much for the feedback.