redis / go-redis

Redis Go client
https://redis.uptrace.dev
BSD 2-Clause "Simplified" License
19.92k stars 2.34k forks source link

pubsub channel keep-alive problem #1037

Closed larrycinnabar closed 5 years ago

larrycinnabar commented 5 years ago

I'm keep getting

redis: 2019/05/30 05:51:06 pubsub.go:151: redis: discarding bad PubSub connection: redis: ping timeout
redis: 2019/05/30 05:51:11 pubsub.go:151: redis: discarding bad PubSub connection: redis: ping timeout

I think problem is in initChannel function

If pool is closed then it just completely close the channel and exits from go routine.

        for {
            msg, err := c.Receive()
            if err != nil {
                if err == pool.ErrClosed {
                    close(c.ch)
                    return
                }

And then the second go-routine (health) will always do ping timeout. The reconnect idea doesn't work because first goroutine (the main read one) is not running anymore

vmihailenco commented 5 years ago

Please a program that reproduces the problem.

larrycinnabar commented 5 years ago

Very simple program that just resend items to other channel (for having data in our own model type):

        redisChan := rc.PSubscribe(channels...).Channel()
        go func() {
            for redisMsg := range redisChan {
                m := models.RedisMessage{
                    Channel: redisMsg.Channel,
                    Pattern: redisMsg.Pattern,
                    Payload: redisMsg.Payload,
                }

                ch <- m
            }
        }()
larrycinnabar commented 5 years ago

Any suggestions?

inside initChannel you have 2 goroutines: the main once - consumer, and the second one - health-checker.

So, when pool is closed and we get pool.errClosed on attempt to consume a message, we just exit the goroutine. It's a mode when reconnect want help because the only consuming goroutine will not be run again ever, until you restart the application

vmihailenco commented 5 years ago

I see that it should exit here. You still did not provide the program that reproduces the problem.

larrycinnabar commented 5 years ago

@vmihailenco yeap, it should. But it returns here

I suppose you need more detailed way of my problem (that I posted above). With redis connection, etc.

Ok:

client := redis.NewClient(&redis.Options{
        Addr:               node,
        ReadTimeout:        250 * time.Millisecond,
        WriteTimeout:       250 * time.Millisecond,
        DialTimeout:        c.options.DialTimeout,
        PoolTimeout:        5 * time.Second,
        IdleTimeout:        3 * time.Minute,
        PoolSize:           30 * runtime.NumCPU(),
    })

var pingDone bool
if pingResult, pingErr := client.Ping().Result(); pingErr == nil {
    pingDone = pingResult == "PONG"
}

log.Println("Ping Done: %t", pingDone)

expired := make(ch models.RedisMessage)
go func() {
    redisChan := client.PSubscribe("__keyevent@0__:expired").Channel()
    for redisMsg := range redisChan {
        m := models.RedisMessage{
            Channel: redisMsg.Channel,
            Pattern: redisMsg.Pattern,
            Payload: redisMsg.Payload,
        }

        expired <- m
    }
}()

// somewhere in other part of application
go func(){
    for msg := range expired {
        log.Printf("debug: expired %v", msg)
    }
})()

It's working for 5-10 hours and then the message described in topic above appears

larrycinnabar commented 5 years ago

any ideas, suggestions, pieces of advices? @vmihailenco

vmihailenco commented 5 years ago

Make sure you are using latest version. If that does not help - try to isolate the problem. Most likely it is a problem with your code.

larrycinnabar commented 5 years ago

@vmihailenco

Ok, I'll try to isolate problem, but can you please help me with these questions:

1) is pool.ErrClosed a crucial thing that should not be recovered from? If yes, then our problem is with getting ErrClosed but not with pubSub channels

2) if pool.ErrClosed is a possible thing, and pubSub channel should be recovered, then how? here it just returns from the main go-routine, so it will never call c.Receive() anymore.

Main question is in the fact if pool.ErrClosed is an error that pub-sub can recover from (and handle keep-alived channel), or not?

vmihailenco commented 5 years ago

pool.ErrClosed means that client is closed via Client.Close(). All connections are closed and PubSub stops receving messages and closes the Go channel.

larrycinnabar commented 5 years ago

so, it means that our problem is not the pub-sub part, but just the fact that client is closed

larrycinnabar commented 5 years ago

@vmihailenco

Is there a way to get some kind of notification (via channel or something) that redis library got pool.ErrClosed?

So then, we can in our application just listen to it, and fully re-establish redis connection if needed

vmihailenco commented 5 years ago

Client is closed by your app and you have full control of this process. I don't see how go-redis can help you here.