Closed zhangruiskyline closed 6 years ago
Hello Rui,
I'm not familiar with Azure's redis cluster support, but reading this page: https://docs.microsoft.com/en-us/azure/redis-cache/cache-how-to-premium-clustering , it looks like it should work. They do mention using a redis client with cluster support, so that means it should expose the necessary commands to query a cluster. I'm a bit concerned by the Can I directly connect to the individual shards of my cache?
question in the FAQ, where they say this is not officially supported, because that's basically what a redis cluster client must do (e.g. to follow MOVE
replies, direct the commands to the right node, etc.).
If you do try it out, I'd be very interested in hearing about your experience, what worked, what didn't and how I could update the package to better support azure!
Thanks, Martin
Thanks Martin @mna
I have tried and it seems working. although connect to individual shard is not officially support, but I can check the redis node's IP via cluster nodes
and get back information.
so in cluster setup, I use all internal nodes' address in "StartupNodes
" and external URL for dial. One follow up questions, Azure Redis support master/slave mode and master and slave will have different ports, should I put master and slave in StartupNodes
or only need to put master node address/port is enough?
Also Dose this redisc follow MOVE
command and redirect?
Thanks Rui
Hello,
You can put either masters or replicas in StartupNodes
, the client will automatically figure out the topology based on cluster nodes
responses. I'm not sure what you're doing is correct for Dial
, though. If I understand correctly, you use a redigo.DialNetDial
option to specify a dial function that ignores the address it receives as parameter and always uses the external URL instead?
If so, I believe that would defeat the goal of using redisc. The dial function should always try to connect to the address provided, that's the one the client identified as the right address to connect to the required node.
Regarding your last question, to follow MOVE
replies, redisc gives you complete control - you can either use a basic connection that will return a RedirError
for MOVE
replies, or you can wrap that basic connection in a RetryConn
that will automatically follow those redirections (see https://godoc.org/github.com/mna/redisc#RetryConn).
Hope this helps! Martin
Hi Martin @mna
regarding "The dial function should always try to connect to the address provided", you mean it will automatically fetch {IP:port} from StartupNodes
and connect?
I define a CreatePool func which will be put like
cluster := &redisc.Cluster{
StartupNodes: redisClusterStartupNode,
DialOptions: redisDialOptions,
CreatePool: createPool,
}
func createPool(addr string, opts ...redis.DialOption) (*redis.Pool, error) {
return &redis.Pool{
MaxActive: redisMaxActive,
MaxIdle: redisMaxIdle,
IdleTimeout: time.Duration(redisIdleTimeout) * time.Second,
Dial: func() (redis.Conn, error) {
c, err := redis.Dial("tcp", redisURL, opts...)
if err != nil {
log.Errorf(err.Error())
return nil, err
}
return c, nil
},
}, nil
}
but I indeed find can not pass the into createPool
so use some hack solution to directly use redisURL
in createPool
, is that not what we should do? what do you suggest the best way to use this lib for cluster connection purpose
Thanks Rui
Ok, thanks for the details, the Dial
function in createPool
should try to connect to the addr
parameter passed to createPool
, so that it returns connections connected to the right node.
@mna ,in my case, The Addr
should be load balancer URL for Azure redis, not startupnodes
IP addresses for real redis node?
Thanks Rui
@mna , let me refine my question in this way. what is the difference between StartupNodes
in Cluster data structure, and addr
in dail function of a connection pool? should they be same? or different?
Thanks Rui
Okay so here's how the redis cluster works. The startup nodes are used to build the topology of the cluster, figure out the masters and replicas. This may change during the execution of your app - e.g. maybe a node will go down, another one will replace it, etc. So that's why those are called "startup" nodes, those are the nodes to use when the app (server, typically) starts, but after that it may change. The client stays up-to-date with those changes.
Another thing that the client does is keeping track of which node in the cluster serves which key hash. There are 16384 hash values, and each hash value has a corresponding node. When you get a connection from the cluster and you execute a command, the client will identify which node should be contacted to execute this command based on the key passed to the command (e.g. GET a
will contact the node that serves the hash of key "a").
Now, about the addr
that is passed to the Cluster.CreatePool
function. When you try to get the value of key "a", the Cluster
will try to get a connection from the corresponding node's pool. If there is no pool yet for this node, it will call the Cluster.CreatePool
function to create a pool that manages connections to that specific node. So as you can see, it is important that the Dial
function of this pool really returns connection to that specific node, not the generic load balancer URL, otherwise you lose a lot of the optimization of the redis cluster client (the "smart routing" part).
In a stable cluster - typically, the huge majority of the time - the client will always select the right node to contact, so you won't receive MOVE
replies and will avoid having to do multiple network calls to get the response from the correct node.
In other words (in code), you should do this:
func createPool(addr string, opts ...redis.DialOption) (*redis.Pool, error) {
return &redis.Pool{
// other fields omitted...
Dial: func() (redis.Conn, error) {
c, err := redis.Dial("tcp", addr, opts...) // <<-- IMPORTANT! use "addr" here!
if err != nil {
log.Errorf(err.Error())
return nil, err
}
return c, nil
},
}, nil
}
Hope that makes it clearer! Martin
Thanks @mna for your detailed information.
follow up on " it is important that the Dial function of this pool really returns connection to that specific node, not the generic load balancer URL". for cluster case, I will have bunch of different {IP:ports}
of different redis nodes, which one should I use to pass into Addr
field?
Right now, in Azure redis, In createPool
I always Dial in external URL(which is actually a load balancer in Azure redis case), and in StartupNodes
in list all {IP:port}
s(I got theseIP:port
by issuing cluster nodes
in redis command line). And looks like it worked
but according to your suggestion, this sounds to be a Hack. not recommended. But I can not find another way, since Azure want to hide all cluster information and only expose one external URL to outside, and let redis client to manage all cluster/pool itself.
Can you provide a sample code what is your recommend way to implement the cluster init stage?
Thanks Rui
Hm I think I see where the confusion comes from, you never call createCuster
directly yourself, redisc will call it when needed, with the right address as parameter. You can take a look at example_test.go
in the redisc
package for some example usage.
In your code, you simply call Custer.Get()
to retrieve a pooled connection, e.g.:
// create the cluster
cluster := redisc.Cluster{
StartupNodes: ..., // your startup nodes
DialOptions: ..., // whatever dial options you want to use
CreatePool: createPool,
}
defer cluster.Close()
// initialize its mapping before use, recommended
if err := cluster.Refresh(); err != nil {
// handle error
}
// And then elsewhere, when you need a connection
// grab a connection from the pool
conn := cluster.Get()
defer conn.Close()
So you never actually call createPool
.
Hope this helps! Martin
Thanks @mna , If redisc calls the createPool
instead of myself handling it, then where should I pass the redis URL at starting point? Another question, is it possible for redisc to handle all internal nodes IP:port without specifying them in startupNodes
section(this seems to be what Azure redis recommend, so application should not know internal redis node's IP)
Thanks Rui
FYI, @mna , I have contacted Azure redis dev team,
and their response is like:
Most Redis clients that understand clustering do the following:
Another dev points out:
Normally, you can just put redis URL as startupNode. But you should be aware that our redis URL is actually a load balancer, rather than a real node. Whether this will cause issues depend on concrete implementation. As I know, there is one issue caused by passing load balancer as startup node in Java Lettuce client. You can refer this. https://github.com/lettuce-io/lettuce-core/issues/712
So right now it seems to me only the hack way works in Azure redis
createPool
addr field {ip:port}
, then assign these {ip:port}
into StartupNodes
, I am not sure whether this is best what I can do with redisc, or some other piece I should improve. please let me know your comments, that will be great helpful
Thanks Rui
Hello,
Yep, that's pretty much what a redis cluster client will do. It's a good recommendation to pass the redis URL as startup node, as it probably ensures that it can connect to a valid node in the cluster, and then redisc will discover the actual addresses of each active node.
However, in createPool
, as I said, you have to pass the addr
argument you received, as-is, not the redis URL. That's for the point 3 that the azure devs told you about:
When getting or setting a key from redis, the client will calculate the hash of the key, then lookup which node owns the slot that hash, then send the request to the specific node for that key.
So the createPool
has to create a connection to the specific node that it requested in the addr
parameter.
Martin
Thanks for reply @mna , but it dose not work in my case, I always get "redisc: failed to get a connection",here is my connection type
cluster := &redisc.Cluster{
StartupNodes: redisClusterStartupNode, //here I pass: "xxx..redis.cache.windows.net:6380" as startup nodes
DialOptions: redisDialOptions,
CreatePool: createPool,
}
and in createPool I use what you reommended
func createPool(addr string, opts ...redis.DialOption) (*redis.Pool, error) {
return &redis.Pool{
MaxActive: redisMaxActive,
MaxIdle: redisMaxIdle,
IdleTimeout: time.Duration(redisIdleTimeout) * time.Second,
Dial: func() (redis.Conn, error) {
c, err := redis.Dial("tcp", addr, opts...)
if err != nil {
log.Errorf(err.Error())
return nil, err
}
return c, nil
},
}, nil
}
is there any way I can completely get rid of startup nodes or any internal information, just provide the external URL(Load Balancer URL) and make it work? I guess it is because redisc need to have a real redis node as startupnodes
? while I provide only load balancer?
Thanks Rui
Where is your app running from? Does it have access to the internal nodes (IP address and port, not the entrypoint URL of the load balancer)? E.g. if you try to connect from one of the internal node addresses from the server where the app runs using redis-cli
, can you successfully connect and run commands?
My guess is that your app doesn't have access to the cluster nodes' addresses from where it runs.
I am running it in my local machine, but it should have cert and SSL configured, I am able to connect to redisURL:6380
. and the hack works(put this redisURL:6380
in creatPool
addr
)
but when putting redisURL:
into startupNodes
, it dose not work, I try both 6380 or 1500x ports, neither work
Thanks Rui
Hi @mna
some more update on debug: if I only put redisURL:6380
in startupnode
, and use addr
in createPool
, seems to me the correct redis node address is passed(according to debug), but somehow dial function can not establish the connection due to SSL handle error.
I tried two different ways:
createpool
:func createPool(addr string, opts ...redis.DialOption) (*redis.Pool, error) {
return &redis.Pool{
MaxActive: redisMaxActive,
MaxIdle: redisMaxIdle,
IdleTimeout: time.Duration(redisIdleTimeout) * time.Second,
Dial: func() (redis.Conn, error) {
c, err := redis.Dial("tcp", addr, opts...)
if err != nil {
log.Errorf(err.Error())
return nil, err
}
//Auth here in Dial, which is what I did for redigo and worked
if _, err := c.Do("AUTH", redisAuth); err != nil {
log.Errorf(err.Error())
c.Close()
return nil, err
}
return c, nil
},
}, nil
}
redis.DialPassword(redisAuth)
in dial option,but both returns "redisc: failed to get a connection
", Not sure what is best way I should dial in SSL case for cluster.
It looks like to me problem is:
When redisc first establish SSL connection, it will call redisURL:6380
, which is a load balancer as indicated before. so it can establish security connections.
but later when it tries to find other nodes in redis cluster to connect, and when createpool
try to establish the SSL connection directly to new redis node, it fails. probably all security handle is on load balancer side for Azure redis, rather than each cluster node level.
Is there any solution to this?
Thanks Rui
Update again,
if I disable the Azure redis cache SSL and expose none SSL port, only use PWD to Auth, it works!! so looks like my former assumption of SSL auth problem is correct
Thanks Rui
This is handled by redigo (used by redisc to manage connections and pools). See https://godoc.org/github.com/garyburd/redigo/redis#DialTLSConfig and https://godoc.org/github.com/garyburd/redigo/redis#DialUseTLS. You should specify those DialOptions on Cluster.DialOptions. On Sat, Apr 14, 2018 at 03:52 Rui Zhang notifications@github.com wrote:
Update again,
if I disable the Azure redis cache SSL and expose none SSL port, only use Passport to Auth, it works! so looks like my former assumption of SSL auth problem is correct
Thanks Rui
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mna/redisc/issues/13#issuecomment-381311431, or mute the thread https://github.com/notifications/unsubscribe-auth/AAqQoE9sRn4nohIiVM-m3PvFWAmJVA8cks5toarHgaJpZM4TQZqn .
Yes, I am aware of using redigo to do SSL, but the question I have is when we createpool
to some other redis node rather than Loadbalancer(redisURL:6380
), it seems redisc will try to create same SSL session pool directly to redis node, but will be rejected since only load balancer is able to handle the SSL stuff. even if I change redisTLSconfig.ServerName = redisURL:6380
in createpool, it dose not work.
any better way I can force SSL connection to load balancer instead of addr
field when I try to createpool
?
Thanks Rui
Ah I see, yeah so you don't need SSL in createPool
on the internal nodes, only for the load balancer URL... There's no obvious way to do this, one thing you could do is not set the DialTLSConfig
and DialUseTLS
on Cluster.DialOptions
, and only set it in createPool
when the addr
is the load balancer's address, e.g.:
func createPool(addr string, opts ...redis.DialOption) (*redis.Pool, error) {
if addr == redisURL { // or redisURL + ":6380", don't know exactly what you've set redisURL to
opts = append(opts, redis.DialUseTLS(true))
opts = append(opts, redis.DialTLSConfig(...))
}
return &redis.Pool{
// ...
Dial: func() (redis.Conn, error) {
c, err := redis.Dial("tcp", addr, opts...)
// ...
}
}
I think this dose not work also, when redisc dial into some other redis nodes. it still need to go to SSL connection, juts it won't SSL directly to redis node, but load balancer. in your code, it will not establish the SSL at all
I'm afraid I don't understand what is your issue then. You either need TLS and thus have it configured in the dial options, or you don't and thus you set it on/off in the dial options when needed. I'm not familiar with Azure's redis, but I'd expect the external URL to require TLS config so that the endpoint is not publicly accessible from outside the VPN, but the internal nodes to be accessed from inside the VPN (where your binary should run) without TLS.
Let me double check again. My question is when dial into some redis cluster node, we still need SSL, but not SSL directly to node, rather it needs to dial to Redis load balancer URL. In your former approach, it will directly dial into redis node without SSL, right?
Hi @mna
I think I got some feedback from Azure redis team, so client can directly connect to redis nodes(not necessarily load Balancer), but in Azure redis, different nodes and load balancer shares same IP, but different ports.(via a NAT way)
but in dial, seems we split the ports, so we lost port in SSL creation? anyway, it could be a question for redigo.
if tlsConfig.ServerName == "" {
host, _, err := net.SplitHostPort(address)
if err != nil {
netConn.Close()
return nil, err
}
tlsConfig.ServerName = host
}
Thanks Rui
I think the problem is that Azure redis uses NAT in cluster mode, so every node has same IP but different ports, somehow redisc can not bind rc, _, err := c.bind(cmdSlot(cmd, args))
, find some strange slot and can not find correct node.
investigating more, but if you have some insight, please share
Thanks Rui
One question, the createPool
will be called only in startup stage? basically if I only provide load balancer node address as startupNodes
option, will it run "cluster nodes
" command in redis cache and get back all nodes and create SSLs for all of them at starting point?
Thanks Rui
No, createPool
is created whenever a pool is needed for one node (i.e. at least once per node in the cluster + startup nodes).
@mna
I got some feedback again from Azure redis team about the recommendation how the cluster client should work.
Quote as:
I am not sure whether redisc did same? If not, how can I use redisc to minic the Azure redis behavior?
Thanks Rui
OK, finally I got the client works in SSL mode. here is configuration
in StartupNodes
, only put the redisURL:6380(Loadbalancer URL)
, in dial option, use redis.DialUseTLS(true),
and redis.DialTLSSkipVerify(true)
, The key point put redis.DialTLSSkipVerify(true)
, I am still not 100% sure why put this works.
Thanks Rui
Ok, looks like you don't have a trusted certificate. Glad you got it working, closing as this is a connection configuration issue.
Hi
I am using Microsoft Azure redis and it dose not provide each cluster node individual IP address, rather it will only provide the redis cluster URL(which is actually a load balancer, not real redis nodes). Can this lib handle this? if so, I should put this redis URL in startupNodes or in dialoption's address?
Thanks Rui