tangcong / codis-operator

Codis Operator creates and manages Codis clusters (a proxy-based Redis cluster solution) running in Kubernetes. (WIP)
Apache License 2.0

Specifying loadbalancer for proxy service #18

Open oruchreis opened 5 years ago

oruchreis commented 5 years ago

Hi, I'm running the default Kubernetes deployment scripts from the codis repo, with small changes, on Azure AKS in production. I have run into situations where a proxy or a server went down, so I started looking for another way to run codis on k8s. I tried your project, but as far as I can see, it installs the services as cluster-internal only. I need the LoadBalancer type for the proxy and fe services, and I also need to add annotations that mark the load balancer for internal network use, so that Azure AKS assigns the service a static internal IP address. How can I specify the LoadBalancer type for the services and add annotations to them? Also, are there any issues that would prevent us from using this in production environments?

tangcong commented 5 years ago

Thanks for your attention and the good advice. I have opened an issue (#19) and will complete it by tomorrow at the latest. Warning: codis operator is currently a work in progress (WIP) and is NOT ready for production. Use it at your own risk. You can try it in your test environment.

tangcong commented 5 years ago

best practices:

tangcong commented 5 years ago

There are some issues remaining to be solved:

@oruchreis

tangcong commented 5 years ago

Done in ea5958402ab29e640c95ba924fa8544c96610358; you can give it a try. @oruchreis Example: https://github.com/tangcong/codis-operator/blob/master/examples/sample-3.yml

oruchreis commented 5 years ago

Thanks a lot, I'll try it and report the result here soon.

oruchreis commented 5 years ago

Hi, I tried this on a freshly installed Kubernetes cluster. First I tried without RBAC, but codis-fe showed the proxies in a timeout state, and there were no servers or sentinels. Then I tried with RBAC, but it failed again. The Kubernetes dashboard nevertheless shows the pods as healthy. By the way, serviceAnnotations worked as expected: I could assign public and internal IPs to the load balancer flawlessly. I've attached the codis-operator logs: logs-from-codis-operator-in-codis-operator-0.txt
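
For readers following along, here is a rough sketch of the kind of Service object the request above is about: type LoadBalancer plus the Azure annotation that asks for an internal load balancer. It builds a plain Kubernetes Service manifest, not the operator's CodisCluster resource. The selector label and the port 19000 are assumptions made for illustration; only the service name web-codis-proxy comes from the logs in this thread.

```python
import json


def load_balancer_service(name, selector, port, annotations=None):
    """Build a plain Kubernetes Service manifest of type LoadBalancer."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "annotations": dict(annotations or {})},
        "spec": {
            "type": "LoadBalancer",
            "selector": dict(selector),
            "ports": [{"port": port, "targetPort": port, "protocol": "TCP"}],
        },
    }


# Annotation understood by the Azure cloud provider: request an internal LB.
AZURE_INTERNAL = {"service.beta.kubernetes.io/azure-load-balancer-internal": "true"}

proxy_service = load_balancer_service(
    name="web-codis-proxy",
    selector={"app": "web-codis-proxy"},  # hypothetical label, check your pods
    port=19000,                           # assumed proxy port, adjust as needed
    annotations=AZURE_INTERNAL,
)

# JSON is valid YAML, so this output can be fed to `kubectl apply -f -`.
print(json.dumps(proxy_service, indent=2))
```
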

tangcong commented 5 years ago

How can this be reproduced (as minimally and precisely as possible)? What is your Kubernetes version? Can you provide the codis-proxy logs and a codis-fe snapshot?

oruchreis commented 5 years ago

Kubernetes version is 1.12.4. Here is the YAML that I used, which is cloned from sample3: codis-operator.txt Here are the logs from the proxy and dashboard: logs.zip I don't know how to get a snapshot of codis-fe. Codis-fe has no logs other than these:
2019/02/12 08:05:09 main.go:101: [WARN] set ncpu = 2
2019/02/12 08:05:09 main.go:104: [WARN] set listen = 10.90.44.166:9090
2019/02/12 08:05:09 main.go:120: [WARN] set assets = /gopath/src/github.com/CodisLabs/codis/bin/assets
2019/02/12 08:05:09 main.go:162: [WARN] set --etcd = etcd-client:2379

tangcong commented 5 years ago
[error]: Get http://web-codis-dashboard.default.svc.cluster.local:18080/api/topom/model: dial tcp: lookup web-codis-dashboard.default.svc.cluster.local: no such host
    4   /gopath/src/github.com/CodisLabs/codis/pkg/utils/rpc/api.go:134
            github.com/CodisLabs/codis/pkg/utils/rpc.apiRequestJson
    3   /gopath/src/github.com/CodisLabs/codis/pkg/utils/rpc/api.go:169
            github.com/CodisLabs/codis/pkg/utils/rpc.ApiGetJson
    2   /gopath/src/github.com/CodisLabs/codis/pkg/topom/topom_api.go:787
            github.com/CodisLabs/codis/pkg/topom.(*ApiClient).Model
    1   /gopath/src/github.com/CodisLabs/codis/cmd/proxy/main.go:329
            main.OnlineProxy
    0   /gopath/src/github.com/CodisLabs/codis/cmd/proxy/main.go:289
            main.AutoOnlineWithDashboard

It seems that codis-proxy cannot connect to the dashboard because it fails to resolve the dashboard's DNS name (web-codis-dashboard.default.svc.cluster.local). Is the k8s DNS service working properly? I have only tested this on k8s 1.10.
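
One quick way to check the DNS question is to run a small resolver test from inside the affected pod. The sketch below uses only the Python standard library and assumes Python is available in the container, which may not be true for the codis image; `nslookup` or `getent hosts` would serve the same purpose.

```python
import socket


def can_resolve(hostname):
    """Return True if the local resolver can resolve hostname, else False."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Raised for "no such host" and other name-resolution failures.
        return False


if __name__ == "__main__":
    # Name taken from the proxy error above.
    host = "web-codis-dashboard.default.svc.cluster.local"
    print(host, "resolves" if can_resolve(host) else "does not resolve")
```
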

oruchreis commented 5 years ago

I can connect to this URL with curl from inside the proxy. I've removed the problematic proxies; now only one proxy seems to be connected to the dashboard. But no servers or sentinels are displayed in codis-fe. I also recreated the server and sentinel pods, but that didn't work either. I've also checked etcd with etcd-browser: it shows only one proxy and no groups, servers, or sentinels. By the way, I can create groups and add the IP addresses of the servers manually.

tangcong commented 5 years ago
ERROR: logging before flag.Parse: I0212 06:58:46.134697       1 dashboard.go:59] Successful Create,create Service web-codis-dashboard in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.134740       1 dashboard.go:190] deploy codis dashboard image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.135471       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-dashboard in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.216557       1 dashboard.go:77] Successful Create,create StatefulSet web-codis-dashboard in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.216627       1 codiscluster_controller.go:167] reconcile dashboard succ
ERROR: logging before flag.Parse: I0212 06:58:46.217034       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create StatefulSet web-codis-dashboard in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.300882       1 proxy.go:72] Successful Create,create Service web-codis-proxy in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.300924       1 proxy.go:284] deploy codis proxy image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.302047       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-proxy in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.330207       1 proxy.go:90] Successful Create,create Deploy web-codis-proxy in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.330360       1 proxy.go:422] codis proxy hpa:{1 3 10}
ERROR: logging before flag.Parse: I0212 06:58:46.331222       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Deploy web-codis-proxy in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.538063       1 proxy.go:108] Successful Create,create HPA web-codis-hpa in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.538083       1 codiscluster_controller.go:173] reconcile proxy succ
ERROR: logging before flag.Parse: I0212 06:58:46.538289       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create HPA web-codis-hpa in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.628963       1 fe.go:58] Successful Create,create Service web-codis-fe in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.628992       1 fe.go:212] deploy codis fe image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.629463       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-fe in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.659217       1 fe.go:76] Successful Create,create Deploy web-codis-fe in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.659236       1 codiscluster_controller.go:179] reconcile fe succ
ERROR: logging before flag.Parse: I0212 06:58:46.659890       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Deploy web-codis-fe in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.693087       1 redis.go:62] Successful Create,create Service web-codis-redis in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.693128       1 redis.go:224] deploy codis-server image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.693440       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-redis in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.729154       1 redis.go:80] Successful Create,create StatefulSet web-codis-redis in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.729206       1 codiscluster_controller.go:185] reconcile redis succ
ERROR: logging before flag.Parse: I0212 06:58:46.729652       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create StatefulSet web-codis-redis in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.752893       1 sentinel.go:62] Successful Create,create Service web-codis-redis-sentinel in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.752932       1 sentinel.go:220] deploy redis-sentinel image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.753208       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-redis-sentinel in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.776349       1 sentinel.go:80] Successful Create,create StatefulSet web-codis-redis-sentinel in CodisCluster web-codis successful

The codis-operator log shows that every component was created successfully at that time. It is strange that I cannot find any error message. Are all the component pods running? I think I've got it: for now, you have to create groups and add the redis/sentinel instances to your cluster manually. Later, I will make the operator add the components to codis-fe automatically.
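
Until that automation exists, the manual step could be scripted instead of clicked through in codis-fe. The sketch below only prints codis-admin command lines; it does not run them. The flags are written from memory of codis-admin's usage text, so treat them as an assumption and confirm with `codis-admin --help`. The server and sentinel addresses are made-up placeholders; the dashboard address is the one from the proxy log above.

```python
def manual_topology_commands(dashboard, groups, sentinels):
    """Return codis-admin argv lists that create groups and register instances.

    groups maps a group id to the list of codis-server addresses in that group.
    """
    base = ["codis-admin", f"--dashboard={dashboard}"]
    commands = []
    for gid, servers in sorted(groups.items()):
        commands.append(base + ["--create-group", f"--gid={gid}"])
        for addr in servers:
            commands.append(base + ["--group-add", f"--gid={gid}", f"--addr={addr}"])
    for addr in sentinels:
        commands.append(base + ["--sentinel-add", f"--addr={addr}"])
    if sentinels:
        # Ask the dashboard to push the current group layout to the sentinels.
        commands.append(base + ["--sentinel-resync"])
    return commands


commands = manual_topology_commands(
    dashboard="web-codis-dashboard.default.svc.cluster.local:18080",
    groups={1: ["10.0.0.11:6379", "10.0.0.12:6379"]},  # placeholder pod IPs
    sentinels=["10.0.0.21:26379"],                      # placeholder pod IP
)
for argv in commands:
    print(" ".join(argv))
```
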

oruchreis commented 5 years ago

Hmm, then it's my bad. I expected to see every component in codis-fe automatically, which is what the k8s YAML scripts in the codis repo do. If I add every component and create the groups myself, then when I scale up or down, will I have to add the new servers to a group or create new groups myself?

tangcong commented 5 years ago

Yes. I will make it add components to codis-fe automatically as soon as possible (I have been busy recently).

ZhangSIming-blyq commented 2 years ago

In my case, the codis-operator error is "codis-dashboard.codis-operator.svc.cluster.local: no such host". The cause is that my cluster cannot resolve xxx.xxx.svc.cluster.local; it resolves names like "xxx.xxx.svc.[mycluster].local" instead. Is there a configuration option to change this?

ZhangSIming-blyq commented 2 years ago

[image] It is hard-coded.
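
As an illustration of how a hard-coded cluster.local suffix could be avoided, the sketch below derives the cluster domain from the pod's resolv.conf search list, which Kubernetes normally populates with an `svc.<cluster-domain>` entry. This is not code from codis-operator; the function names and the sample domain mycluster.local are hypothetical.

```python
def cluster_domain_from_resolv_conf(text, default="cluster.local"):
    """Infer the cluster domain from the 'search' line of a resolv.conf."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "search":
            for domain in fields[1:]:
                # Pods usually get a search entry of the form svc.<cluster-domain>.
                if domain.startswith("svc."):
                    return domain[len("svc."):].rstrip(".")
    return default


def service_fqdn(service, namespace, cluster_domain):
    """Build the in-cluster DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


# Sample resolv.conf as it might look in a pod; inside a real pod you would
# read /etc/resolv.conf instead.
sample = (
    "nameserver 10.0.0.10\n"
    "search codis-operator.svc.mycluster.local svc.mycluster.local mycluster.local\n"
    "options ndots:5\n"
)
domain = cluster_domain_from_resolv_conf(sample)
print(service_fqdn("codis-dashboard", "codis-operator", domain))
```

An environment variable or a command-line flag on the operator would work just as well; reading resolv.conf simply avoids extra configuration.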

tangcong commented 2 years ago

It is a demo and not ready for production (this is explained in the README). My job has changed and I have no time to improve it. @ZhangSIming-blyq

ZhangSIming-blyq commented 2 years ago

ok, thanks anyway