kubernetes-sigs / cluster-api-provider-openstack

Cluster API implementation for OpenStack
https://cluster-api-openstack.sigs.k8s.io/
Apache License 2.0

missing SecurityGroup on neutron API-Loadbalancer #694

Closed teutonet closed 3 years ago

teutonet commented 3 years ago

/kind bug

Hi.

What steps did you take and what happened: I deployed a k8s cluster via the openstack-api-provider and external-cloud-provider with the following configuration:

  managedAPIServerLoadBalancer: true
  managedSecurityGroups: true
  nodeCidr: 10.6.0.0/24
  useOctavia: false

The load balancer and the corresponding pools, members, monitors and listeners were created, and the first control-plane node started and bootstrapped successfully. But my kind-based bootstrap cluster couldn't reach the API of the newly created node. After I updated the default security group and added port 6443/tcp, the capo-provider was able to proceed and fetch its information about the cluster status.

What did you expect to happen: I expected to reach the API via the load balancer directly, without manual steps such as adding SecurityGroupRules to the default group.

Anything else you would like to add: It would seem fine to me if the control-plane security group were assigned to the load balancer port, or a dedicated security group that allows access to the API port and maybe apiServerLoadBalancerAdditionalPorts. Additionally, the default group could be removed from the load balancer port.

Or is the use of neutron-lbaas obsolete, and should I use Octavia? Is there a workaround for this problem?
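A minimal sketch of the rule set such a dedicated security group would need, assuming the API server listens on 6443/tcp as in this report. The helper name `lb_ingress_rules` and the open `0.0.0.0/0` source are illustrative assumptions; the dictionary keys follow Neutron's security-group-rule fields. This is not CAPO's implementation.

```python
from typing import Dict, Iterable, List


def lb_ingress_rules(
    api_port: int = 6443,
    additional_ports: Iterable[int] = (),
    remote_ip_prefix: str = "0.0.0.0/0",  # assumption: open to all sources
) -> List[Dict[str, object]]:
    """Build one TCP ingress rule per load balancer port (hypothetical helper)."""
    # De-duplicate so a port listed twice yields a single rule.
    ports = sorted({api_port, *additional_ports})
    rules: List[Dict[str, object]] = []
    for port in ports:
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid TCP port: {port}")
        rules.append({
            "direction": "ingress",
            "ethertype": "IPv4",
            "protocol": "tcp",
            "port_range_min": port,
            "port_range_max": port,
            "remote_ip_prefix": remote_ip_prefix,
        })
    return rules


if __name__ == "__main__":
    # Example: the API port plus one additional load balancer port.
    for rule in lb_ingress_rules(additional_ports=[22]):
        print(rule["protocol"], rule["port_range_min"], rule["remote_ip_prefix"])
```

In practice the source prefix would likely be narrowed to the networks that actually need to reach the API.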

Environment:

Thanks for the great work.

Regards Tino

hidekazuna commented 3 years ago

@TeutoNet I personally use Octavia and have never used useOctavia: false. Which OpenStack version do you use? I ask because Neutron LBaaS was retired in the Train release cycle: https://docs.openstack.org/releasenotes/neutron/train.html#relnotes-15-0-0-stable-train-deprecation-notes

jichenjc commented 3 years ago

@hidekazuna What's the difference between using Neutron LBaaSv2 and Octavia? If this is a Neutron LBaaSv2-only issue, then maybe we can suggest moving to Octavia. And if it's just a minor change to the capi code, maybe we can consider taking it, so that users who are still on LBaaSv2 stay functional and have more time to migrate to Octavia. What's your suggestion?

hidekazuna commented 3 years ago

@jichenjc I tested external-cloud-provider with Octavia. It works fine for creating a Deployment, a Pod, and a LoadBalancer-type Service, except for #691.

hidekazuna commented 3 years ago

@TeutoNet One more question to clarify the issue. The security group k8s-cluster-default-cluster name-secgroup-controlplane was created, but it did not have the rule 6443/tcp, right? If so, was no rule added to the group at all? Can you provide the output of kubectl logs -l control-plane=capo-controller-manager -c manager -n capo-system?

teutonet commented 3 years ago

> @TeutoNet I personally use Octavia and have never used useOctavia: false. Which OpenStack version do you use? I ask because Neutron LBaaS was retired in the Train release cycle: https://docs.openstack.org/releasenotes/neutron/train.html#relnotes-15-0-0-stable-train-deprecation-notes

Hi,

We are using OpenStack Queens, but we are able to enable Octavia. However, Octavia uses a lot more resources per load balancer. I don't know exactly, but I think that if we enable Octavia per project and the clusters use the cloud-provider, one Octavia load balancer will be created per Service, unless we use https://github.com/kubernetes/cloud-provider-openstack/blob/master//docs/octavia-ingress-controller/using-octavia-ingress-controller.md

But if it is needed, we can do that.

teutonet commented 3 years ago

> @hidekazuna What's the difference between using Neutron LBaaSv2 and Octavia? If this is a Neutron LBaaSv2-only issue, then maybe we can suggest moving to Octavia. And if it's just a minor change to the capi code, maybe we can consider taking it, so that users who are still on LBaaSv2 stay functional and have more time to migrate to Octavia. What's your suggestion?

I just had a quick look at the code and I can't estimate how many changes are needed ;). I can give Octavia a try and look into it. But Octavia has a much bigger footprint than an haproxy in a namespace. Additionally, there is configuration and management overhead when using Octavia (certificate handling and so on).

I also tried to deploy the clusters without using the API load balancer. The floating IP was transferred between the control nodes. Is this HA behaviour provided by capi/capo, or is this only the case while deploying the cluster?

teutonet commented 3 years ago

> @TeutoNet One more question to clarify the issue. The security group k8s-cluster-default-cluster name-secgroup-controlplane was created, but it did not have the rule 6443/tcp, right? If so, was no rule added to the group at all? Can you provide the output of kubectl logs -l control-plane=capo-controller-manager -c manager -n capo-system?

No, all needed groups were created, but the internal OpenStack port of the load balancer has only the default security group. Access to port 6443 is therefore denied from inside and outside: no one is in that group, and traffic from outside, e.g. from the mgmt cluster, is denied in any case.
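The denial described above can be illustrated with a small model of how security group ingress rules are matched. This is a simplified sketch, not Neutron's implementation: `IngressRule` and `ingress_allowed` are hypothetical names, the group IDs and addresses are made up, and the default group is modelled as a single rule that admits only members of that same group.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class IngressRule:
    port_min: int = 1
    port_max: int = 65535
    protocol: Optional[str] = None          # None means any protocol
    remote_group_id: Optional[str] = None   # admit members of this group
    remote_ip_prefix: Optional[str] = None  # admit sources in this CIDR


def ingress_allowed(
    rules: Iterable[IngressRule],
    src_ip: str,
    src_groups: Set[str],
    port: int,
    protocol: str = "tcp",
) -> bool:
    """Return True if any rule admits the connection (default is deny)."""
    for rule in rules:
        if rule.protocol not in (None, protocol):
            continue
        if not rule.port_min <= port <= rule.port_max:
            continue
        if rule.remote_group_id is not None:
            if rule.remote_group_id in src_groups:
                return True
            continue
        if rule.remote_ip_prefix is None:
            return True  # rule has no source restriction
        if ipaddress.ip_address(src_ip) in ipaddress.ip_network(rule.remote_ip_prefix):
            return True
    return False


if __name__ == "__main__":
    default_only = [IngressRule(remote_group_id="default")]
    # Control-plane node (in its own group) and external mgmt cluster (no group).
    print(ingress_allowed(default_only, "10.6.0.5", {"controlplane"}, 6443))
    print(ingress_allowed(default_only, "203.0.113.10", set(), 6443))
    # With an explicit 6443/tcp rule added, as in the manual workaround.
    fixed = default_only + [IngressRule(6443, 6443, "tcp", remote_ip_prefix="0.0.0.0/0")]
    print(ingress_allowed(fixed, "203.0.113.10", set(), 6443))
```

Because neither the nodes nor the external management cluster are members of the default group, only an explicit rule for 6443/tcp lets them through.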

hidekazuna commented 3 years ago

@TeutoNet I am not sure that Octavia on OpenStack Queens works well. But yes, you can try.

> I also tried to deploy the clusters without using the API load balancer. The floating IP was transferred between the control nodes. Is this HA behaviour provided by capi/capo, or is this only the case while deploying the cluster?

Multiple controller nodes without a load balancer for the API server are not supported. If you want to deploy the cluster without a load balancer, please use the non-HA template, which has a single controller.

> > @TeutoNet One more question to clarify the issue. The security group k8s-cluster-default-cluster name-secgroup-controlplane was created, but it did not have the rule 6443/tcp, right? If so, was no rule added to the group at all? Can you provide the output of kubectl logs -l control-plane=capo-controller-manager -c manager -n capo-system?

> No, all needed groups were created, but the internal OpenStack port of the load balancer has only the default security group. Access to port 6443 is therefore denied from inside and outside: no one is in that group, and traffic from outside, e.g. from the mgmt cluster, is denied in any case.

Thanks for your comment. I studied Neutron LBaaS v2 and found a big difference between Neutron LBaaS v2 and Octavia. With Neutron LBaaS v2, we need to create a security group for the vip_port and add it to the port ourselves. With Octavia, we do not need to create a security group ourselves: when we create a listener, a security group is created and added to the port automatically. That's why you had to create the security group and rule and add them yourself.

teutonet commented 3 years ago

Hi, thanks for the response. I will have to discuss internally how we will proceed. Is there a chance that you will implement this, or that we can open a pull request to get the feature in? Or will the feature not be integrated at all because neutron-lbaas is obsolete?

Thanks for the help.

Regards Tino

jichenjc commented 3 years ago

I vote to add it and clearly document that it's only useful until some version. Maybe you can help by pushing a PR.

hidekazuna commented 3 years ago

@TeutoNet Unfortunately I do not have a working Neutron LBaaS v2 environment, so I will not create a PR myself. I can help you create one, of course. Now we know what to do: create a security group and add it to the vip_port if useOctavia: false.

FYI: We are discussing when to stop using Neutron LBaaS v2 in #695.
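The conditional described above could be sketched as a small planning function. This is only an outline under stated assumptions, not the change that would land in CAPO: `vip_port_security_plan`, the step names, and the group name pattern are all hypothetical, and no OpenStack API is called.

```python
from typing import List, Sequence, Tuple

Step = Tuple[str, str]  # (action, target), both illustrative labels


def vip_port_security_plan(
    use_octavia: bool,
    cluster_name: str,
    ports: Sequence[int] = (6443,),
) -> List[Step]:
    """List the manual steps needed for the load balancer's vip_port."""
    if use_octavia:
        # Per the discussion above, Octavia creates and attaches the
        # security group itself when a listener is created.
        return []
    group = f"k8s-cluster-{cluster_name}-secgroup-lb"  # hypothetical name
    steps: List[Step] = [("create_security_group", group)]
    steps += [("add_ingress_rule", f"{group}:tcp/{port}") for port in ports]
    steps.append(("attach_to_vip_port", group))
    return steps


if __name__ == "__main__":
    for action, target in vip_port_security_plan(False, "demo"):
        print(action, target)
```

Keeping the plan as plain data makes the ordering explicit: the group must exist before rules are added, and it is attached to the port last.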

teutonet commented 3 years ago

Hi,

We decided to switch to Octavia and maybe update our old OpenStack deployments if needed.

Thanks for your support.

Regards Tino