kubernetes / cloud-provider-openstack

Apache License 2.0

[openstack-cloud-controller-manager] Support Octavia Availability Zone #1223

Closed ITD27M01 closed 4 years ago

ITD27M01 commented 4 years ago

/kind feature

What happened: The Ussuri release of Octavia supports availability zones, which let operators specify capabilities such as the compute availability zone for the amphora instances. For large distributed clouds (enterprise and public), this is a must-have feature: it improves reliability and lets operators use their hardware efficiently and according to the workload. Currently, we have to create a separate zone just for load balancers (with one huge load balancer management network), and all cloud projects share that hardware pool, which sometimes leads to conflicts and resource exhaustion. It also creates a single point of failure.

What you expected to happen: I suggest adding an "availability-zone" parameter that is passed in the Octavia load balancer create request [1]:

{
    "loadbalancer": {
        "name": "best_load_balancer",
        "provider": "amphora",
        "availability_zone": "my_az",
        "vip_subnet_id": "d4af86e1-0051-488c-b7a0-527f97490c9a"
    }
}

This follows the same logic as flavor-id [2].

It looks like gophercloud doesn't have such an attribute yet [3] [4].
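As a rough illustration of the request shape, here is a minimal Go sketch that serializes a create request with an optional `availability_zone` field. The `createOpts` struct and `buildCreateBody` helper are hypothetical names made up for this example; they are not the gophercloud types, and the real library may model the field differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// createOpts is a hypothetical options struct for illustration only.
// It is NOT the gophercloud CreateOpts type.
type createOpts struct {
	Name             string `json:"name,omitempty"`
	Provider         string `json:"provider,omitempty"`
	VipSubnetID      string `json:"vip_subnet_id,omitempty"`
	FlavorID         string `json:"flavor_id,omitempty"`
	AvailabilityZone string `json:"availability_zone,omitempty"`
}

// buildCreateBody wraps the options in the "loadbalancer" envelope used by
// the create request. Empty optional fields are omitted from the body.
func buildCreateBody(opts createOpts) (string, error) {
	body, err := json.Marshal(map[string]createOpts{"loadbalancer": opts})
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	body, err := buildCreateBody(createOpts{
		Name:             "best_load_balancer",
		Provider:         "amphora",
		VipSubnetID:      "d4af86e1-0051-488c-b7a0-527f97490c9a",
		AvailabilityZone: "my_az",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body)
}
```

Because of `omitempty`, leaving the zone unset produces the same body as before the field existed, which mirrors how an optional setting like flavor-id behaves.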

[1] https://docs.openstack.org/api-ref/load-balancer/v2/?expanded=create-a-load-balancer-detail#create-a-load-balancer
[2] https://github.com/kubernetes/cloud-provider-openstack/blob/master/pkg/cloudprovider/providers/openstack/openstack_loadbalancer.go#L474
[3] https://github.com/gophercloud/gophercloud/blob/master/openstack/loadbalancer/v2/loadbalancers/requests.go
[4] https://github.com/gophercloud/gophercloud/issues/2021

Environment:

jichenjc commented 4 years ago

This might be a general question: the feature was added in the Ussuri release, but does Octavia have microversions? If not, how do we support releases before Ussuri? For example, we use the 2.60 API version here: https://github.com/kubernetes/cloud-provider-openstack/blob/master/pkg/csi/cinder/openstack/openstack_volumes.go#L170

https://docs.openstack.org/api-ref/load-balancer/v2/?expanded=create-a-load-balancer-detail#create-a-load-balancer documents the following, but does not indicate that it is a new parameter on the API side:


Name | In | Type | Description
-- | -- | -- | --
availability_zone (Optional) | body | object | An availability zone name.

ITD27M01 commented 4 years ago

Hello @jichenjc ,

I have a little experience contributing to Kuryr-Kubernetes (native OpenStack Neutron-based networking in Kubernetes). It relies heavily on Octavia as the basis for Kubernetes services, and one example is how it checks whether native Octavia ACLs are supported in the deployed version [1].

We need one more call to the API root [2] to discover the available capabilities and then act based on the version number:

        if v >= _OCTAVIA_ACL_VERSION:
            self._octavia_acls = True
            LOG.info('Octavia supports ACLs for Amphora provider.')
        if v >= _OCTAVIA_DL_VERSION:
            self._octavia_double_listeners = True
            LOG.info('Octavia supports double listeners (different '
                     'protocol, same port) for Amphora provider.')
        if v >= _OCTAVIA_TAGGING_VERSION:
            LOG.info('Octavia supports resource tags.')
            self._octavia_tags = True
        else:
            v_str = '%d.%d' % v
            LOG.warning('[neutron_defaults]resource_tags is set, but Octavia '
                        'API %s does not support resource tagging. Kuryr '
                        'will put requested tags in the description field of '
                        'Octavia resources.', v_str)

Here is the JSON response for Train:

{
    "versions": [
        {
            "id": "v2.0",
            "status": "SUPPORTED",
            "updated": "2016-12-11T00:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        },
        {
            "id": "v2.1",
            "status": "SUPPORTED",
            "updated": "2018-04-20T00:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        },
        {
            "id": "v2.2",
            "status": "SUPPORTED",
            "updated": "2018-07-31T00:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        },
        {
            "id": "v2.3",
            "status": "SUPPORTED",
            "updated": "2018-12-18T00:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        },
        {
            "id": "v2.4",
            "status": "SUPPORTED",
            "updated": "2018-12-19T00:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        },
        {
            "id": "v2.5",
            "status": "SUPPORTED",
            "updated": "2019-01-21T00:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        },
        {
            "id": "v2.6",
            "status": "SUPPORTED",
            "updated": "2019-01-25T00:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        },
        {
            "id": "v2.7",
            "status": "SUPPORTED",
            "updated": "2018-01-25T12:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        },
        {
            "id": "v2.8",
            "status": "SUPPORTED",
            "updated": "2019-02-12T00:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        },
        {
            "id": "v2.9",
            "status": "SUPPORTED",
            "updated": "2019-03-04T00:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        },
        {
            "id": "v2.10",
            "status": "SUPPORTED",
            "updated": "2019-03-05T00:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        },
        {
            "id": "v2.11",
            "status": "SUPPORTED",
            "updated": "2019-06-24T00:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        },
        {
            "id": "v2.12",
            "status": "SUPPORTED",
            "updated": "2019-09-11T00:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        },
        {
            "id": "v2.13",
            "status": "CURRENT",
            "updated": "2019-09-13T00:00:00Z",
            "links": [
                {
                    "href": "https://octavia-public-testos.example.com:9876/v2",
                    "rel": "self"
                }
            ]
        }
    ]
}
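
A version check against a response like the one above could be sketched in Go as follows. This is only an illustration: `parseVersion` and `supports` are hypothetical helpers, and the 2.14 threshold used in `main` is an assumption about when availability zones appeared, so verify it against the Octavia API documentation. The one detail worth noting is that version IDs must be compared numerically, since "v2.9" sorts after "v2.13" as a string.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// apiVersion and versionList model only the fields needed from the
// version discovery response.
type apiVersion struct {
	ID     string `json:"id"`
	Status string `json:"status"`
}

type versionList struct {
	Versions []apiVersion `json:"versions"`
}

// parseVersion turns an ID such as "v2.13" into (2, 13).
func parseVersion(id string) (major, minor int, err error) {
	parts := strings.SplitN(strings.TrimPrefix(id, "v"), ".", 2)
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("malformed version %q", id)
	}
	if major, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, err
	}
	if minor, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, err
	}
	return major, minor, nil
}

// supports reports whether any advertised version is at least
// wantMajor.wantMinor, comparing the components as integers.
func supports(body []byte, wantMajor, wantMinor int) (bool, error) {
	var list versionList
	if err := json.Unmarshal(body, &list); err != nil {
		return false, err
	}
	for _, v := range list.Versions {
		major, minor, err := parseVersion(v.ID)
		if err != nil {
			return false, err
		}
		if major > wantMajor || (major == wantMajor && minor >= wantMinor) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Shortened stand-in for the Train response, which tops out at v2.13.
	body := []byte(`{"versions":[{"id":"v2.0","status":"SUPPORTED"},` +
		`{"id":"v2.9","status":"SUPPORTED"},{"id":"v2.13","status":"CURRENT"}]}`)
	// 2.14 is an assumed threshold for availability zone support.
	for _, minor := range []int{10, 14} {
		ok, err := supports(body, 2, minor)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("2.%d supported: %v\n", minor, ok)
	}
}
```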

[1] https://github.com/openstack/kuryr-kubernetes/blob/master/kuryr_kubernetes/controller/drivers/lbaasv2.py#L49
[2] https://github.com/openstack/kuryr-kubernetes/blob/master/kuryr_kubernetes/controller/drivers/lbaasv2.py#L66

jichenjc commented 4 years ago

@ITD27M01 We should specify the version in OCCM if we have to, but the approach you showed does not seem to be used in OCCM; only a version check is used, as you mentioned in the gophercloud issue. I guess gophercloud is a good place to add this if we have to:

        if v >= _OCTAVIA_ACL_VERSION:
            self._octavia_acls = True
            LOG.info('Octavia supports ACLs for Amphora provider.')
        if v >= _OCTAVIA_DL_VERSION:
            self._octavia_double_listeners = True
            LOG.info('Octavia supports double listeners (different '
                     'protocol, same port) for Amphora provider.')
        if v >= _OCTAVIA_TAGGING_VERSION:
            LOG.info('Octavia supports resource tags.')
            self._octavia_tags = True

ITD27M01 commented 4 years ago

@jichenjc After a little research, I realized that the kuryr-kubernetes approach is not suitable here. They use the version check to enable newer features for speed or convenience, but it does nothing for backward compatibility.

I suggest failing with a readable error if there is no availability zone support on the API side. It will simplify the code and help operators during cluster configuration.

I'll copy the microversion information for convenience:

The client can specify the minor version it wants to use:

openstack server migrate --live-migration --host srv-os-kvm01 --os-compute-api-version 2.99 63a5d8da-1f8d-495b-9bc2-b8bbb019c829 --debug

The first check occurs on the client side:

  File "/Users/igor.tiunov/anaconda3/envs/openstack/lib/python3.8/site-packages/openstackclient/compute/client.py", line 152, in check_api_version
    raise exceptions.CommandError(msg)
osc_lib.exceptions.CommandError: versions supported by client: 2.1 - 2.87

The upper bound is then checked on the server side:

openstack server migrate --live-migration --host srv-os-kvm01 --os-compute-api-version 2.87 63a5d8da-1f8d-495b-9bc2-b8bbb019c829 --debug

RESP BODY: {"computeFault": {"code": 406, "message": "Version 2.87 is not supported by the API. Minimum is 2.1 and maximum is 2.79."}}

So, when you specify the 2.60 API version here https://github.com/kubernetes/cloud-provider-openstack/blob/master/pkg/csi/cinder/openstack/openstack_volumes.go#L170

you don't add backward compatibility (the client or OCCM will still fail if the API doesn't support the feature); it is just a convenient way to handle the corresponding error. Backward compatibility is managed server-side by the service (Nova, Cinder, etc.): it uses a default minor version and requires the client to request a higher version if it needs new features: https://github.com/openstack/nova/blob/master/nova/api/openstack/api_version_request.py#L248
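
To make the server-side range check concrete, here is a small Go sketch of the logic behind the 406 response shown above. It is a simplified model, not Nova's implementation: the `microversion` type and `checkRequested` function are hypothetical names, and the error wording only loosely follows the real message.

```go
package main

import "fmt"

// microversion is a hypothetical major.minor pair used for illustration.
type microversion struct{ major, minor int }

// less reports whether m is strictly older than o.
func (m microversion) less(o microversion) bool {
	return m.major < o.major || (m.major == o.major && m.minor < o.minor)
}

func (m microversion) String() string {
	return fmt.Sprintf("%d.%d", m.major, m.minor)
}

// checkRequested models the server-side check: a requested version outside
// the [lowest, highest] range is rejected instead of being silently downgraded.
func checkRequested(req, lowest, highest microversion) error {
	if req.less(lowest) || highest.less(req) {
		return fmt.Errorf("version %s is not supported by the API: minimum is %s, maximum is %s",
			req, lowest, highest)
	}
	return nil
}

func main() {
	// Range taken from the 406 response quoted in this thread.
	lowest, highest := microversion{2, 1}, microversion{2, 79}
	for _, req := range []microversion{{2, 60}, {2, 87}} {
		if err := checkRequested(req, lowest, highest); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", req)
	}
}
```

The point of the sketch is that the caller either gets the behavior of the version it asked for or gets an explicit error; there is no third outcome where the request quietly succeeds with older semantics.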

jichenjc commented 4 years ago

I suggest failing if there is no availability zone support on the API side with a readable error. It will simplify the code and helps the operators during the cluster configuration.

I agree this is a better way to handle it, and it's the way microversions should work.

you don't add the backward compatibility (client or OCCM will fail in the case the API doesn't support feature), there is just a convenient way to handling the corresponding error.

I agree we should try v2.60 first; then, if it's not supported, either report an error asking the user to upgrade (e.g. the OpenStack deployment is too old) or ignore the error and not support the feature.

ITD27M01 commented 4 years ago

ignore the error but not support the feature

I think this is the worst solution, because the user (a cluster operator) will expect some behavior but not get it, since OCCM silently decides on its own what to do. If you check, you will not find microversions used that way. Their primary goal is to maintain backward compatibility on the server side, not on the client side.

Microversions are used in the linked code because the Nova API server defaults to an "old" API version (2.1), so you are forced to raise it to 2.60 for multiattach volumes.

So I think we don't "try v2.60 first"; we are forced to request the capabilities and features of v2.60. There are then two possible outcomes:

  1. The Nova server switches from the old mode (2.1) to the new one (2.60).
  2. The Nova server reports a 406 error.

So code that checks the currently supported version won't help us much, because OCCM should fail in either case (a microversion request error or, where microversions are not used, as with Octavia, an unsupported availability_zone parameter) and report to the user that availability zones are not supported.

jichenjc commented 4 years ago

the code to check the currently supported version won't help us much, because OCCM should fail in either case (microversion request error or, if microversions are not used as for Octavia, the availability_zone param is not supported) with an error and report it to the user that availability zones are not supported.

OK, I think that's reasonable: if we can't do what the user wants, we should give an explicit error. Thanks for the detailed description above.

lingxiankong commented 4 years ago

@jichenjc Just FYI, OCCM already supports checking the Octavia version; see https://github.com/kubernetes/cloud-provider-openstack/blob/master/pkg/util/openstack/loadbalancer.go#L90

ITD27M01 commented 4 years ago

@lingxiankong Thank you for pointing it out. Half of the work is already done. I am confused by the fact that the parameters are simply ignored if the feature is not supported, but I will be satisfied with a warning message:

    if openstackutil.IsOctaviaFeatureSupported(lbaas.lb, openstackutil.OctaviaFeatureVIPACL) {
        klog.V(4).Info("LoadBalancerSourceRanges is supported")
        listenerAllowedCIDRs = sourceRanges.StringSlice()
    } else {
        klog.Warning("LoadBalancerSourceRanges is ignored")
    }