kubernetes-sigs / cluster-api-provider-openstack

Cluster API implementation for OpenStack
https://cluster-api-openstack.sigs.k8s.io/
Apache License 2.0

Network filter not properly filtering #1387

Closed justapill closed 1 year ago

justapill commented 1 year ago

/kind bug

What steps did you take and what happened:

In my environment, we have a pre-defined network and subnet for internal traffic between the nodes. nodeCidr is not defined. capo-controller-manager is failing to reconcile the k8s cluster resource. It's possible this is user error, as this is the first time I've used Cluster API as a deployment method for k8s. The error prints out all of the network names that we have in our openstack project, except the one that is specified in the network filter in the following yaml.

Here's the YAML I'm using for the machine template:

kind: OpenStackMachineTemplate
metadata:
  name: capi-openstack-0-md-0
  namespace: default
spec:
  template:
    spec:
      cloudName: openstack
      flavor: k8s-worker
      identityRef:
        kind: Secret
        name: capi-openstack-0-cloud-config
      image: ubuntu-2004-kube-v1.23.10
      sshKeyName: k8skey
      networks:
      - filter:
          name: VLAN<redacted>
          id: <redacted>
        subnets:
        - filter:
            name: <redacted>

Error: E1118 19:37:07.491376 1 controller.go:317] controller/openstackcluster "msg"="Reconciler error" "error"="failed to find only one network

What did you expect to happen: Cluster to reconcile with specified network filter.

Anything else you would like to add: As a side note, the subnet_id field mentioned in the docs at https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/docs/book/src/clusteropenstack/configuration.md#multiple-networks is returned as an invalid value.

jichenjc commented 1 year ago

I think, from the code at https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/controllers/openstackcluster_controller.go#L449,

your filter should make CAPO find one and only one network; otherwise this error is reported.

The error prints out all of the network names that we have in our openstack project, except the one that is specified in the network filter in the following yaml.

So you need to check how many networks you have and what is left after applying the filter. That might help fix the issue.
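The "one and only one" rule described above can be sketched roughly as follows. This is an illustration only, not CAPO's actual code: the `Network` record and the `find_single_network` helper are hypothetical names, and the error text simply echoes the message quoted in this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    """Hypothetical stand-in for a network record returned by OpenStack."""
    id: str
    name: str
    project_id: str


def find_single_network(networks, **wanted):
    """Return the one network whose fields equal every non-empty wanted value."""
    matches = [
        n for n in networks
        if all(getattr(n, field) == value for field, value in wanted.items() if value)
    ]
    if len(matches) != 1:
        names = [n.name for n in matches]
        raise LookupError(f"failed to find only one network: got {len(matches)} {names}")
    return matches[0]


# Made-up data: two projects reuse the same network name.
networks = [
    Network("id-1", "VLAN100", "proj-a"),
    Network("id-2", "VLAN200", "proj-a"),
    Network("id-3", "VLAN100", "proj-b"),
]

# Name alone is ambiguous here; name plus project narrows it to one.
print(find_single_network(networks, name="VLAN100", project_id="proj-a").id)
```

Both zero matches and several matches raise the same error in this sketch, so the error alone does not say whether the filter was too strict or too loose.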

justapill commented 1 year ago

Is there a way to check the network filter before deploying in kind? How is CAPO filtering for the network?

I've also tried using the uuid parameter with the same results.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha6
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  networks:
  - uuid: your_network_id

(pulled from docs)

jichenjc commented 1 year ago

Check the logic I mentioned above: it calls GetNetworksByFilter (there are some wrapper calls inside CAPO)

and finally reaches https://github.com/gophercloud/gophercloud/blob/master/openstack/networking/v2/networks/requests.go#L49

I think your uuid attempt is incorrect per https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/api/v1alpha6/types.go#L70

So please keep the id+name combination. Per the logic above, the filter uses those inputs (id+name) to filter the network out of the list.
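As far as I can tell, the filter fields end up as query parameters on the Networking API's network list call. A rough sketch of that mapping, assuming the standard `/v2.0/networks` path and the `id`, `name` and `project_id` query filters; the helper name and the endpoint host are made up for illustration:

```python
from urllib.parse import urlencode


def network_list_url(base_url, **filters):
    """Build a network-list URL, skipping filter fields left empty."""
    query = {k: v for k, v in filters.items() if v not in (None, "")}
    path = base_url.rstrip("/") + "/v2.0/networks"
    return f"{path}?{urlencode(query)}" if query else path


# Hypothetical endpoint and values; project_id is empty, so it is dropped.
print(network_list_url(
    "https://neutron.example.invalid:9696",
    id="11111111-2222-3333-4444-555555555555",
    name="VLAN100",
    project_id="",
))
```

Every non-empty field narrows the server-side result, so a value that does not match exactly (for example a wrong project ID) yields no match rather than a partial one.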

justapill commented 1 year ago

I've given the filter as much info as I can:

---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
kind: OpenStackMachineTemplate
metadata:
  name: capi-openstack-0-md-0
  namespace: default
spec:
  template:
    spec:
      cloudName: openstack
      flavor: k8s-worker
      identityRef:
        kind: Secret
        name: capi-openstack-0-cloud-config
      image: ubuntu-2004-kube-v1.23.10
      sshKeyName: k8skey
      networks:
      - uuid: <redacted>
      - filter:
          name: VLAN<redacted>
          id: <redacted>
          projectId: <redacted>
        subnets:
        - filter:
            name: SVCS-K8S-INT
            id: <redacted>
            projectId: <redacted>
            cidr: <redacted>
            ipVersion: 4
            gateway_ip: <redacted>

Here's a snippet of the trace:

2022-01-12 20:11:37 +0000 UTC 8a691630371549cdb42159d62710cadf false [] []} {<network_id> VLAN<redacted> true ACTIVE [<subnet_id>] <project_id>

All of the networks are listed in the trace, even ones not in the specified project ID, except the network I'm filtering for. I'm beginning to think it's a domain configuration thing. We have all of our networks in one domain and segregated by project, and not by domain. For example, when I use the openstack CLI tool to list our networks, all of them are listed, even though I have a project ID specified in the clouds.yml.
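One way to test the visibility hypothesis is to summarize the listing that the credential actually gets back. The sketch below works on plain dictionaries with made-up IDs; `summarize_visibility` is a hypothetical helper, and the real listing would have to come from the openstack CLI or an SDK.

```python
from collections import Counter


def summarize_visibility(visible, wanted_id, wanted_project):
    """Summarize a network listing: is the wanted network there, and whose are the rest?"""
    per_project = Counter(net["project_id"] for net in visible)
    foreign = sum(n for project, n in per_project.items() if project != wanted_project)
    return {
        "wanted_visible": any(net["id"] == wanted_id for net in visible),
        "foreign_networks": foreign,
        "per_project": dict(per_project),
    }


# Hypothetical listing, as returned to the credential in use.
visible = [
    {"id": "net-a", "name": "VLAN10", "project_id": "proj-1"},
    {"id": "net-b", "name": "VLAN20", "project_id": "proj-2"},
    {"id": "net-c", "name": "VLAN30", "project_id": "proj-2"},
]

report = summarize_visibility(visible, wanted_id="net-z", wanted_project="proj-1")
print(report)
```

If `wanted_visible` is false while `foreign_networks` is non-zero, the credential can see other projects' networks but not the target one, which points at scoping or sharing rather than at the filter syntax.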

mdbooth commented 1 year ago

It sounds like you've found the root of the problem. However, if you're using pre-created subnets I would strongly encourage you to reference them by UUID instead of by filter in any case. They're unambiguous in all cases.

jichenjc commented 1 year ago

I'm beginning to think it's a domain configuration thing. We have all of our networks in one domain and segregated by project, and not by domain.

I'm not sure this is true.

Using the demo project, I can see only these:
+--------------------------------------+---------+----------------------------------------------------------------------------+
| ID                                   | Name    | Subnets                                                                    |
+--------------------------------------+---------+----------------------------------------------------------------------------+
| 77034471-c5c8-41ec-ae13-bb1189728f72 | shared  | 787911c0-7710-4a63-b51e-51ac3a78773b                                       |
| 8cfa1ce0-e0cb-466f-9e2e-7edb1a2a5911 | public  | 289297bf-057d-48de-a0f5-7d5cd27b8ca0, fc5e03ba-4578-4b94-89f7-bfb411ae6b20 |
| da9284af-b121-412a-9095-12443ec02484 | private | 6b32e1aa-5ed3-4787-b723-3c4b8788059e, d5114948-e18a-4ff7-8df4-c15cf63e6089 |
+--------------------------------------+---------+----------------------------------------------------------------------------+

But when I switch to admin, I can see these:
+--------------------------------------+------------------------------------------------+----------------------------------------------------------------------------+
| ID                                   | Name                                           | Subnets                                                                    |
+--------------------------------------+------------------------------------------------+----------------------------------------------------------------------------+
| 057e2e25-657f-454a-a35c-9b5cfc853028 | n1                                             | df94d178-f36c-4409-b5dc-7e238f33234c                                       |
| 5425ce66-a4e5-4e1f-8ec8-30689dcd7ea4 | k8s-clusterapi-cluster-default-capi-quickstart | c64134c1-de81-443f-a0f6-99b0e309310d                                       |
....

I only have one domain.

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

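This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage (https://www.kubernetes.dev/docs/guide/issue-triage/)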

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 1 year ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/cluster-api-provider-openstack/issues/1387#issuecomment-1519018847):

>The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
>This bot triages issues according to the following rules:
>- After 90d of inactivity, `lifecycle/stale` is applied
>- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
>- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
>You can:
>- Reopen this issue with `/reopen`
>- Mark this issue as fresh with `/remove-lifecycle rotten`
>- Offer to help out with [Issue Triage][1]
>
>Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
>/close not-planned
>
>[1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.