antrea-io / antrea

Kubernetes networking based on Open vSwitch
https://antrea.io
Apache License 2.0
1.66k stars 366 forks source link

Inconsistent validation of ExternalIPPool while creation of Egress resource #6718

Open jainpulkit22 opened 5 days ago

jainpulkit22 commented 5 days ago

Describe the bug

While creating Egress resource if we provide both egressIP and externalIPPool, then the validator checks that whether the IPPool exists or not and if it does not exist then the egress is not created, but on the other hand if we provide only externalIPPool the validator does not checks that whether the IPPool is existing or not and the egress is created, and later when the IPPool is created the IP is assigned to the egress.

To Reproduce

Create two egress resources one with both egressIP and externalIPPool, and other with only externalIPPool specified, but make sure that externalIPPool does not exist in the environment, the latter one will be created successfully while the former one will give validation error.

Expected

The behaviour should be consistent i.e in both the cases it should be validated that whether the externalIPPool exists or not, and instead of validation error and rejecting the egress in case both egressIP and externalIPPool are provided it should just dispaly a message or event that externalIPpool does not exist, and later when user creates the IPPool it should sync the egress just in a similar manner as it happens in case when only externalIPPool is specified.

Actual behavior

apiVersion: crd.antrea.io/v1beta1
kind: Egress
metadata:
  name: egress-prod2
  namespace: kube-system
spec:
  appliedTo:
    namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: kube-system
  externalIPPool: external-ip-pool1

when I applied this crd external-ip-pool1 was not existing but the egress was created successfully without any validation error or message.

Screenshot 2024-10-07 at 10 18 50 AM

Now, when I tried to apply the following egress:

apiVersion: crd.antrea.io/v1beta1
kind: Egress
metadata:
  name: egress-prod2
  namespace: kube-system
spec:
  appliedTo:
    namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: kube-system
  externalIPPool: external-ip-pool1
  egressIP: 169.254.0.4

The egress creation was rejected with the following message:

Screenshot 2024-10-07 at 10 36 03 AM
antoninbas commented 4 days ago

At first glance, I tend to agree with you. It seems wrong to have an inconsistent behavior between both cases.

The behaviour should be consistent i.e in both the cases it should be validated that whether the externalIPPool exists or not, and instead of validation error and rejecting the egress in case both egressIP and externalIPPool are provided it should just dispaly a message or event that externalIPpool does not exist, and later when user creates the IPPool it should sync the egress just in a similar manner as it happens in case when only externalIPPool is specified.

That should be discussed though. Maybe strict validation is a good thing here, and we could always check that the provided ExternalIPPool exists, regardless of whether an EgressIP is provided or not.

When I look at the Egress controller code, it seems that the original intent was indeed to have some validation present: https://github.com/antrea-io/antrea/blob/ab32250d5a1c3103acb0f07fb4a67a2357a7930e/pkg/controller/egress/controller.go#L289-L296

Of course, the TODO is not up-to-date since the current code does indeed validate that the provided EgressIP falls in the provided ExternalIPPool.

However, if we want to have validation, we should probably have some improvements. Today, it is possible to delete an ExternalIIPool regardless of whether or not it is referenced by an Egress resource: https://github.com/antrea-io/antrea/blob/ab32250d5a1c3103acb0f07fb4a67a2357a7930e/pkg/controller/externalippool/validate.go#L76-L80

That doesn't seem compatible with strict validation on Egress creation.

That being said, loose coupling of resources tend to be preferred in K8s, so maybe no validation - as you suggested - would be the better approach. We should think about UX though: if a user provides an Egress with an ExternalIPPool and an Egress IP that is not part of the pool, maybe we could add an error condition to the Egress Status to reflect the issue.

cc @tnqn

jainpulkit22 commented 4 days ago

That being said, loose coupling of resources tend to be preferred in K8s, so maybe no validation - as you suggested - would be the better approach. We should think about UX though: if a user provides an Egress with an ExternalIPPool and an Egress IP that is not part of the pool, maybe we could add an error condition to the Egress Status to reflect the issue.

Yes, we can log the validation related errors i.e either externalIPPool does not exist of EgressIP is not part of the pool in status or as an event in the egress resource that will be created. The only challenging thing is that in normal case when only externalIPPool is provided there is no issue, whenever it is created the egress will be synced, but in case of egressIP and externalIPPool if the IPPool is created afterwards we need to revalidate it to ensure that egressIP forms a part of externalIPPool.