submariner-io / submariner

Networking component for interconnecting Pods and Services across Kubernetes clusters.
https://submariner.io
Apache License 2.0

Headless service without selector doesn't work #1537

Closed mkimuram closed 2 years ago

mkimuram commented 3 years ago

What happened: Headless service without selector doesn't work.

What you expected to happen: Headless service without selector does work

How to reproduce it (as minimally and precisely as possible):

    cat << EOF | kubectl apply -f -
    apiVersion: v1
    kind: Endpoints
    metadata:
      name: test-vm
    subsets:

    subctl export service -n default test-vm

- Check that globalingressip for it is assigned

    kubectl get globalingressip test-vm
    Error from server (NotFound): globalingressips.submariner.io "test-vm" not found


**Anything else we need to know?**:

- Logs from globalnet pod at that time

    kubectl logs -n submariner-operator -l app=submariner-globalnet
    I0819 14:24:20.249605 1 service_export_controller.go:145] Processing ServiceExport "default/test-vm"
    I0819 14:24:20.350946 1 ingress_pod_controller.go:91] Created Pod controller for (default/test-vm) with selector ""
    I0819 14:24:20.442889 1 ingress_pod_controller.go:123] "create" ingress Pod default/cl1 for service test-vm
    I0819 14:24:20.480555 1 global_ingressip_controller.go:124] Processing created &v1.GlobalIngressIP{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"pod-cl1", GenerateName:"", Namespace:"default", SelfLink:"", UID:"2ec8fde4-e79d-4244-b01a-6dded600b580", ResourceVersion:"1347889", Generation:1, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63764979860, loc:(time.Location)(0x2021ea0)}}, DeletionTimestamp:(v1.Time)(nil), DeletionGracePeriodSeconds:(int64)(nil), Labels:map[string]string{"submariner.io/serviceRef":"test-vm"}, Annotations:map[string]string{"submariner.io/headless-svc-pod-ip":"10.42.0.13"}, OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:"submariner-globalnet", Operation:"Update", APIVersion:"submariner.io/v1", Time:(v1.Time)(0xc0000a7d58), FieldsType:"FieldsV1", FieldsV1:(v1.FieldsV1)(0xc0000a7de8)}}}, Spec:v1.GlobalIngressIPSpec{Target:"HeadlessServicePod", ServiceRef:(v1.LocalObjectReference)(0xc000d46ae0), PodRef:(v1.LocalObjectReference)(0xc000d46af0)}, Status:v1.GlobalIngressIPStatus{Conditions:[]v1.Condition(nil), AllocatedIP:""}}
    I0819 14:24:20.480976 1 global_ingressip_controller.go:147] Allocating global IP for "default/pod-cl1"
    I0819 14:24:20.481089 1 iface.go:145] Installing iptables rule for Headless SVC Pod -d 242.0.255.253 -j DNAT --to 10.42.0.13
    I0819 14:24:20.598690 1 iface.go:227] Installing iptable egress rules for HDLS SVC Pod "default/pod-cl1": -p all -s 10.42.0.13 -m mark --mark 0xC0000/0xC0000 -j SNAT --to 242.0.255.253
    I0819 14:24:20.614227 1 base_controllers.go:194] Updated: &v1.GlobalIngressIPStatus{Conditions:[]v1.Condition{v1.Condition{Type:"Allocated", Status:"True", ObservedGeneration:0, LastTransitionTime:v1.Time{Time:time.Time{wall:0xc03fb845249ba30e, ext:851262680326300, loc:(time.Location)(0x2021ea0)}}, Reason:"Success", Message:"Allocated global IP"}}, AllocatedIP:"242.0.255.253"}
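The log is long, so it can help to pull out just the names of the objects that globalnet allocated a global IP for. The filter below is a minimal sketch; `allocation_targets` is a hypothetical helper name, not part of subctl or Submariner, and it only matches the "Allocating global IP for" message format seen in this log.

```shell
# Hypothetical helper: reads globalnet log text on stdin and prints the
# namespace/name of every object from an "Allocating global IP for" message.
allocation_targets() {
  sed -n 's/.*Allocating global IP for "\([^"]*\)".*/\1/p'
}

# Possible usage against a live cluster (not executed here):
#   kubectl logs -n submariner-operator -l app=submariner-globalnet | allocation_targets
```

For the log in this report it prints `default/pod-cl1`, which is consistent with `kubectl get globalingressip test-vm` returning NotFound. Because the logged object carries the label `submariner.io/serviceRef=test-vm`, listing with `kubectl get globalingressip -l submariner.io/serviceRef=test-vm` may be a more reliable lookup; that is an inference from this log, not documented behavior.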



According to the above log, the service seems to be handled as an [ingress pod](https://github.com/submariner-io/submariner/blob/devel/pkg/globalnet/controllers/ingress_pod_controller.go#L123), not as a service without selector. Maybe we need to add separate logic for headless services without selector, [here](https://github.com/submariner-io/submariner/blob/59f656555db0c2f18b691b7ab1a28edb34f2f43f/pkg/globalnet/controllers/service_export_controller.go#L218)? (There seems to be logic only for headless services __with__ selector.)
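To make the three cases explicit, here is a minimal sketch of the distinction discussed above. It is not Submariner's code: `classify_service` and the category labels are hypothetical, and the only inputs are the Service's `spec.clusterIP` and `spec.selector` as strings.

```shell
# Hypothetical classifier, for illustration only.
# $1: value of spec.clusterIP ("None" for a headless service)
# $2: value of spec.selector (empty string when no selector is set)
classify_service() {
  cluster_ip="$1"
  selector="$2"
  if [ "$cluster_ip" != "None" ]; then
    echo "clusterip"                  # regular service with a cluster IP
  elif [ -n "$selector" ]; then
    echo "headless-with-selector"     # backends are discovered through pods
  else
    echo "headless-without-selector"  # backends come only from Endpoints
  fi
}

# Possible usage against a live cluster (not executed here):
#   classify_service \
#     "$(kubectl get svc test-vm -o jsonpath='{.spec.clusterIP}')" \
#     "$(kubectl get svc test-vm -o jsonpath='{.spec.selector}')"
```

The third branch is the one this issue is about. In the log above the export was processed with selector `""`, which suggests the second branch's pod-based handling was applied to it.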

**Environment**:
- Diagnose information (use `subctl diagnose all`):
- Gather information (use `subctl gather`):
- Cloud provider or hardware configuration:
- Install tools:
- Others:
mkimuram commented 3 years ago

@sridhargaddam

Opened the issue that I explained in the meeting and here. As you pointed out, globalingressip is not assigned.

mkimuram commented 3 years ago

Sharing the easiest steps to create a test environment for this issue.

make e2e using=external-net,globalnet

(The E2E test itself may fail because multiple source IPs are assigned as the default global egress IP. I will fix the test.)

The command below tears it down.

make cleanup using=external-net,globalnet

The steps below test a "service without selector" in the "From cluster to external app" direction. To test a "headless service without selector" instead, we just need to add `clusterIP: None` to the Service in step 2.

[From cluster to external app]

1. An HTTP server is already running in the ext-app docker container:

```
docker ps
```

2. Create a service without selector and its endpoints, and export the service on cluster1:

```
docker inspect "ext-app" -f '{{(index .NetworkSettings.Networks "'pseudo-ext'").IPAddress}}'
ip=$(docker inspect "ext-app" -f '{{(index .NetworkSettings.Networks "'pseudo-ext'").IPAddress}}')
export KUBECONFIG=output/kubeconfigs/kind-config-cluster1
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: test-vm
spec:
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
EOF
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Endpoints
metadata:
  name: test-vm
subsets:
- addresses:
  - ip: ${ip}
  ports:
  - port: 80
EOF
subctl export service -n default test-vm
```

3. Confirm the global ingress IP:

```
kubectl get globalingressip test-vm -o jsonpath='{.status.allocatedIP}{"\n"}'
```

4. Test access from cluster1:

```
export KUBECONFIG=output/kubeconfigs/kind-config-cluster1
kubectl -n default run tmp-shell --rm -i --tty --image quay.io/submariner/nettest -- bash
curl 242.254.1.253
```

5. Test access from cluster2:

```
export KUBECONFIG=output/kubeconfigs/kind-config-cluster2
kubectl -n default run tmp-shell --rm -i --tty --image quay.io/submariner/nettest -- bash
curl 242.254.1.253
```

6. Confirm the log in the ext-app container:

```
docker logs ext-app
```

[From external app to cluster] (this may be out of the scope of this issue itself, but the source IP of this access is the point of interest)

1. Create a deployment and a service, and export the service on cluster2:

```
export KUBECONFIG=output/kubeconfigs/kind-config-cluster2
kubectl -n default create deployment nginx --image=k8s.gcr.io/nginx-slim:0.8
kubectl -n default expose deployment nginx --port=80
subctl export service --namespace default nginx
```

2. Confirm the global ingress IP:

```
kubectl get globalingressip nginx -o jsonpath='{.status.allocatedIP}{"\n"}'
```

3. Test access from ext-app:

```
docker exec -it ext-app bash
curl 242.254.2.253
```

4. Confirm the log in the pod:

```
kubectl logs -l app=nginx
```

(Tests for StatefulSet can be done with a similar modification.)
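For the headless variant mentioned before the steps, the Service in step 2 would gain `clusterIP: None` while the Endpoints stay the same. The sketch below only prints the two manifests so they can be inspected before applying; `render_headless_service` and its three arguments are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: prints a headless Service without selector plus a
# matching Endpoints object. Arguments: service name, backend IP, port.
render_headless_service() {
  name="$1"
  ip="$2"
  port="$3"
  cat << EOF
apiVersion: v1
kind: Service
metadata:
  name: ${name}
spec:
  clusterIP: None
  ports:
  - protocol: TCP
    port: ${port}
    targetPort: ${port}
---
apiVersion: v1
kind: Endpoints
metadata:
  name: ${name}
subsets:
- addresses:
  - ip: ${ip}
  ports:
  - port: ${port}
EOF
}

# Possible usage in place of the two heredocs in step 2 (not executed here):
#   render_headless_service test-vm "${ip}" 80 | kubectl apply -f -
#   subctl export service -n default test-vm
```

Apart from `clusterIP: None`, the output matches the manifests in step 2, so the remaining steps can be reused unchanged.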
aviweit commented 3 years ago

I have a case similar to what @mkimuram reports: my headless Service with no selector contains an implicit Endpoint with the IP address of a pod on a Multus network (its net1 interface). That network is established independently of Submariner and spans both of my clusters.

Is it possible to expose that service so that, from the other cluster, I can refer to the IP address of the pod's second interface via its service name? For example, I would like my "session-management" pod to connect to my "userplane-function" pod via userplane-function-1.my-namespace.svc.clusterset.local, where both pods are on the same Multus network.

I tried to do so, but I cannot see the A record via dig.

Please note that Submariner is installed on both of my clusters, which have non-overlapping IPs.

Thanks.

mkimuram commented 3 years ago

@aviweit

> where my headless Service with no selector contains an implicit Endpoint with an IP address of a pod in multus network - net1 interface.

Interesting use case. Thank you for sharing it. It looks like the same issue arising from a different use case. The DNS part of it is also discussed in lighthouse#603.

This issue covers connectivity for a "headless service without selector", while lighthouse#603 covers discovery of a "service without selector". A workaround exists for the discovery problem, but it helps only a "service without selector"; it does not make a "headless service without selector" work.

sridhargaddam commented 2 years ago

Removed this from the Project board as we are tracking the corresponding PR.