networkservicemesh / deployments-k8s

Apache License 2.0

Investigate floating interdomain tests failures after host port change #10221

Open dualBreath opened 1 year ago

dualBreath commented 1 year ago

Investigate problem after removing host ports

Patch: https://github.com/networkservicemesh/deployments-k8s/commit/3ebf1bf0881f6851d205b863d97bd3ff32bce9f7

Failed tests: https://github.com/networkservicemesh/integration-k8s-kind/actions/runs/6639323018/job/18037449097?pr=903

PR: https://github.com/networkservicemesh/deployments-k8s/pull/10142

Commit that should be selected: https://github.com/networkservicemesh/integration-tests/commit/251792d17dbeb9cc3d761e32d47e233d563486b4

Expected hash in ../integration-tests/extensions/base/suite.go: 45886914d594d4e4b78b698ce08ee68f333c8d69 Commit: 251792d17dbeb9cc3d761e32d47e233d563486b4

Link to check hash (host port should not be present in the config): https://github.com/networkservicemesh/deployments-k8s/blob/45886914d594d4e4b78b698ce08ee68f333c8d69/apps/nsmgr-proxy/nsmgr-proxy.yaml

dualBreath commented 1 year ago

After removing the host port, the problem occurs in the registry: we always resolve the IP as the node's IP address (for example, 172.18.0.4:5004) instead of using the service address. We get the destination IP address for both the manager and the registry from the IP map. This map contains an entry like 10.244.1.23: 172.18.0.4, but for the registry to work it would need something like 10.244.1.23: 172.18.0.130. The existing entry seems to be fine for the manager, but it is wrong for the registry when the host port is closed. So a possible solution within the current architecture is to create a separate IP map for the registry. However, we already have a separate issue for improving this area: "Get rid of using '@' in interdomain NSM scenarios". Both solutions seem to require a comparable amount of work, so the second option seems preferable.
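To make the first option concrete, here is a minimal Go sketch of the "separate IP map for the registry" idea. It is not the actual NSM code: the `ipMap` and `resolver` types and the `managerIP`/`registryIP` methods are hypothetical names invented for illustration, and the only values taken from this issue are the example addresses 10.244.1.23, 172.18.0.4 and 172.18.0.130.

```go
package main

import "fmt"

// ipMap is a hypothetical stand-in for the IP map discussed in this issue:
// the key is an internal (pod) IP, the value is the externally reachable IP.
type ipMap map[string]string

// resolver keeps one map per component, so the registry is no longer forced
// to reuse the node IP that only works for the manager.
// Illustrative type, not part of the NSM SDK.
type resolver struct {
	manager  ipMap // e.g. 10.244.1.23 -> 172.18.0.4 (node IP)
	registry ipMap // e.g. 10.244.1.23 -> 172.18.0.130 (address usable by the registry)
}

// lookup returns the external IP mapped to podIP, or an error if none exists.
func lookup(m ipMap, podIP string) (string, error) {
	ext, ok := m[podIP]
	if !ok {
		return "", fmt.Errorf("no external IP mapped for %s", podIP)
	}
	return ext, nil
}

// managerIP resolves the destination IP for the manager.
func (r resolver) managerIP(podIP string) (string, error) {
	return lookup(r.manager, podIP)
}

// registryIP resolves the destination IP for the registry from its own map.
func (r resolver) registryIP(podIP string) (string, error) {
	return lookup(r.registry, podIP)
}
```

With a single shared map, both lookups for 10.244.1.23 would return 172.18.0.4. With two maps, `registryIP` can return 172.18.0.130 while `managerIP` keeps returning the node IP. How the registry map would be populated is left open here, since this issue does not specify it.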

dualBreath commented 1 year ago

The current task is blocked and is waiting for another task to be resolved: https://github.com/networkservicemesh/sdk/issues/1507