openconfig / ondatra

54 stars 34 forks source link

Wrong port mapping in knebind.go #106

Open cakhil45 opened 1 week ago

cakhil45 commented 1 week ago

There is a bug (in Ondatra) and it is assuming DevicePort as the InsidePort here: https://github.com/openconfig/ondatra/blob/main/knebind/knebind.go#L486

This works for Arista and Nokia because they use the same outside and inside g* service port unlike Juniper. Where as Juniper's inside and outside ports are different: https://github.com/openconfig/featureprofiles/blob/main/topologies/kne/arista/ceos/topology.textproto#L40 https://github.com/openconfig/featureprofiles/blob/main/topologies/kne/nokia/srlinux/topology.textproto#L23

Because of this facing problems in running the test: https://github.com/cakhil45/featureprofiles/blob/main/feature/system/tests/system_base_test/g_protocol_test.go

g_protocol_test.go:81: DUT is not listening on correct port for "g*": got 32767, want 9339

bstoll commented 1 week ago

We don't share container IP addresses I'm not sure if we are using k8s services the way you typically do. I believe you are technically correct that we should look at the outside port, but since KNE doesn't share inside container IPs I also think the inside ports should always be correct.

On my container I see it listening on the right ports - it seems like the KNE default service map just isn't using them:

root@ncptx> show system connections | grep jsd | match LISTEN
tcp        0      0 0.0.0.0:50687           0.0.0.0:*               LISTEN      2322/jsd
tcp6       0      0 128.0.0.4:61525         :::*                    LISTEN      2322/jsd
tcp6       0      0 :::32767                :::*                    LISTEN      2322/jsd
tcp6       0      0 :::9559                 :::*                    LISTEN      2322/jsd
tcp6       0      0 :::9340                 :::*                    LISTEN      2322/jsd
tcp6       0      0 :::9339                 :::*                    LISTEN      2322/jsd

Should we instead have the default KNE config map these ports back to the default ports so we can use 1:1 mappings on the services?

https://github.com/openconfig/kne/blob/66d2fa4831f71c078f86ac189d7364e7e020bcfc/topo/node/juniper/juniper.go#L588-L599

nitinsoniism commented 1 week ago

@bstoll InsidePort will only be visible inside the pod/container. Using InsidePort for external access defeats the purpose of port forwarding. Last I checked, Juniper devices use just 1 port (InsidePort 32767 here) for all g* services and KNE provides a mechanism to hide 1 InsidePort behind many OutsidePort as seen here: https://github.com/openconfig/kne/blob/main/proto/topo/topo.pb.go#L334 // Multiple external services can be mapped to a single internal service port

I believe we should only rely on OutsidePort and we should change this: https://github.com/openconfig/ondatra/blob/main/knebind/knebind.go#L486

DevicePort: int(svcPb.GetInside()), to this: DevicePort: int(svcPb.GetOutside()),

Tagging @alexmasi from KNE. Also, we should not have seen ports 9559, 9340 and 9339 in the ncptx container. Seems like something has changed here since I spun up ncptx last. If these g* services are split internally (I need to check), we will change the KNE but irrespective, only OutsidePort should be dialed.

Edit: Looks like ncptx now does listen on standard IANA ports 9559, 9340 and 9339. A recent behavior change. Originally it used to listen only on the inside port of 32767 so we will change the KNE ncptx inside<->outside port mapping accordingly but this bug should still be fixed - Outside port should be dialed instead of Inside port.

alexmasi commented 1 week ago

Outside port is what gets dialed in the binding: https://github.com/openconfig/ondatra/blob/c5b17abcd56dc1cd3187c6a0324b133c3976a71b/knebind/knebind.go#L302

I am not familiar with the introspect dialer and all of its fields, so the intent of DevicePort would be useful to know here @bstoll . From what I remember the inside port for a node in KNE is what we were interested in validating for IANA correctness: https://github.com/openconfig/featureprofiles/pull/2264. That is the vendor container (Juniper or other) should support the IANA ports natively, the inside/outside piece is specific to KNE and in many cases we are using outside to be able to easily change ports if there is a service that hardcodes a port (potentially following IANA numbering for example). All vendors should aim for IANA port numbers for the container itself (which is the inside port). @marcushines for more info

nitinsoniism commented 1 week ago

@alexmasi there are 2 observations here. First one is a possible ondatra bug described/raised here. In summary, tests here fail on ncptx: https://github.com/cakhil45/featureprofiles/blob/main/feature/system/tests/system_base_test/g_protocol_test.go#L45

Error: g_protocol_test.go:81: DUT is not listening on correct port for "g*": got 32767, want 9339 DevicePort is 32767 which is populated from ncptx InsidePort.

dialer := introspect.DUTDialer(t, dut, svc)
    if dialer.DevicePort != int(wantPort) {
        t.Fatalf("DUT is not listening on correct port for %q: got %d, want %d", svc, dialer.DevicePort, wantPort)
    }

DevicePort is populated using GetInside() which does not seem correct. GetOutside() should have been used. You would not want to dial hidden port which is forwarded (even though both ports are the same). If the requirement is to make sure vendors use standard IANA ports natively, then probably better would be to check whether the Inside and Outside ports are the same.

func makeDialer(svcPb *tpb.Service, opts ...grpc.DialOption) *introspect.Dialer {
    return &introspect.Dialer{
        DialFunc: func(ctx context.Context, addr string, opts ...grpc.DialOption) (*grpc.ClientConn, error) {
            ctx, cancel := context.WithTimeout(ctx, grpcDialTimeout)
            defer cancel()
            return grpc.DialContext(ctx, addr, opts...)
        },
        DialTarget: serviceAddr(svcPb),
        DialOpts:   opts,
        DevicePort: int(svcPb.GetInside()),
    }
}

The second observation is Juniper's usage of just 1 port (32767) which last I checked is being addressed and t seems gRPC service has already been split into standard IANA ports but requires KNE's inside-outside port mappings corrected for ncptx.

root@ncptx> show system connections | grep jsd | match LISTEN
tcp        0      0 0.0.0.0:50687           0.0.0.0:*               LISTEN      2322/jsd
tcp6       0      0 128.0.0.4:61525         :::*                    LISTEN      2322/jsd
tcp6       0      0 :::32767                :::*                    LISTEN      2322/jsd
tcp6       0      0 :::9559                 :::*                    LISTEN      2322/jsd
tcp6       0      0 :::9340                 :::*                    LISTEN      2322/jsd
tcp6       0      0 :::9339                 :::*                    LISTEN      2322/jsd

Since the gRPC services in ncptx are now split into respective IANA ports, If we change the ncptx port mapping in ncptx this will get resolved but still, dialing InsidePort doesn't seem correct.