openconfig / gnmi-gateway

A modular, distributed, and highly available service for modern network telemetry via OpenConfig and gNMI
Apache License 2.0

"NoTLS:yes" ignored #14

Closed by thomarite 4 years ago

thomarite commented 4 years ago

Hi there

I am trying to test gnmi-gateway against an Arista vEOS switch. I have configured gNMI on the Arista side without TLS, but gnmi-gateway still tries to negotiate TLS even though the targets.json file sets "NoTLS": "yes".

This is the log output from gnmi-gateway:

go:1.14.6|py:3.7.3|tomas@vm2:~/gnmi/gnmi-gateway release$ GRPC_GO_LOG_VERBOSITY_LEVEL=99 GRPC_GO_LOG_SEVERITY_LEVEL=info ./gnmi-gateway -TargetLoaders=json -TargetJSONFile=./examples/gnmi-prometheus/targets.json -EnableGNMIServer -Exporters=prometheus -OpenConfigDirectory=./oc-models/ -ServerTLSCert=server.crt -ServerTLSKey=server.key
{"level":"info","time":"2020-11-07T19:37:59Z","message":"Starting GNMI Gateway."}
{"level":"info","time":"2020-11-07T19:37:59Z","message":"Clustering is NOT enabled. No locking or cluster coordination will happen."}
{"level":"info","time":"2020-11-07T19:37:59Z","message":"Starting connection manager."}
{"level":"info","time":"2020-11-07T19:37:59Z","message":"Starting gNMI server on 0.0.0.0:9339."}
{"level":"info","time":"2020-11-07T19:37:59Z","message":"Starting Prometheus exporter."}
{"level":"info","time":"2020-11-07T19:37:59Z","message":"Connection manager received a target control message: 1 inserts 0 removes"}
{"level":"info","time":"2020-11-07T19:37:59Z","message":"Initializing target gcp-r1 ([192.168.249.4:3333]) map[NoTLS:yes]."}
{"level":"info","time":"2020-11-07T19:37:59Z","message":"Target gcp-r1: Connecting"}
{"level":"info","time":"2020-11-07T19:37:59Z","message":"Target gcp-r1: Subscribing"}
INFO: 2020/11/07 19:37:59 parsed scheme: ""
INFO: 2020/11/07 19:37:59 scheme "" not registered, fallback to default scheme
INFO: 2020/11/07 19:37:59 ccResolverWrapper: sending update to cc: {[{192.168.249.4:3333  <nil> 0 <nil>}] <nil> <nil>}
INFO: 2020/11/07 19:37:59 ClientConn switching balancer to "pick_first"
INFO: 2020/11/07 19:37:59 Channel switches to new LB policy "pick_first"
INFO: 2020/11/07 19:37:59 Subchannel Connectivity change to CONNECTING
INFO: 2020/11/07 19:37:59 Subchannel picks a new address "192.168.249.4:3333" to connect
INFO: 2020/11/07 19:37:59 pickfirstBalancer: UpdateSubConnState: 0xc0005aa270, {CONNECTING <nil>}
INFO: 2020/11/07 19:37:59 Channel Connectivity change to CONNECTING
WARNING: 2020/11/07 19:37:59 grpc: addrConn.createTransport failed to connect to {192.168.249.4:3333 192.168.249.4:3333 <nil> 0 <nil>}. Err: connection error: desc = "transport: authentication handshake failed: tls: first record does not look like a TLS handshake". Reconnecting...
INFO: 2020/11/07 19:37:59 Subchannel Connectivity change to TRANSIENT_FAILURE
INFO: 2020/11/07 19:37:59 pickfirstBalancer: UpdateSubConnState: 0xc0005aa270, {TRANSIENT_FAILURE connection error: desc = "transport: authentication handshake failed: tls: first record does not look like a TLS handshake"}
INFO: 2020/11/07 19:37:59 Channel Connectivity change to TRANSIENT_FAILURE
{"level":"info","time":"2020-11-07T19:38:00Z","message":"Starting Prometheus HTTP server."}
INFO: 2020/11/07 19:38:00 Subchannel Connectivity change to CONNECTING
INFO: 2020/11/07 19:38:00 Subchannel picks a new address "192.168.249.4:3333" to connect
INFO: 2020/11/07 19:38:00 pickfirstBalancer: UpdateSubConnState: 0xc0005aa270, {CONNECTING <nil>}
INFO: 2020/11/07 19:38:00 Channel Connectivity change to CONNECTING
WARNING: 2020/11/07 19:38:00 grpc: addrConn.createTransport failed to connect to {192.168.249.4:3333 192.168.249.4:3333 <nil> 0 <nil>}. Err: connection error: desc = "transport: authentication handshake failed: tls: first record does not look like a TLS handshake". Reconnecting...

This is targets.json

go:1.14.6|py:3.7.3|tomas@vm2:~/gnmi/gnmi-gateway release$ cat examples/gnmi-prometheus/targets.json 
{
  "request": {
    "default": {
      "subscribe": {
        "prefix": {
        },
        "subscription": [
          {
            "path": {
              "elem": [
                {
                  "name": "interfaces"
                }
              ]
            }
          }
        ]
      }
    }
  },
  "target": {
    "gcp-r1": {
      "addresses": [
        "192.168.249.4:3333"
      ],
      "credentials": {
        "username": "xxx",
        "password": "xxx"
      },
      "request": "default",
      "meta": {
        "NoTLS": "yes"
      }
    }
  }
}
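As a quick sanity check that the flag is present in the file as intended, the structure above can be inspected with a few lines of Python. This is only a sketch: the helper name `targets_with_meta_flag` is made up for illustration, the embedded JSON is a trimmed copy of the file above, and it says nothing about how gnmi-gateway itself interprets the flag.

```python
import json


def targets_with_meta_flag(config_text, flag="NoTLS"):
    """Return {target_name: flag_value} for targets whose "meta" holds `flag`.

    Hypothetical helper for inspecting a targets.json-style document;
    it is not part of gnmi-gateway.
    """
    config = json.loads(config_text)
    found = {}
    for name, target in config.get("target", {}).items():
        meta = target.get("meta", {})
        if flag in meta:
            found[name] = meta[flag]
    return found


# Trimmed copy of the targets.json shown above (request section omitted).
example = """
{
  "target": {
    "gcp-r1": {
      "addresses": ["192.168.249.4:3333"],
      "request": "default",
      "meta": {"NoTLS": "yes"}
    }
  }
}
"""

print(targets_with_meta_flag(example))  # {'gcp-r1': 'yes'}
```

This matches the `map[NoTLS:yes]` shown in the "Initializing target" log line, so the flag is being read; the question is what it means.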

This is the Arista side seeing TLS packets:

bash-4.2# tcpdump -i any "tcp port 3333 and (tcp[((tcp[12] & 0xf0) >> 2)] = 0x16)"

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
19:47:01.367197  In 1e:3d:5b:13:d8:fe (oui Unknown) ethertype IPv4 (0x0800), length 320: 10.128.0.4.50486 > 192.168.249.4.dec-notes: Flags [P.], seq 2715923852:2715924104, ack 2576249027, win 511, options [nop,nop,TS val 1194424180 ecr 1250876], length 252
19:47:02.405870  In 1e:3d:5b:13:d8:fe (oui Unknown) ethertype IPv4 (0x0800), length 320: 10.128.0.4.50488 > 192.168.249.4.dec-notes: Flags [P.], seq 680803294:680803546, ack 3839769659, win 511, options [nop,nop,TS val 1194425218 ecr 1251136], length 252
19:47:04.139458  In 1e:3d:5b:13:d8:fe (oui Unknown) ethertype IPv4 (0x0800), length 320: 10.128.0.4.50490 > 192.168.249.4.dec-notes: Flags [P.], seq 3963338234:3963338486, ack 1760248652, win 511, options [nop,nop,TS val 1194426952 ecr 1251569], length 252
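For readers unfamiliar with the capture filter used above: `tcp[12] & 0xf0` isolates the TCP data-offset field, shifting it right by 2 turns it into the header length in bytes, and the filter then compares the first payload byte with 0x16, the TLS record type for handshake messages. A minimal Python sketch of the same test follows; the function names and the synthetic segments are illustrative assumptions, not real captured data.

```python
TLS_HANDSHAKE = 0x16  # TLS record content type for handshake messages


def payload_starts_with_tls_handshake(tcp_segment: bytes) -> bool:
    """Mirror the tcpdump expression tcp[((tcp[12] & 0xf0) >> 2)] = 0x16."""
    if len(tcp_segment) < 13:
        return False  # too short to contain the data-offset byte
    # The upper 4 bits of byte 12 give the header length in 32-bit words.
    # (b & 0xf0) >> 2 equals (b >> 4) * 4, i.e. the header length in bytes.
    header_len = (tcp_segment[12] & 0xF0) >> 2
    if len(tcp_segment) <= header_len:
        return False  # header only, no payload
    return tcp_segment[header_len] == TLS_HANDSHAKE


def fake_segment(header_len: int, payload: bytes) -> bytes:
    """Build a synthetic TCP segment with only the data-offset field set."""
    header = bytearray(header_len)
    header[12] = (header_len // 4) << 4
    return bytes(header) + payload


# 32-byte header (20 bytes plus 12 bytes of options), TLS handshake payload.
print(payload_starts_with_tls_handshake(fake_segment(32, b"\x16\x03\x01")))  # True
# 20-byte header followed by a cleartext HTTP/2 preface instead of TLS.
print(payload_starts_with_tls_handshake(fake_segment(20, b"PRI * HTTP/2.0")))  # False
```

Every packet that passes this filter therefore begins a TLS handshake, which is consistent with the "first record does not look like a TLS handshake" error on the gateway side: the client sends a ClientHello and receives a non-TLS reply.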

This is my Arista gNMI config:

r1#show management api gnmi 
Enabled:            Yes
Server:             running on port 3333, in MGMT VRF
SSL Profile:        none
QoS DSCP:           none
r1#

!
management api gnmi
   transport grpc GRPC
      port 3333
      vrf MGMT
!

I am just following https://github.com/openconfig/gnmi-gateway/tree/release/examples/gnmi-prometheus

I can confirm gnmi works in my vEOS following https://netdevops.me/2020/arista-veos-gnmi-tutorial/

Let me know if you need me to provide more info.

I see the same issue whether I build gnmi-gateway with "build" or with "docker".

Thanks

colinmcintosh commented 4 years ago

Hi @thomarite!

The NoTLS flag is actually a "don't verify TLS" flag. The gNMI specification states:

"The session between the client and server MUST be encrypted using TLS - and a target or client MUST NOT fall back to unencrypted sessions."

As such, we always create the connection with a TLS transport, with the option to disable verification. I'm actually surprised that the Arista router started the gNMI server without TLS credentials. You should be able to get up and running with a self-signed certificate on your Arista router, which you can generate with these commands:

conf t
security pki certificate generate self-signed cvp.crt key cvp.key generate rsa 2048 validity 30000 parameters common-name cvp
!
management api gnmi
    transport grpc GRPC
        ssl profile SELFSIGNED
!
management security
    ssl profile SELFSIGNED
        certificate cvp.crt key cvp.key

This could probably use some better documentation: that TLS is required, and that the NoTLS flag still initiates a TLS session but without verification. I'm thinking the NoTLS flag should also be renamed (or aliased) to NoTLSVerify to make this clearer.
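To illustrate the difference between "no TLS" and "TLS without verification", here is a minimal sketch using Python's standard `ssl` module. It is not gnmi-gateway's code (the gateway is written in Go), and the function name `client_tls_context` is a made-up illustration. With `verify=False` the client still performs a full TLS handshake and encrypts the session; it only skips checking the server certificate, which is why a self-signed certificate on the router is enough.

```python
import ssl


def client_tls_context(verify: bool) -> ssl.SSLContext:
    """Build a client-side TLS context, optionally without verification.

    Hypothetical helper: verify=False corresponds to the behaviour
    described above for NoTLS (TLS is still used, the server certificate
    is just not validated).
    """
    ctx = ssl.create_default_context()
    if not verify:
        # check_hostname must be disabled before verify_mode can be relaxed.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


no_verify = client_tls_context(verify=False)
print(no_verify.verify_mode == ssl.CERT_NONE)  # True

strict = client_tls_context(verify=True)
print(strict.verify_mode == ssl.CERT_REQUIRED)  # True
```

In both cases the connection starts with a TLS ClientHello, so a server listening in plaintext, as in the original report, will reject either one.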

I'll give more thought as well to possibly including the option to completely disable TLS for interoperability purposes given that it seems some implementations of gNMI targets support that.

thomarite commented 4 years ago

Thanks @colinmcintosh for the quick answer and clarification! I have followed your instructions and everything works fine. Yes, I think clarifying the purpose of NoTLS would help avoid confusion.

Anyway, it is a great tool that you have built! I will keep playing with it.

gsl-rosst commented 1 year ago

"I'll give more thought as well to possibly including the option to completely disable TLS for interoperability purposes given that it seems some implementations of gNMI targets support that."

I would appreciate this - Arista EOS does not require TLS, and I was querying devices successfully with gnmic and telegraf with no TLS. I was confused why gnmi-gateway would not work until I discovered this issue. Now I have to go generate self-signed certs on all my devices, or use different software.