openconfig / reference

This repository contains reference implementations, specifications and tooling related to OpenConfig-based network management.
Apache License 2.0

gNMI Server or Client? #93

Open shivarammysore opened 6 years ago

shivarammysore commented 6 years ago

The current way of doing gNMI is to run a gNMI server on the OpenFlow switch. Can the switch be a gNMI client instead of a server? The reason is that for remotely managing the switch, running a client on the switch is more efficient, as we don't have to punch holes in the firewall + add routing policies. As a server, the switch cannot initiate connections.
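To make the connection-direction point concrete, here is a minimal sketch in plain Python sockets, not gNMI or gRPC. The switch makes the only outbound TCP connection, and the management side then sends its request over that same connection. The one-line text protocol, the path `/system/hostname`, the name `switch-01`, and the function names are all hypothetical and only stand in for a real RPC exchange.

```python
import socket
import threading


def run_manager(listener: socket.socket, results: list) -> None:
    """Management side: accept a connection that the switch initiated,
    then drive the exchange by sending a request over it."""
    conn, _ = listener.accept()
    with conn:
        # Hypothetical one-line request; a real system would use an RPC here.
        conn.sendall(b"GET /system/hostname\n")
        results.append(conn.makefile("r").readline().strip())


def run_switch(manager_addr) -> None:
    """Switch side: open the only outbound connection, then answer
    whatever the manager asks over it."""
    with socket.create_connection(manager_addr, timeout=5) as conn:
        request = conn.makefile("r").readline().strip()
        if request == "GET /system/hostname":
            conn.sendall(b"switch-01\n")


def demo() -> str:
    # The manager listens; nothing ever connects inbound to the switch.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    results: list = []
    manager = threading.Thread(target=run_manager, args=(listener, results))
    manager.start()
    run_switch(listener.getsockname())
    manager.join()
    listener.close()
    return results[0]
```

Only the management side needs a reachable address in this arrangement, which is why a NAT or firewall in front of the switch does not need an inbound rule.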

My only use case would be for OpenFlow-based switches (wired and wifi), which is my concern. The owner of the switch/access point would configure the IP address and the "Management/Control" provider operator by installing the startup keys. Then, the software on boot-up would do the necessary things. Owner intervention would be required if the keys expired and were not rolled over.

At a more fundamental level, gNMI would be just a protocol for facilitating a secure connection. Some have used NETCONF underneath to do config updates and some native operations.

How are folks really implementing and deploying this at scale today? Any thoughts or pointers to items in the spec that I have missed are greatly appreciated.

The above questions are a result of my discussion with @anarkiwi

Thanks

robshakir commented 6 years ago

The only feature request that we've seen along these lines has been focused on telemetry -- i.e., the use case whereby the collection infrastructure does not know the set of targets that it is to gather data from, and rather these clients dial in. A similar use case exists where the target might be behind NAT, such that inbound connections are not possible. AFAICR, we have never discussed this for configuration.

I'm a little unclear how the Set client could run on the target (switch) - since in all cases, presumably the configuration is supplied by some NMS somewhere? In this case, is there really a 'configuration request' RPC that you'd want, with the same RPC's response being the initial configuration that is supplied by the NMS?

If I understand you right, this behaviour is essentially what most vendors call ZTP -- there are a variety of different ways to achieve it. There isn't something directly in gNMI that addresses this -- but as part of the dial-out feature proposal, we could define something if it were of interest to enough consumers.
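As a rough illustration of the 'configuration request' idea above, the sketch below models a single request/response exchange in which the target asks and the NMS answers with the initial configuration. Nothing here is part of gNMI: `BootstrapRequest`, `BootstrapResponse`, `Nms`, and the field names are hypothetical, and the "RPC" is an ordinary in-process method call.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BootstrapRequest:
    """Hypothetical message a target sends when it boots unconfigured."""
    serial_number: str
    hardware_model: str


@dataclass
class BootstrapResponse:
    """Hypothetical reply: the initial configuration, if the target is known."""
    accepted: bool
    config: Dict[str, str] = field(default_factory=dict)


class Nms:
    """Stand-in for the management system that owns the intended config."""

    def __init__(self, inventory: Dict[str, Dict[str, str]]) -> None:
        # Maps serial number -> initial configuration for that device.
        self._inventory = inventory

    def bootstrap(self, request: BootstrapRequest) -> BootstrapResponse:
        config = self._inventory.get(request.serial_number)
        if config is None:
            return BootstrapResponse(accepted=False)
        return BootstrapResponse(accepted=True, config=config)


def boot_target(nms: Nms, serial: str, model: str) -> Optional[Dict[str, str]]:
    """Target side: ask once at boot, apply whatever comes back."""
    response = nms.bootstrap(BootstrapRequest(serial, model))
    return response.config if response.accepted else None
```

For example, with `Nms({"SN123": {"hostname": "edge-sw-1"}})`, a target with serial `SN123` receives that configuration and an unknown serial receives nothing.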

shivarammysore commented 6 years ago

Yes - the use case you mentioned about "target behind the NAT/Firewall where inbound connections are not possible" is exactly the use case in question.

Even on large networks, where OpenConfig/gNMI would be useful, one has to do provisioning over various links + they may be behind department firewalls. In this mode, one has to now program the firewalls too for routing policies if the switch needs to initiate connection.

Technology aside, if I am a sales guy and tell the corporate IT guy that he has to make changes to 100s of firewalls with a new policy + adopt gNMI, etc., the IT guy will write a 12-18 month project proposal and will be shot down by most CIOs.

But if we say the switch has gNMI support and can connect to a datacenter-deployed gNMI server and do the necessary ZTP (Zero Touch Provisioning), we have a greater chance that this can become a reality in terms of deployment and usage.

robshakir commented 6 years ago

This is certainly something that could be defined -- if there is sufficient interest in doing so. It has not been the initial deployment model, but of course, it's a real use case. See the discussion in #42 for the streaming dial-out case.

The approach for defining such a service is really:

We would want to define this as a separate service since it keeps the implementations much cleaner - rather than having multiple modes of implementation linked to a single service definition.

shivarammysore commented 6 years ago

Thanks @robshakir. Note that in this case, we are looking at the configuration too (basically the current model inverted :-( ). There is another small wrinkle to this: once you have 2 distinct services defined, if I am a switch vendor, what should I implement? Implementing both is probably not a good answer, as there are test, documentation, support, etc. costs + IT Service Management and Operations Management vendors only want one way to do this.

I would be happy to look at this new service specification when developed.

robshakir commented 6 years ago

You should implement what your customers want. The model that is defined for the current gNMI is implemented by vendors, and fits with a number of operational environments. If you, as a vendor, see that your market wants to operate in "dial-in" rather than "dial-out" then you should implement the right fit for you.

In my view, configuration other than at ZTP time, is typically a push operation -- all existing APIs really expect this (CLI, REST APIs, NETCONF), and the initial ZTP is really the only case for the dial-out mode there - so ZTP over gNMI could be a very small service that only enabled this initial configuration. This is a bit like the fact that the dial-out telemetry interface really is driven by configuration - and is just a channel to send Notification messages really. One can't deprecate dial-in for config, because this ends up needing something complex for a target to figure out it needs to ask for new config.
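The remark that dial-out telemetry is driven by configuration and is otherwise just a channel for Notification messages can be sketched as follows. This is an assumption-laden toy model, not the gNMI API: `DialOutConfig`, `Notification`, `publish_once`, and the `send` callback are hypothetical names, and the transport is left abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Notification:
    """Hypothetical, simplified stand-in for a telemetry update."""
    path: str
    value: object


@dataclass
class DialOutConfig:
    """Configuration pushed to the target: where to send, and which paths."""
    collector: str
    paths: List[str]


def publish_once(
    config: DialOutConfig,
    state: Dict[str, object],
    send: Callable[[str, Notification], None],
) -> int:
    """Send one Notification per configured path that exists in the
    target's state. Returns how many notifications were sent."""
    sent = 0
    for path in config.paths:
        if path in state:
            send(config.collector, Notification(path, state[path]))
            sent += 1
    return sent
```

The target decides nothing for itself here: the destination and the set of paths both come from configuration that was pushed to it earlier, which matches the point that configuration remains a push operation.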

Note that the product management decision doesn't get any easier if you have one gRPC service -- you still have to test and support the different RPCs under that service regardless. Operationally, it's harder if we have one service, because we now have more degrees of freedom within that implementation such that testing and compliance with an operator's requirements becomes harder.

Google's use case is focused on dial-in for telemetry and configuration, so this has been our initial focus for the specification. As per comments elsewhere, we're committed to contributing to the open source implementations and efforts around the wider ecosystem - but haven't had a use case for either of the dial-out services yet. We welcome community contributions, especially from those for whom these approaches are a higher priority.