ansible / proposals

Repository for sharing and tracking progress on enhancement proposals for Ansible.
Creative Commons Zero v1.0 Universal
93 stars 19 forks source link

Generic Netconf modules for managing Netconf enabled network devices #104

Closed ganeshrn closed 6 years ago

ganeshrn commented 6 years ago

Proposal:

Generic Netconf modules to support network platforms that are Netconf enabled Author: Ganesh B. Nalawade @ganeshrn

Date: 2018/04/11

Motivation

Describe the reasons for this proposal. Provide generic Netconf modules that can communicate with the network device that supports Netconf

Problems

What problems exist that this proposal will solve?

Solution proposal

- name: Get confgiuration and state data
   netconf_get:

- name: Get configuration data from candidate datastore state
   netconf_get:
     source: candidate

- name: Get system configuration data from running datastore state
   netconf_get:
      source: running
      filter: <configuration><system></system></configuration>

- name: Get confgiuration and state data in json format
   netconf_get:
     display: json

- name: Get system confgiuration information using xpath filter
   netconf_get:
     source: running
     display: json
     filter: configuration/system

- name: Take backup of current configuration and edit configuration on target datastore
  netconf_config:
    content:  |
        <config xmlns:xc="urn:ietf:params:xml:ns:netconf:base:1.0">
            <system xmlns="urn:ietf:params:xml:ns:yang:ietf-system">
                <ntp>
                    <enabled>true</enabled>
                    <server>
                        <name>ntp1</name>
                        <udp><address>127.0.0.1</address></udp>
                    </server>
                </ntp>
            </system>
        </config>
    target: running
    confirm: 0
    backup: True

- name: confirm a previous commit
  netconf_config:
    confirm_commit: True

- name: Copy configuration in running datastore to the remote host
   netconf_config: 
     target: https://user:password@example.com/cfg/new.txt
     source: running
     copy: True

- name: Execute lock on target datastore
  netconf_rpc:
    rpc: lock
    content: <target><candidate/></target>

- name: Execute platform specifc RPC 
  netconf_rpc:
     rpc: get-interface-information

Testing (optional)

Documentation (optional)

privateip commented 6 years ago

Updating here as well. Mentioned in IRC meeting that, imho, the following changes should be made:

config becomes content and drop src argument - use lookup plugins for loading content from disk data becomes content

in both cases, this better aligns the module arguments with names across they ansible ecosystem

lpenz commented 6 years ago

Maybe we could also have a command for every standard netconf rpc besides the generic one.

For instance, in addition to

- name: Execute lock on target datastore
  netconf_rpc:
    rpc: lock
   data: <target><candidate/></target>

have something like:

- name: Execute lock on target datastore
  netconf_rpc_lock:
    target: candidate

There might even be additional value, as they can be smarter with regards to idempotency. In this case, the specific command can avoid locking candidate twice, while netconf_rpc can just be a passthru.

ganeshrn commented 6 years ago

@lpenz That's a good idea. There can be netconf_lockmodule.

netconf_lock module: The module will be used to lock/unlock the target datastore. Options:

  1. target: Name of the configuration datastore to be locked or unlocked. Valid values are auto, running, startup or remote URL. auto, uses candidate and fallback to running. Default value is auto.
  2. lock: This is a boolean value and identifies if the target datastore should be locked or unlocked. If the value is True, datastore is locked if not locked previously. If the value is False target datastore is unlocked if previously not unlocked. The default value is True.
- name: lock on candidate datastore
  netconf_lock:
    target: candidate

- name: unlock on candidate datastore
  netconf_lock:
    target: candidate
    lock: False

All the other standard Netconf operations can be executed using netconf_get and netconf_config.

Thoughts/suggestions?

ganeshrn commented 6 years ago

@privateip Update proposal incorporating your review comments.

privateip commented 6 years ago

@lpenz @ganeshrn I worry we are exposing way to much of the netconf api here. The playbook writer would need intimate knowledge of netconf if we externalize the locking mechanism into a separate module as opposed to building it into the module on an as needed basis. That is a bit of an anti-pattern for Ansible modules

ganeshrn commented 6 years ago

@privateip That makes sense, netconf_config have the lock/unlock mechanism build into the module. So, for now, will go ahead with netconf_config, netconf_get and netconf_rpc modules. If there is a valid use case for using lock/unlock independently we can revisit this later.

ganeshrn commented 6 years ago

Moving to agreed as discussed in irc meeting https://meetbot.fedoraproject.org/ansible-network/2018-04-25/network_working_group.2018-04-25-16.00.log.html

wisotzky commented 6 years ago

General remark: I think there is a dilemma here which needs to be solved. Depending on the actual use-case, it absolutely makes sense to expose NETCONF functionality to the playbook author. Having a netconf_lock module is a reasonable good example for this - and I would claim the same to be true about committing changes.

Example Scenario A playbook author wants to change the port configuration, for example enabling EFM OAM and LLDP on all NNI ports network-wide. To do this, typically you would first read-out the list of NNI ports using netconf_get and afterwards you would apply the change using netconf_config. In the logic of NETCONF (rfc6241) you want to make sure, that nobody changes the datastore from the read to the edit/commit action. For this, the lock must be taken before the read and released after committing the changes.

Current implementation: lock > get-config > unlock > lock > edit-config > commit > unlock

Desired implementation: lock > get-config > edit-config > commit > unlock

While it was impossible to do this using the original approach connection=local, with connection=NETCONF the connection is now persisted over multiple NETCONF related tasks and so can the lock.

Conclusion would be, that the playbook author might decide either to use explicit lock using a new netconf_lock module or implicit lock embedded in the (existing) netconf_get and netconf_config modules.

For those netconf related modules (get/config) it would basically mean, that we either need to expose an explicit "lock" attribute (default: True) - to prevent the module to take/release the lock (by setting lock: False). The other option would be, to check at the beginning of the playbook if the datastore is already locked by this netconf client - and to not do it again. Most flexible solution is likely a mix of both approaches, having an option lock with values True, False and Auto (default: auto).

wisotzky commented 6 years ago

Comments on proposed changes on netconf_config

Also, as discussed in my previous comment, it might be beneficial to provide more control over discard-changes/commit. This would allow to serialize multiple tasks, having more efficient use of commit operation (typically expensive operation in most router OS implementations).

wisotzky commented 6 years ago

Request to add a new "netconf_audit" module

Functionality should be in-line with the "nso_verify" module. Technically, there is no need to have Cisco NSO for this. Question is, if the module needs to be YANG aware. Likely the advantage of being YANG-aware would be, that we could properly render data-types integer/string and we could distinguish better between list-keys and regular leaf attributes.

lpenz commented 6 years ago

We do have this tension between ansible's idempotency principle and exposing more of netconf's functionality. It might be worth making a distinction between "porcelain" modules that are more aligned with ansible and modules that allow access to lower level functionality. Maybe just adding a note that netconf_rpc and company are for advanced users is enough.

wisotzky commented 6 years ago

@lpenz @ganeshrn @privateip

Please have a check on #40198 as this also includes now support for netconf_rpc.

In my view, we don't need any other specific modules for tasks like lock, unlock, discard, commit, delete-config, copy-config, validate, kill-session, close-session and get-schema anymore. The new netconf_rpc module can easily be used for all of that cases, so having dedicated modules for those cases, might not really add value.

So the main remaining tasks are:

Also per my comment from May 9th, we should work on a validation/audit function (similar to the Cisco NSO modules). The main idea is, to check config/state attributes for existence and values. This could be used for tasks like security audits, configuration resync audits and pre/post upgrade/migration validation.

ganeshrn commented 6 years ago

Implemented by: https://github.com/ansible/ansible/pull/44379 https://github.com/ansible/ansible/pull/39869 https://github.com/ansible/ansible/pull/40358

wisotzky commented 6 years ago

@ganeshrn, I've just done a quick review on PR44379 which contains the new implementation for netconf_config. I need to say, that my comment from May 9 is still absolutely valid. I do agree to @lpenz, that Ansible modules are not just to be built around APIs to expose all it functionality.

If the target of Ansible modules is to provide access in an abstracted way, making smart decisions and combining multiple API calls with reason - I don't see this to be happen with the new version of the netconf_config module. It is rather the opposite, providing full access to all the details of NETCONF. Combining all the 3 functions (edit-config, copy and delete) does not make much sense - as those operations are mutually exclusive. Therefore effort must be taken to keep the use-cases disjoint. Also people might struggle a lot with the fact, that we are not using exactly the same attribute names as in the NETCONF specification (rfc6241).

Having this said, combining some NETCONF RPCs makes sense: lock - edit-config running - copy running > startup - unlock lock - edit-config candidate - commit - copy running > startup - unlock lock - edit-config candidate - validate - discard - unlock

In those scenarios lock/unlock and copy running>startup are optional.

I am not convinced that such advanced use-cases exist for copy-config or delete. I've not really seen solutions, which execute (un)lock for executing the copy-config or delete operation.

At least as playbook author I would be confused about all the options provided in the new netconf_config module. So they must understand rfc6241 as a foundation and than our take on this. This is not an easy ask - so I would consider the changes made as not enough.

ganeshrn commented 6 years ago

Combining all the 3 functions (edit-config, copy and delete) does not make much sense - as those operations are mutually exclusive. Therefore effort must be taken to keep the use-cases disjoint.

This 3 opertaion are mutually exclusive in modules as well and handled it using mutually_exclusive clause within the module

Also people might struggle a lot with the fact, that we are not using exactly the same attribute names as in the NETCONF specification (rfc6241)

Which option are you referring to?

Having this said, combining some NETCONF RPCs makes sense: lock - edit-config running - copy running > startup - unlock lock - edit-config candidate - commit - copy running > startup - unlock lock - edit-config candidate - validate - discard - unlock

All these scenarios can be achieved using the existing module.

vparames86 commented 5 years ago

Currently in Ansible 2.7, There is no retry logic for datastore lock. In case of Cisco IOS-XE, There is a local synchronization process in the router which locks the datastore temporarily and we get an error message saying unable to lock the data store and the playbook fails. Adding a retry logic eliminates these exception scenarios. In case of nso they have implemented 6 retries with a delay of 3.