ganeshrn commented 6 years ago

Proposal:

Generic Netconf modules to support network platforms that are Netconf enabled Author: Ganesh B. Nalawade @ganeshrn

Date: 2018/04/11

Status: New
Proposal type: core design
Targeted Release: 2.6
Associated PR: <link to GH PR in ansible/proposals if PR was submitted>
Estimated time to implement: 4 weeks

Motivation

Describe the reasons for this proposal. Provide generic Netconf modules that can communicate with the network device that supports Netconf

Problems

What problems exist that this proposal will solve?

As a user, I want to connect to any Netconf enabled remote device and fetch configuration from running/ startup datastore based on device capability.
As a user, I want to connect to any Netconf enabled remote device and fetch operational state data
As a user, I want to connect to any Netconf enabled remote device and edit the configuration running/candidate datastore based on device capability.
As a user, I want to connect to any Netconf enabled remote device and execute any arbitrary Netconf RPC that is supported by remote device.
As a user, I want Netconf existing and new modules to work with connection: local and connection:netconf that uses underlying persistent connection framework.
Depecreate the credentials and connection specific options currently added in netconf_config module.

Solution proposal

netconf_get module: The module fetch configuration and state data from remote Netconf enabled host. Options: ~1. data_type: Identity which data needs to be fetched ie. configuration or state data. Valid values are config and state and default to config~
1. filter: This argument specifies the XML string which acts as a filter to restrict the portions of the data to be are retrieved from remote device. If this option is not specified entire configuration or state data is returned in result depending on the value of source option. The filter value can be either xml string or xpath, if the filter is in xpath format the Netconf server running on remote host should support xpath capabiltiy else it will result in error.
2. source: This argument specifies the datastore from which configuration data should be fetched. Valid values are running, candidate and auto. If the value is auto it fetched configuration data from I(candidate) datastore and if candidate datastore is not supported it fallback to running datastore. If the source value is not mentioned in that case both configuration and state information in returned in response from running datastore choices: ['running', 'candidate', 'auto', 'startup']. ~3. filter_type: This argument specifies the type of filter, valid values are subtree and xpath. If value is subtree the value of option filter should be a xml string, if the value is xpath the value of option filter should be a xml xpath. The value xpath is supported only if Netconf server running on remote host supports xpath capability. choices: ['subtree', 'xpath'] default: subtree~
3. display: Encoding scheme to use when serializing output from the device. Currently supported option value is json only. If the option value is I(json) it requires jxmlease to be installed on control node. choices: ['json']
netconf_config module: The module will be used to edit the configuration, copy the configuration from one datastore to another or delete a complete configuration datastore on remote Netconf enabled host. Options:
1. content: It is XML string with a hierarchy of configuration data as defined by the device's data models. aliases: xml. The xml string must be rooted in the config element. Use lookup plugins for loading content from disk file.
2. target: Name of the configuration datastore to be edited. Valid values are auto, candidate, running, startup or remote URL. auto, uses candidate and fallback to running.
3. default_operation: The default operation that needs to be performed on candidate datastore. Valid values are merge, replace and none. The default value is merge. merge: If the value is merge the configuration data in the config option is merged with the configuration at the corresponding level in the target datastore. If the value is replace the configuration data in the config option completely replaces the configuration in the target datastore. If the value is none the target datastore is unaffected by the configuration in the config option, unless and until the incoming configuration data uses the "operation" attribute to request a different operation.
4. source: It is the name of the configuration datastore to use as the source to copy the configuration when copy option is set to True. The values can be either running, candidate, startup or a remote URL.
5. confirm: This argument will configure a timeout value for the commit to be confirmed before it is automatically rolled back. If the confirm_commit argument is set to False, this argument is silently ignored. If the value of this argument is set to 0, the commit is confirmed immediately. The remote host should support :candidate and :confirmed-commit capability for this option to work properly.
6. confirm_commit: This argument will execute commit operation on the remote device. It can be used to confirm a previous commit. This is a boolean value.
7. backup: This argument will cause the module to create a full backup of the current datastore value in target from the remote device before any changes are made. The backup file is written to the backup folder in the playbook root directory. If the directory does not exist, it is created.
8. copy: It instructs the module to copy the configuration from datastore value mentioned in source option to datastore value mentioned in target. It is a boolean value
9. save: It instructs the module to save the running-config to the startup-config if changed.
10. delete: It instructs the module to delete the configuration from value mentioned in target datastore.
netconf_rpc module: This module will execute generic Netconf RPC that is defined by Netconf standard as well as propriety RPC based on the Netconf capabilities supported by the remote host. Options:
1. rpc: This is the name of netconf RPC that should be executed of remote Netconf enabled device. The Netconf RPC defined by protocol can be lock, unlock, delete-config, copy-config, get, get-config, close-session, kill-session, get-config, edit-config along with bunch of propriety RPC supported by remote host that are specific to vendors.
2. content: This is an XML string that identifies the payload for the rpc and is specific to the given rpc value defined in netconf standard or can be specific to the vendor supported rpc's.

- name: Get confgiuration and state data
   netconf_get:

- name: Get configuration data from candidate datastore state
   netconf_get:
     source: candidate

- name: Get system configuration data from running datastore state
   netconf_get:
      source: running
      filter: <configuration><system></system></configuration>

- name: Get confgiuration and state data in json format
   netconf_get:
     display: json

- name: Get system confgiuration information using xpath filter
   netconf_get:
     source: running
     display: json
     filter: configuration/system

- name: Take backup of current configuration and edit configuration on target datastore
  netconf_config:
    content:  |
        <config xmlns:xc="urn:ietf:params:xml:ns:netconf:base:1.0">
            <system xmlns="urn:ietf:params:xml:ns:yang:ietf-system">
                <ntp>
                    <enabled>true</enabled>
                    <server>
                        <name>ntp1</name>
                        <udp><address>127.0.0.1</address></udp>
                    </server>
                </ntp>
            </system>
        </config>
    target: running
    confirm: 0
    backup: True

- name: confirm a previous commit
  netconf_config:
    confirm_commit: True

- name: Copy configuration in running datastore to the remote host
   netconf_config: 
     target: https://user:password@example.com/cfg/new.txt
     source: running
     copy: True

- name: Execute lock on target datastore
  netconf_rpc:
    rpc: lock
    content: <target><candidate/></target>

- name: Execute platform specifc RPC 
  netconf_rpc:
     rpc: get-interface-information

Testing (optional)

Add integration and unit test for netconf_config module.
Add integration and unit test for netconf_get module.
Add integration and unit test for netconf_rpc module.

Documentation (optional)

Update module documentation for netconf_config.
Add module documentation for netconf_get and netconf_rpc.

privateip commented 6 years ago

Updating here as well. Mentioned in IRC meeting that, imho, the following changes should be made:

config becomes content and drop src argument - use lookup plugins for loading content from disk data becomes content

in both cases, this better aligns the module arguments with names across they ansible ecosystem

lpenz commented 6 years ago

Maybe we could also have a command for every standard netconf rpc besides the generic one.

For instance, in addition to

- name: Execute lock on target datastore
  netconf_rpc:
    rpc: lock
   data: <target><candidate/></target>

have something like:

- name: Execute lock on target datastore
  netconf_rpc_lock:
    target: candidate

There might even be additional value, as they can be smarter with regards to idempotency. In this case, the specific command can avoid locking candidate twice, while netconf_rpc can just be a passthru.

ganeshrn commented 6 years ago

@lpenz That's a good idea. There can be netconf_lockmodule.

netconf_lock module: The module will be used to lock/unlock the target datastore. Options:

target: Name of the configuration datastore to be locked or unlocked. Valid values are auto, running, startup or remote URL. auto, uses candidate and fallback to running. Default value is auto.
lock: This is a boolean value and identifies if the target datastore should be locked or unlocked. If the value is True, datastore is locked if not locked previously. If the value is False target datastore is unlocked if previously not unlocked. The default value is True.

- name: lock on candidate datastore
  netconf_lock:
    target: candidate

- name: unlock on candidate datastore
  netconf_lock:
    target: candidate
    lock: False

All the other standard Netconf operations can be executed using netconf_get and netconf_config.

Thoughts/suggestions?

ganeshrn commented 6 years ago

@privateip Update proposal incorporating your review comments.

privateip commented 6 years ago

@lpenz @ganeshrn I worry we are exposing way to much of the netconf api here. The playbook writer would need intimate knowledge of netconf if we externalize the locking mechanism into a separate module as opposed to building it into the module on an as needed basis. That is a bit of an anti-pattern for Ansible modules

ganeshrn commented 6 years ago

@privateip That makes sense, netconf_config have the lock/unlock mechanism build into the module. So, for now, will go ahead with netconf_config, netconf_get and netconf_rpc modules. If there is a valid use case for using lock/unlock independently we can revisit this later.

ganeshrn commented 6 years ago

Moving to agreed as discussed in irc meeting https://meetbot.fedoraproject.org/ansible-network/2018-04-25/network_working_group.2018-04-25-16.00.log.html

wisotzky commented 6 years ago

General remark: I think there is a dilemma here which needs to be solved. Depending on the actual use-case, it absolutely makes sense to expose NETCONF functionality to the playbook author. Having a netconf_lock module is a reasonable good example for this - and I would claim the same to be true about committing changes.

Example Scenario A playbook author wants to change the port configuration, for example enabling EFM OAM and LLDP on all NNI ports network-wide. To do this, typically you would first read-out the list of NNI ports using netconf_get and afterwards you would apply the change using netconf_config. In the logic of NETCONF (rfc6241) you want to make sure, that nobody changes the datastore from the read to the edit/commit action. For this, the lock must be taken before the read and released after committing the changes.

Current implementation: lock > get-config > unlock > lock > edit-config > commit > unlock

Desired implementation: lock > get-config > edit-config > commit > unlock

While it was impossible to do this using the original approach connection=local, with connection=NETCONF the connection is now persisted over multiple NETCONF related tasks and so can the lock.

Conclusion would be, that the playbook author might decide either to use explicit lock using a new netconf_lock module or implicit lock embedded in the (existing) netconf_get and netconf_config modules.

For those netconf related modules (get/config) it would basically mean, that we either need to expose an explicit "lock" attribute (default: True) - to prevent the module to take/release the lock (by setting lock: False). The other option would be, to check at the beginning of the playbook if the datastore is already locked by this netconf client - and to not do it again. Most flexible solution is likely a mix of both approaches, having an option lock with values True, False and Auto (default: auto).

wisotzky commented 6 years ago

Comments on proposed changes on netconf_config

Unclear, why mixing copy-config, delete-config and edit-config in the same module
Even it is theoretically allowed to use candidate as source for copy-config, I am struggling to understand use-cases for this. Instead of copying candidate to running, commit must be used.
We are missing the case, where for copy-config the config element is used as source element. In this case the complete configuration is replaced by the content of the config element.
If it is decided to mix edit-config and copy-config in the same module, why do we needed explicitly to set the copy option? Wouldn't it be enough defining the source attribute?
The current implementation of the save option should be changed. Proposal would be, to rename the option to persist - which actually means to persist configuration changes over reboots. By default, persist=True which means if startup is supported, copy the running datastore to startup after successful configuration change (only if changes included). In case startup is not supported, we assume it is persisted automatically. If persist=False, we would suppress saving the configuration to startup.
The backup option should have similar flexibility. It should only create a backup, if the playbook results in changing the target (=running) datastore
It would be wishful, if the edit-config would not just return okay - but also the list of changes which actually have been applied (created, updated, deleted attributes)
It might be wishful, having a dryrun option - which just pushes the configuration to candidate and applies validation and just returns the list of items to create/update/delete.

Also, as discussed in my previous comment, it might be beneficial to provide more control over discard-changes/commit. This would allow to serialize multiple tasks, having more efficient use of commit operation (typically expensive operation in most router OS implementations).

wisotzky commented 6 years ago

Request to add a new "netconf_audit" module

Functionality should be in-line with the "nso_verify" module. Technically, there is no need to have Cisco NSO for this. Question is, if the module needs to be YANG aware. Likely the advantage of being YANG-aware would be, that we could properly render data-types integer/string and we could distinguish better between list-keys and regular leaf attributes.

lpenz commented 6 years ago

We do have this tension between ansible's idempotency principle and exposing more of netconf's functionality. It might be worth making a distinction between "porcelain" modules that are more aligned with ansible and modules that allow access to lower level functionality. Maybe just adding a note that netconf_rpc and company are for advanced users is enough.

wisotzky commented 6 years ago

@lpenz @ganeshrn @privateip

Please have a check on #40198 as this also includes now support for netconf_rpc.

In my view, we don't need any other specific modules for tasks like lock, unlock, discard, commit, delete-config, copy-config, validate, kill-session, close-session and get-schema anymore. The new netconf_rpc module can easily be used for all of that cases, so having dedicated modules for those cases, might not really add value.

So the main remaining tasks are:

Improve documentation, especially by providing good examples
Update the netconf_config modules (in my eyes, should be renamed to netconf_set)
Adding missing integation/unit tests

Also per my comment from May 9th, we should work on a validation/audit function (similar to the Cisco NSO modules). The main idea is, to check config/state attributes for existence and values. This could be used for tasks like security audits, configuration resync audits and pre/post upgrade/migration validation.

ganeshrn commented 6 years ago

Implemented by: https://github.com/ansible/ansible/pull/44379 https://github.com/ansible/ansible/pull/39869 https://github.com/ansible/ansible/pull/40358

wisotzky commented 6 years ago

@ganeshrn, I've just done a quick review on PR44379 which contains the new implementation for netconf_config. I need to say, that my comment from May 9 is still absolutely valid. I do agree to @lpenz, that Ansible modules are not just to be built around APIs to expose all it functionality.

If the target of Ansible modules is to provide access in an abstracted way, making smart decisions and combining multiple API calls with reason - I don't see this to be happen with the new version of the netconf_config module. It is rather the opposite, providing full access to all the details of NETCONF. Combining all the 3 functions (edit-config, copy and delete) does not make much sense - as those operations are mutually exclusive. Therefore effort must be taken to keep the use-cases disjoint. Also people might struggle a lot with the fact, that we are not using exactly the same attribute names as in the NETCONF specification (rfc6241).

Having this said, combining some NETCONF RPCs makes sense: lock - edit-config running - copy running > startup - unlock lock - edit-config candidate - commit - copy running > startup - unlock lock - edit-config candidate - validate - discard - unlock

In those scenarios lock/unlock and copy running>startup are optional.

I am not convinced that such advanced use-cases exist for copy-config or delete. I've not really seen solutions, which execute (un)lock for executing the copy-config or delete operation.

At least as playbook author I would be confused about all the options provided in the new netconf_config module. So they must understand rfc6241 as a foundation and than our take on this. This is not an easy ask - so I would consider the changes made as not enough.

ganeshrn commented 6 years ago

Combining all the 3 functions (edit-config, copy and delete) does not make much sense - as those operations are mutually exclusive. Therefore effort must be taken to keep the use-cases disjoint.

This 3 opertaion are mutually exclusive in modules as well and handled it using mutually_exclusive clause within the module

Also people might struggle a lot with the fact, that we are not using exactly the same attribute names as in the NETCONF specification (rfc6241)

Which option are you referring to?

Having this said, combining some NETCONF RPCs makes sense: lock - edit-config running - copy running > startup - unlock lock - edit-config candidate - commit - copy running > startup - unlock lock - edit-config candidate - validate - discard - unlock

All these scenarios can be achieved using the existing module.

vparames86 commented 5 years ago

Currently in Ansible 2.7, There is no retry logic for datastore lock. In case of Cisco IOS-XE, There is a local synchronization process in the router which locks the datastore temporarily and we get an error message saying unable to lock the data store and the playbook fails. Adding a retry logic eliminates these exception scenarios. In case of nso they have implemented 6 retries with a delay of 3.

ansible / proposals

Generic Netconf modules for managing Netconf enabled network devices #104

Proposal:

Motivation

Problems

Solution proposal

Testing (optional)

Documentation (optional)