napalm-automation / napalm

Network Automation and Programmability Abstraction Layer with Multivendor support
Apache License 2.0

ios.py's commit behavior (fishing out lines / only applies differences) #1059

Open xtxerr opened 5 years ago

xtxerr commented 5 years ago

Hi,

I would like to understand the design decisions behind ios.py:

I wonder about the use of "show archive config incremental-diffs" to mimic a commit operation in config_merge, which ends up applying only the differences between the config supplied by the user and what already exists in the running config. Specifically, the code at https://github.com/napalm-automation/napalm/blob/develop/napalm/ios/ios.py#L406 fishes out the supplied configuration lines.

Consider the following in a running config: "ntp server 192.0.2.1"

When rolling out this same NTP server to multiple devices, where some already have this config and others do not, you could roll out the following two-line snippet to bring every device to the same state/configuration: "no ntp" followed by "ntp server 192.0.2.1"

These two example lines of configuration could deal with a heterogeneous state in the network, where different and unknown NTP settings are applied and one would just like to get everything to the same configuration state.

But what will happen now is that a device that already has "ntp server 192.0.2.1" in its running config will lose it after the commit, because only "no ntp" is going to be applied (the same thing will happen with anything that needs to be removed and applied again, e.g. router xxx, access-list, etc.).

To deal with this behavior one could try to find every occurrence of a configuration line on a device that shouldn't exist after a merge operation and remove each one individually. Not only is this very hard to do, it also unintuitively forces you not to use block-based configuration.

In another case, where you need to remove a whole configuration block or a line and then re-apply it (think of removing a whole ACL and then adding it again with different content), I cannot think of a way of making this work other than applying the commands in multiple operational steps. That leads to unnecessarily long downtimes for the configuration you are working on (between the iterative scp and copy operations; a single transaction with ios.py takes at least around 15s).

My ultimate question is: is it really the desired behavior that ios.py interferes with the configuration the user submits by fishing out lines that already exist somewhere in the running config?

ktbyers commented 5 years ago

@jesk78 If I am understanding you correctly, you are misunderstanding the behavior of NAPALM + IOS and a merge operation.

NAPALM + IOS does the following:

  1. Secure-copies the configuration file you created to the remote device (onto flash:/bootflash:). The file will be named merge_config.txt. This is also the case for configurations you supply inline in NAPALM (i.e. a /tmp file will be created on your NAPALM control system and that file will be transferred via SCP).

  2. On the commit_config() operation, merge_config.txt is merged into the running config using copy flash:/merge_config.txt running-config. It may or may not be flash:, but you get the idea.

  3. The command you referenced above, show archive config incremental-diffs, is only used for generating a best-effort diff (and realistically it is not very good). The configuration that is applied is what is specified in your merge configuration changes.
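
The three steps above can be sketched from the Python side roughly as follows. This is a minimal sketch, not the code in ios.py: the helper name `merge_and_commit` is hypothetical, and `device` is assumed to be an already-open NAPALM driver object. Only the driver methods `load_merge_candidate`, `compare_config`, `commit_config` and `discard_config` are NAPALM's own.

```python
def merge_and_commit(device, config_text, apply=False):
    """Stage a merge candidate, return the diff, then commit or discard.

    `device` is assumed to be an open NAPALM driver instance; this helper
    is illustrative and not part of NAPALM itself.
    """
    # Step 1: stage the candidate (for IOS, NAPALM transfers it as a file).
    device.load_merge_candidate(config=config_text)
    # Step 3: the diff is informational only; it does not change what is applied.
    diff = device.compare_config()
    if apply:
        # Step 2: the staged file is merged into the running config.
        device.commit_config()
    else:
        device.discard_config()
    return diff


# Hypothetical usage (host and credentials are placeholders):
# from napalm import get_network_driver
# device = get_network_driver("ios")("192.0.2.10", "admin", "secret")
# device.open()
# print(merge_and_commit(device, "ntp server 192.0.2.1", apply=True))
# device.close()
```

Returning the diff while leaving the commit decision to the caller keeps the best-effort diff separate from the configuration that is actually applied.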

Note, many of the issues you mentioned above go away if using a full-configuration replace. This includes problems with the quality of the diff.

Let me know if I misunderstood what you were asking, though.

xtxerr commented 5 years ago

I really thought that was the concept; I kind of got stuck looking into this. You are right, it's just used for the actual diff.

The issue seems to be that IOS is the guilty one, filtering out the commands that already exist when doing the "copy flash:merge.conf running-config" operation... weird

So the behavior is still the same, it's just not ios.py's fault.

ktbyers commented 5 years ago

@jesk78 Do you mean the issue is with the diff, or are you seeing an actual issue with the merge behavior?

If it is with the merge behavior, can you give me an example?

I don't think IOS does any filtering here; it merges what you tell it to merge.

xtxerr commented 5 years ago

Take my ntp example; this happens on Everest and Fuji. It can be tested manually without NAPALM:

Here on an ASR902 with the releases stated above (16.5.1 and 16.9), the device ends up with no ntp configuration at all.

ktbyers commented 5 years ago

I am not seeing an issue:

Cisco IOS XE Software, Version 16.08.01
Cisco IOS Software [Fuji], ISR Software (ARMV8EB_LINUX_IOSD-UNIVERSALK9_IAS-M), Version 16.8.1, RELEASE SOFTWARE (fc3)
device.load_merge_candidate(config="ntp server 192.0.2.1")
ipdb> p device.compare_config()                                                                                                           
'+ntp server 192.0.2.1'
ipdb> !device.commit_config()   

End state:

cisco3#show run | inc ntp
ntp server 130.126.24.24
ntp server 152.2.21.1
ntp server 192.0.2.1

xtxerr commented 5 years ago

You are missing the "no ntp" before it, I guess.

ktbyers commented 5 years ago

Okay, I don't think that is Cisco IOS-XE being smart. I think it is probably just a Cisco bug.

I also would not generalize it beyond this one example (unless you have data from other cases).

cisco3#more flash:/merge_config.txt
no ntp
ntp server 192.0.2.1
cisco3#show run | inc ntp                         
ntp server 192.0.2.1

cisco3#copy flash:/merge_config.txt running-config
27 bytes copied in 0.008 secs (3375 bytes/sec)

# Should be 'ntp server 192.0.2.1' at this point
cisco3#show run | inc ntp                         

# Behaves differently the second time you run it (at this point there is no ntp in the config)
cisco3#copy flash:/merge_config.txt running-config
27 bytes copied in 0.028 secs (964 bytes/sec)

cisco3#show run | inc ntp                         
ntp server 192.0.2.1 
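
Because the transcript above shows the end state differing from what the merge file specifies, a post-commit check is one way to catch this case. The following is a minimal sketch under stated assumptions: the helper name `missing_after_commit` is hypothetical, `device` is assumed to be an open NAPALM driver, and the comparison is a plain exact-line match against the output of NAPALM's `get_config(retrieve="running")`.

```python
def missing_after_commit(device, expected_lines):
    """Return the expected lines that are absent from the running config.

    Illustrative helper, not part of NAPALM. Matching is exact per line
    after stripping surrounding whitespace.
    """
    running = device.get_config(retrieve="running")["running"]
    present = {line.strip() for line in running.splitlines()}
    return [line for line in expected_lines if line.strip() not in present]


# Hypothetical usage after device.commit_config():
# if missing_after_commit(device, ["ntp server 192.0.2.1"]):
#     print("running config does not contain the expected ntp server")
```

An exact-line match is deliberately simple; it would not notice lines that the device rewrites into a different canonical form.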

network-shark commented 3 years ago

I can confirm this behaviour on 16.09.06

Maybe I am missing something, but aren't people making changes like this all the time?

@jesk78 Did you create a bug report with Cisco? I did not expect that... What do you think, @ktbyers?

csr1000v-lab#tclsh
csr1000v-lab(tcl)#puts [open "flash:no_ntp.txt" w+] {
+>no ntp
+>ntp server 192.0.2.1
+>}
csr1000v-lab(tcl)#tclquit
csr1000v-lab#more flash:no_ntp.txt
no ntp
ntp server 192.0.2.1

csr1000v-lab#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
csr1000v-lab(config)#ntp server 192.0.1.1
csr1000v-lab(config)#
csr1000v-lab#sh run | inc ntp
ntp server 192.0.1.1

csr1000v-lab#copy flash:no_ntp.txt running-config
Destination filename [running-config]?
30 bytes copied in 0.005 secs (6000 bytes/sec)
csr1000v-lab#sh run | inc ntp

### NTP is missing 

csr1000v-lab#copy flash:no_ntp.txt running-config
Destination filename [running-config]?
30 bytes copied in 0.007 secs (4286 bytes/sec)
csr1000v-lab#sh run | inc ntp
ntp server 192.0.2.1

### Magically it is back 

csr1000v-lab#show version | inc 16.09.
Cisco IOS XE Software, Version 16.09.06

Workaround:

csr1000v-lab(tcl)#puts [open "flash:remove_old_ntp_with_ip.txt" w+] {
+>no ntp server 192.0.1.1
+>ntp server 192.0.2.1
+>}
csr1000v-lab(tcl)#tclquit
csr1000v-lab#more flash:remove_old_ntp_with_ip.txt
no ntp server 192.0.1.1
ntp server 192.0.2.1

csr1000v-lab#sh run | inc ntp
csr1000v-lab#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
csr1000v-lab(config)#ntp server 192.0.1.1
csr1000v-lab(config)#
csr1000v-lab#sh run | inc ntp
ntp server 192.0.1.1
csr1000v-lab#copy flash:remove_old_ntp_with_ip.txt running-config
Destination filename [running-config]?
47 bytes copied in 0.004 secs (11750 bytes/sec)
csr1000v-lab#sh run | inc ntp
ntp server 192.0.2.1
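
The workaround above replaces the blanket "no ntp" with a negation for each specific server. Generating such a merge file from a device's running config could look roughly like this. It is a minimal sketch: the function name `build_ntp_merge` and the regular expression are illustrative assumptions, and only the plain "ntp server <address>" form is handled (no vrf, key or prefer options).

```python
import re

# Assumption: only the simple "ntp server <address>" form appears.
NTP_SERVER_RE = re.compile(r"^ntp server (\S+)", re.MULTILINE)


def build_ntp_merge(running_config, desired_servers):
    """Build merge lines that remove unwanted NTP servers one by one.

    Emits "no ntp server X" for each configured server that is not
    desired, then "ntp server Y" for each desired server not yet present.
    """
    current = NTP_SERVER_RE.findall(running_config)
    lines = [f"no ntp server {ip}" for ip in current if ip not in desired_servers]
    lines += [f"ntp server {ip}" for ip in desired_servers if ip not in current]
    return "\n".join(lines)


print(build_ntp_merge("ntp server 192.0.1.1\n", ["192.0.2.1"]))
# prints:
# no ntp server 192.0.1.1
# ntp server 192.0.2.1
```

Servers that are already configured and desired are left untouched, so the generated file never removes and re-adds the same line within one copy operation.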