ktbyers / netmiko

Multi-vendor library to simplify Paramiko SSH connections to network devices
MIT License

Disconnecting from device takes way too long. #3436

Open nikuizzz opened 6 months ago

nikuizzz commented 6 months ago

Description of Issue/Question

Hello.

I'm using netmiko to do some network automation on Extreme ERS devices. As my fleet consists of more than 300 switches, reducing the execution time is crucial to achieving an acceptable user experience.

My code has three parts: connecting to the switch, sending commands, and disconnecting.

The connection part takes less than 2 seconds. Sending commands takes a bit longer (depending on the number of commands), generally between 2 and 6 seconds. The problem is the disconnecting part: it can sometimes take up to 7 seconds, which is quite time-consuming.

I'm using the built-in disconnect function to close the connection. Is there a way to reduce the time it takes to kill the SSH session? I added some timers to the logs shown below so that you can see what I'm talking about.

Thank you in advance for your help!

Setup

Netmiko version

netmiko==4.3.0

Netmiko device_type

avaya_ers

Steps to Reproduce the Issue

Netmiko logs

2024-05-22 10:59:36,739 - INFO - DISCONNECTING
2024-05-22 10:59:36,739 - DEBUG - write_channel: b'\n'
2024-05-22 10:59:36,839 - DEBUG - read_channel: 

2024-05-22 10:59:36,940 - DEBUG - read_channel: 3626GTS-PWR+(config)#
2024-05-22 10:59:37,040 - DEBUG - read_channel: 
2024-05-22 10:59:39,040 - DEBUG - read_channel: 
2024-05-22 10:59:39,041 - DEBUG - write_channel: b'\n'
2024-05-22 10:59:39,141 - DEBUG - read_channel: 

2024-05-22 10:59:39,241 - DEBUG - read_channel: 3626GTS-PWR+(config)#
2024-05-22 10:59:39,341 - DEBUG - read_channel: 
2024-05-22 10:59:41,342 - DEBUG - read_channel: 
2024-05-22 10:59:41,342 - DEBUG - write_channel: b'end\n'
2024-05-22 10:59:41,342 - DEBUG - read_channel: 
2024-05-22 10:59:41,353 - DEBUG - read_channel: 
2024-05-22 10:59:41,363 - DEBUG - read_channel: 
2024-05-22 10:59:41,373 - DEBUG - read_channel: 
2024-05-22 10:59:41,384 - DEBUG - read_channel: 
2024-05-22 10:59:41,394 - DEBUG - read_channel: end

2024-05-22 10:59:41,394 - DEBUG - Pattern found: (end) end
2024-05-22 10:59:41,394 - DEBUG - read_channel: 
2024-05-22 10:59:41,404 - DEBUG - read_channel: 
2024-05-22 10:59:41,415 - DEBUG - read_channel: 
2024-05-22 10:59:41,425 - DEBUG - read_channel: 
2024-05-22 10:59:41,435 - DEBUG - read_channel: 
2024-05-22 10:59:41,446 - DEBUG - read_channel: 
2024-05-22 10:59:41,456 - DEBUG - read_channel: 
2024-05-22 10:59:41,466 - DEBUG - read_channel: 
2024-05-22 10:59:41,477 - DEBUG - read_channel: 
2024-05-22 10:59:41,487 - DEBUG - read_channel: 
2024-05-22 10:59:41,497 - DEBUG - read_channel: 
2024-05-22 10:59:41,508 - DEBUG - read_channel: 
2024-05-22 10:59:41,518 - DEBUG - read_channel: 3626GTS-PWR+#
2024-05-22 10:59:41,518 - DEBUG - Pattern found: (#.*) 
3626GTS-PWR+#
2024-05-22 10:59:41,519 - DEBUG - write_channel: b'\n'
2024-05-22 10:59:41,619 - DEBUG - read_channel: 

2024-05-22 10:59:41,719 - DEBUG - read_channel: 3626GTS-PWR+#
2024-05-22 10:59:41,820 - DEBUG - read_channel: 
2024-05-22 10:59:43,820 - DEBUG - read_channel: 
2024-05-22 10:59:43,820 - DEBUG - exit_config_mode: end
3626GTS-PWR+#
2024-05-22 10:59:43,820 - DEBUG - write_channel: b'exit\n'
2024-05-22 10:59:43,820 - INFO - DISCONNECTING TIME - 7.1
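
To see where the 7.1 seconds go, the timestamps above can be scanned for long pauses between consecutive log lines. The following is a minimal sketch: `find_gaps` and `LOG_EXCERPT` are hypothetical names, and the excerpt contains only the six timestamps surrounding the pauses, not the full log.

```python
from datetime import datetime

# Six timestamped lines copied from the log above (excerpt only).
LOG_EXCERPT = """\
2024-05-22 10:59:37,040 - DEBUG - read_channel:
2024-05-22 10:59:39,040 - DEBUG - read_channel:
2024-05-22 10:59:39,341 - DEBUG - read_channel:
2024-05-22 10:59:41,342 - DEBUG - read_channel:
2024-05-22 10:59:41,820 - DEBUG - read_channel:
2024-05-22 10:59:43,820 - DEBUG - read_channel:
"""


def find_gaps(lines, threshold=1.0):
    """Return (gap_seconds, previous_time, current_time) for pauses above threshold."""
    gaps = []
    previous = None
    for line in lines:
        try:
            # The first 23 characters hold "YYYY-MM-DD HH:MM:SS,mmm".
            current = datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")
        except ValueError:
            continue  # continuation line without a timestamp, e.g. a prompt
        if previous is not None:
            gap = (current - previous).total_seconds()
            if gap > threshold:
                gaps.append((gap, previous, current))
        previous = current
    return gaps


for gap, before, after in find_gaps(LOG_EXCERPT.splitlines()):
    print(f"{gap:.3f}s pause between {before.time()} and {after.time()}")
```

On this excerpt the scan reports three pauses of about 2 seconds each, so roughly 6 of the 7.1 seconds are spent waiting after an empty read. That each pause is a read waiting on a timeout is an interpretation of the log, not something the log states.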

Relevant Python code

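import time

# Time the disconnect; `ers` is the open Netmiko connection and
# `_log` is a logging helper defined elsewhere.
tmp_start = time.time()
_log("DISCONNECTING")
ers.disconnect()
_log(f"DISCONNECTING TIME - {round(time.time() - tmp_start, 1)}")

ktbyers commented 6 months ago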

You probably would need to look at the extreme_ers code and see if there are ways to improve the disconnect process.

From your logs above, it looks like you are disconnecting while in config mode. I would probably exit config mode first and then disconnect.
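
A minimal sketch of that suggestion, timing each step separately. It assumes `conn` is an already connected Netmiko connection object and uses its `check_config_mode()`, `exit_config_mode()` and `disconnect()` methods; the function name `leave_config_then_disconnect` is hypothetical, and whether this actually saves time depends on the driver.

```python
import time


def leave_config_then_disconnect(conn):
    """Exit config mode explicitly, then disconnect; return seconds spent per step."""
    timings = {}

    start = time.perf_counter()
    # Only send the exit-config command if the device is really in config mode.
    if conn.check_config_mode():
        conn.exit_config_mode()
    timings["exit_config"] = time.perf_counter() - start

    start = time.perf_counter()
    conn.disconnect()
    timings["disconnect"] = time.perf_counter() - start

    return timings
```

Logging the two values separately shows how much of the total is spent leaving config mode and how much in the disconnect itself.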

Also, I assume you are using threads or Nornir (which uses threads). Having 300 devices shouldn't really matter that much.
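
For readers not yet using threads, here is a rough sketch with the standard library's `ThreadPoolExecutor`. The names `run_on_all` and `task` are hypothetical: `task` stands for whatever per-device function connects, sends commands and disconnects, and the worker count of 20 is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_on_all(hosts, task, max_workers=20):
    """Run task(host) for every host in a thread pool; return {host: result or exception}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its host so results can be labelled.
        futures = {pool.submit(task, host): host for host in hosts}
        for future in as_completed(futures):
            host = futures[future]
            try:
                results[host] = future.result()
            except Exception as exc:
                # Keep going when one device fails; record the error instead.
                results[host] = exc
    return results
```

If each device took about 10 seconds end to end, 300 devices would need about 3000 seconds sequentially but only about 150 seconds with 20 workers, which is why a few seconds of disconnect time per device matters much less once the work runs in parallel.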

nikuizzz commented 6 months ago

> You probably would need to look at the extreme_ers code and see if there are ways to improve the disconnect process.
>
> From your logs above, it looks like you are disconnecting while in config mode. I would probably exit config mode first and then disconnect.
>
> Also, I assume you are using threads or Nornir (which uses threads). Having 300 devices shouldn't really matter that much.

Indeed, the fact that I was trying to close the session while being in config mode was adding a few extra seconds to the execution. I managed to reduce it to 3-4 seconds, which still seems to be pretty long.

While exploring the source code of the extreme_ers module and, more generally, cisco_base_connection and base_connection (the modules that extreme_ers is based on), I noticed that there is an SSH cleanup function that is called every time before the connection is closed. It seems that this is the part that takes a few extra seconds.

So, my question is: what is the purpose of the cleanup function? Would replacing it with a simple send_command("exit") work to close the session, or are there any drawbacks to doing so?

Thank you for your reply!

ktbyers commented 6 months ago

cleanup() is what actually exits the SSH session and, if needed, exits config mode first.

On a Cisco IOS-like device, you generally can't just send exit: if you are in config mode, exit only takes you out of config mode (or out of a sub-config mode such as "interface config mode").

In other words, a single exit might not terminate your SSH session.
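
The behavior can be illustrated with a toy model of IOS-like CLI modes. This is only a sketch for intuition, not Netmiko code: the function name `apply_command` and the mode names are made up for the example.

```python
def apply_command(command, modes):
    """Return the mode stack after one command; an empty stack means the session closed."""
    modes = list(modes)  # work on a copy so the caller's list is untouched
    if not modes:
        return modes  # session already closed
    if command == "exit":
        modes.pop()  # leave only the innermost mode
    elif command == "end":
        del modes[1:]  # jump straight back to the top-level exec mode
    return modes


# Starting inside interface config mode, each "exit" drops only one level.
state = ["exec", "config", "interface"]
for _ in range(3):
    state = apply_command("exit", state)
    print(state if state else "session closed")
```

In this model it takes three exit commands to close a session that starts in interface config mode, whereas end followed by exit closes it in two, which matches the end and exit writes visible in the log above.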