ktbyers / netmiko

Multi-vendor library to simplify Paramiko SSH connections to network devices
MIT License

HP Procurve / Aruba - Random timeouts incorrect prompt #1378

Closed twripley closed 5 years ago

twripley commented 5 years ago

Hello,

I wanted to preface this "issue" by saying I'm very new to Python and GitHub, so please let me know if I'm doing anything wrong.

Using Netmiko 2.4.2 on Windows Server 2016, Python 3.7.4. Connecting to about 250 Aruba 2530 switches running 16.06 and 16.08 code. About 10-15 devices will fail with a NetMikoTimeoutException. I'm attaching a debugging log for one device.

I notice that find_prompt() comes back with the prompt twice. The prompt for this device is "CTR57-2530-GB1#", but you'll see in the log that it sees it twice on the same line.

For some reason Netmiko also sends a "logout\n", then a "y" before sending any of my commands.

I can't explain it. Like I say, about 95% of my devices work fine, but a random 5% fail.

Any help is appreciated.

Thanks,
Tyler

Attachment: netmiko-debug.log

ktbyers commented 5 years ago

Can you post your code?

And the full exception stack of a failure?

twripley commented 5 years ago

Here is the code with passwords scrubbed.

Attachment: aruba-switch-report.txt

twripley commented 5 years ago

I'm trying to get the full exception stack. Can you let me know the best way to get that? I'm using threading and it doesn't give me the full exception stack when it fails.

I've added this line to my try/except rules:

    except netmiko.ssh_exception.NetMikoTimeoutException as exc_error:
        console.error(f"Connection timed-out to: {this_switch}")
        console.error(f"Exception: {exc_error}")

All it prints is this:

    22:03:53 - ERROR - Connection timed-out to: 10.11.101.15
    22:03:53 - ERROR - Exception: Timed-out reading channel, data not available.

Is there a better way to get the full exception from my code?
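One common way to get the full stack from inside a worker thread is to format the traceback in the `except` block, since printing the exception object only shows its message. The sketch below is a rough illustration, not the code from the attachment: `fake_task`, `worker`, and `failures` are hypothetical names, and the stand-in task simply raises a built-in `TimeoutError` instead of talking to a switch.

```python
import logging
import threading
import traceback

logging.basicConfig(format="%(asctime)s - %(levelname)s - %(message)s")
console = logging.getLogger("console")

# host -> full traceback text, filled in by the worker threads
failures = {}


def fake_task(this_switch):
    # Hypothetical stand-in for the real Netmiko work; it always fails.
    raise TimeoutError("Timed-out reading channel, data not available.")


def worker(this_switch, task=fake_task):
    try:
        task(this_switch)
    except Exception:
        # format_exc() returns the complete stack of the exception
        # currently being handled, not just its message.
        failures[this_switch] = traceback.format_exc()
        console.error(
            "Connection failed to: %s\n%s", this_switch, failures[this_switch]
        )


threads = [
    threading.Thread(target=worker, args=(ip,)) for ip in ("10.11.101.15",)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

If `console` is a standard `logging` logger, calling `console.exception(...)` inside the `except` block should log the same stack without the explicit `traceback` call.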

ktbyers commented 5 years ago

Try setting global_delay_factor=2 (or 4).

This is an argument to ConnectHandler.
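As a rough sketch, passing the argument might look like the following. The host and credentials are placeholders, `hp_procurve` is assumed to be the device type in use (the thread does not show the actual connection code), and `build_device` and `connect` are hypothetical helper names.

```python
def build_device(host, username, password, delay_factor=4):
    """Return ConnectHandler keyword arguments for one switch."""
    return {
        "device_type": "hp_procurve",  # assumed device type for Aruba 2530
        "host": host,
        "username": username,
        "password": password,
        # Multiplies Netmiko's internal delays; larger values give
        # slow devices more time to respond.
        "global_delay_factor": delay_factor,
    }


def connect(device):
    # Imported here so the dictionary helper works without Netmiko installed.
    from netmiko import ConnectHandler

    return ConnectHandler(**device)


# Placeholder values only; replace with real connection details.
device = build_device("192.0.2.10", "admin", "secret")
```

Calling `connect(device)` would then open the session with the larger delay factor applied to every operation on that connection.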

twripley commented 5 years ago

Hmmm. I was using a delay factor of 2. Just switched to 4 and all my devices completed successfully.

Is this a known issue with the Procurve devices? I don't see this issue on my Cisco routers.

carlmontanari commented 5 years ago

@furgussen I don't think it's necessarily a "known issue" -- some devices are just slower, and some connections are just slower, so playing with the delay factor is sometimes necessary to slow Netmiko down enough to tolerate the latency/slowness. Glad it's working, going to go ahead and close this out!