nornir-automation / nornir

Pluggable multi-threaded framework with inventory management to help operate collections of devices
https://nornir.readthedocs.io/
Apache License 2.0

Getting Timeout while using netmiko_file_transfer #707

Closed sindhujit1 closed 3 years ago

sindhujit1 commented 3 years ago

I am getting a timeout while transferring an image to an Arista device.

netmiko_file_transfer Timeout waiting for scp response

Any idea how we can tweak this code to allow a longer timeout?

result = nr_arista.run(
    task=netmiko_file_transfer,
    source_file=source_file,
    dest_file=BinFile,
    direction='put',
    num_workers=len(NodeSwitch),
)
ktbyers commented 3 years ago

@sindhujit1 Please post the full exception stack trace.

sindhujit1 commented 3 years ago

Here it is:


Traceback (most recent call last):
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/paramiko/channel.py", line 699, in recv
    out = self.in_buffer.read(nbytes, self.timeout)
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/paramiko/buffered_pipe.py", line 164, in read
    raise PipeTimeout()
paramiko.buffered_pipe.PipeTimeout

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/scp.py", line 356, in _recv_confirm
    msg = self.channel.recv(512)
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/paramiko/channel.py", line 701, in recv
    raise socket.timeout()
socket.timeout

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/nornir/core/task.py", line 67, in start
    r = self.task(self, **self.params)
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/nornir/plugins/tasks/networking/netmiko_file_transfer.py", line 28, in netmiko_file_transfer
    net_connect, source_file=source_file, dest_file=dest_file, **kwargs
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/netmiko/scp_functions.py", line 104, in file_transfer
    verifyspace_and_transferfile(scp_transfer)
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/netmiko/scp_functions.py", line 20, in verifyspace_and_transferfile
    scp_transfer.transfer_file()
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/netmiko/scp_handler.py", line 304, in transfer_file
    self.put_file()
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/netmiko/scp_handler.py", line 317, in put_file
    self.scp_conn.scp_transfer_file(self.source_file, destination)
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/netmiko/scp_handler.py", line 40, in scp_transfer_file
    self.scp_client.put(source_file, dest_file)
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/scp.py", line 166, in put
    self._send_files(files)
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/scp.py", line 271, in _send_files
    self._send_file(fl, name, mode, size)
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/scp.py", line 297, in _send_file
    self._recv_confirm()
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/scp.py", line 358, in _recv_confirm
    raise SCPException('Timeout waiting for scp response')
scp.SCPException: Timeout waiting for scp response
sindhujit1 commented 3 years ago

I saw this post https://github.com/ktbyers/netmiko/issues/1254 and added the following to python3.6/site-packages/scp.py:

with SCPClient(transport,socket_timeout=40) as client:

But that did not work.

sindhujit1 commented 3 years ago

The binary file is EOS-4.25.4M.swi, and its size is close to 1 GB.

ktbyers commented 3 years ago

@sindhujit1 I assume your VTY timeouts have been set to 0 (so the SSH/SCP session never expires)?

sindhujit1 commented 3 years ago

Well, we can copy the file fine using WinSCP, placing it directly under /mnt/flash on the Arista devices.

idle-timeout is set to 15 in our configs.

ktbyers commented 3 years ago

I would set it to zero (i.e. never expire) and re-test it.

Netmiko's error is saying it was expecting remote data and it never came.
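To illustrate what "expecting remote data and it never came" means, here is a minimal standard-library sketch of a receive with a timeout. It is a simplified model of the behaviour seen in the traceback, not the actual scp.py code; `ScpTimeout`, `wait_for_confirm`, and `demo` are hypothetical names used only for this illustration.

```python
import socket


class ScpTimeout(Exception):
    """Hypothetical stand-in for scp.SCPException in this sketch."""


def wait_for_confirm(channel, timeout):
    """Wait up to `timeout` seconds for the peer's confirmation bytes."""
    channel.settimeout(timeout)
    try:
        return channel.recv(512)
    except socket.timeout:
        # Nothing arrived within the window, so give up, as in the traceback.
        raise ScpTimeout("Timeout waiting for scp response")


def demo():
    """Show one timed-out wait and one successful wait on a local socket pair."""
    local, remote = socket.socketpair()
    try:
        # The remote side stays silent, so the short wait expires.
        try:
            wait_for_confirm(local, 0.05)
            timed_out = False
        except ScpTimeout:
            timed_out = True
        # Once the remote side answers, the same call succeeds.
        remote.sendall(b"\x00")
        reply = wait_for_confirm(local, 1.0)
        return timed_out, reply
    finally:
        local.close()
        remote.close()
```

A longer timeout only changes how long the sender is willing to wait for that reply; it does not change what the remote side sends.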

sindhujit1 commented 3 years ago

I do not think it's a VTY issue. I think the image is copying fine, but I get that error at the end of the file transfer. It seems the socket is not being closed at the end of the session; I need to somehow get a reference to the socket object and force-close it.

ktbyers commented 3 years ago

@sindhujit1 I am not seeing that based on the exception output i.e. Netmiko's error is this:

  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/scp.py", line 297, in _send_file
    self._recv_confirm()
  File "/apps/netadc3/.venvs3/netadc3/lib64/python3.6/site-packages/scp.py", line 358, in _recv_confirm
    raise SCPException('Timeout waiting for scp response')
scp.SCPException: Timeout waiting for scp response

I am going to go ahead and close this...no point in you and I arguing on the troubleshooting process.

sindhujit1 commented 3 years ago

We cannot set the VTY timeout to zero because it's a security issue.

I did find an issue with the netmiko library.

I added the socket_timeout value to netmiko/scp_handler.py and it worked

self.scp_client = scp.SCPClient(self.scp_conn.get_transport(),socket_timeout=180)

I'd advise raising a PR to fix this.

ktbyers commented 3 years ago

But then you said adjusting the socket_timeout didn't work (when you tried it earlier)?

ktbyers commented 3 years ago

Can you post the full working solution (i.e. the entire set of code that works properly)?

ktbyers commented 3 years ago

Also please take the code out of Nornir and use straight Netmiko testing (i.e. we should debug Nornir and Nornir plugin issues separately from a Netmiko issue).
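A straight Netmiko test, without Nornir, might look roughly like the sketch below. It assumes a Netmiko 3.x release whose `file_transfer()` accepts `socket_timeout`; `build_transfer_kwargs` and `transfer_image` are hypothetical helper names, and `/mnt/flash` is the path mentioned earlier in this thread.

```python
def build_transfer_kwargs(source_file, dest_file, socket_timeout=180.0):
    """Collect the keyword arguments for netmiko.file_transfer()."""
    return {
        "source_file": source_file,
        "dest_file": dest_file,
        "file_system": "/mnt/flash",  # path taken from this thread
        "direction": "put",
        "socket_timeout": socket_timeout,  # assumes Netmiko 3.x supports this
    }


def transfer_image(device, source_file, dest_file, socket_timeout=180.0):
    """Open a Netmiko session to `device` and push the image over SCP."""
    # Imported lazily so the helper above works without Netmiko installed.
    from netmiko import ConnectHandler, file_transfer

    net_connect = ConnectHandler(**device)
    try:
        kwargs = build_transfer_kwargs(source_file, dest_file, socket_timeout)
        return file_transfer(net_connect, **kwargs)
    finally:
        net_connect.disconnect()
```

Here `device` would be the usual Netmiko connection dictionary, for example with `device_type` set to `arista_eos` plus your own host and credentials. If this fails in the same way, the problem is in Netmiko or the device rather than in Nornir.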

sindhujit1 commented 3 years ago

I think I was adding the socket_timeout in the wrong place initially. Adding it in the right place fixed it.

ktbyers commented 3 years ago

Okay, so you have a working fix then using socket_timeout?

Note, socket_timeout is an argument to netmiko_file_transfer so you should be able to just pass it (i.e. you shouldn't need to change the Netmiko library to adjust this).

sindhujit1 commented 3 years ago

Well, everything is the same. I just use this for the file transfer:

nr = InitNornir(
    config_file=project_path + "/arista/config.yaml",
    logging={"enabled": False},
    core={"raise_on_error": True},
)
nr_arista = nr.filter(F(hostname__in=device_urlArray))
nornir_set_creds(nr_arista, fabricList, device_urlArray)
#print(NodeSwitch_tmp,"NodeSwitch_tmp")
source_file = project_path + "/media/" + BinFile
result = nr_arista.run(
    task=netmiko_file_transfer,
    source_file=source_file,
    dest_file=BinFile,
    direction='put',
    num_workers=len(NodeSwitch_tmp),
)

And the only change I made is under netmiko/scp_handler.py:

self.scp_client = scp.SCPClient(self.scp_conn.get_transport(),socket_timeout=180)

ktbyers commented 3 years ago

You should be able to:

result = nr_arista.run(
    task=netmiko_file_transfer,
    source_file=source_file,
    dest_file=BinFile,
    direction='put',
    num_workers=len(NodeSwitch_tmp),
    socket_timeout=180,
)
ktbyers commented 3 years ago

Also which version of Nornir are you using?

sindhujit1 commented 3 years ago

nornir==2.2.0

I don't believe socket_timeout was added to that version.

netmiko_file_transfer file_transfer() got an unexpected keyword argument 'socket_timeout'
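One way to check in advance whether an installed function accepts a given keyword is to inspect its signature. This is a generic standard-library sketch; `accepts_kwarg` is a hypothetical helper, not part of Nornir or Netmiko.

```python
import inspect


def accepts_kwarg(func, name):
    """Return True if `func` can be called with keyword argument `name`."""
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    return any(
        # A **kwargs parameter accepts any keyword at this level.
        param.kind is inspect.Parameter.VAR_KEYWORD
        or (param.name == name and param.kind in keyword_kinds)
        for param in inspect.signature(func).parameters.values()
    )
```

Because a wrapper that takes `**kwargs` accepts anything and only fails further down, the check is most useful against the innermost function, for example `accepts_kwarg(file_transfer, "socket_timeout")` after `from netmiko import file_transfer`.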

ktbyers commented 3 years ago

Nornir 2.x is no more (i.e. any fix would be in Nornir 3.x and its associated plugins).

You would need to use Nornir 3.x and you should be using Netmiko 3.4.0. I checked in the nornir-netmiko plugin for Nornir 3.x and the argument should pass through.

Regards, Kirk

sindhujit1 commented 3 years ago

I get an error now after upgrading to nornir 3.1

__init__() got an unexpected keyword argument 'num_workers'

ktbyers commented 3 years ago

Nornir 3.x is backwards incompatible with Nornir 2.x. A number of things changed:

https://github.com/twin-bridges/nornir_course/blob/master/nornir3_changes.md#runners-and-num_workers-changes
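In Nornir 3.x, `num_workers` is no longer a core setting or a `run()` argument; it moves into the runner options. A rough sketch of the transfer from this thread in that style follows, assuming nornir 3.x and the nornir-netmiko plugin are installed; `threaded_runner` and `push_image` are hypothetical names used for illustration.

```python
def threaded_runner(num_workers):
    """Nornir 3.x runner settings; replaces the 2.x num_workers argument."""
    return {"plugin": "threaded", "options": {"num_workers": num_workers}}


def push_image(config_file, hostnames, source_file, dest_file, socket_timeout=180):
    """Push one image to the given hosts, one worker thread per host."""
    # Lazy imports: these need nornir 3.x and nornir-netmiko.
    from nornir import InitNornir
    from nornir.core.filter import F
    from nornir_netmiko.tasks import netmiko_file_transfer

    nr = InitNornir(
        config_file=config_file,
        runner=threaded_runner(len(hostnames)),
    )
    targets = nr.filter(F(hostname__in=hostnames))
    return targets.run(
        task=netmiko_file_transfer,
        source_file=source_file,
        dest_file=dest_file,
        direction="put",
        socket_timeout=socket_timeout,  # passed through to Netmiko
    )
```

Any `num_workers` entry left under `core` in config.yaml would also need to be removed or moved to the runner section.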

sindhujit1 commented 3 years ago

After installing the different dependencies and adding the socket_timeout, I am no longer getting that error in Nornir 3!

Thank you, sir!

ktbyers commented 3 years ago

@sindhujit1 Okay, great :-)