Open TheBirdsNest opened 3 years ago
P.S. The reason we set keep_alive:False
is a Cisco CVE in which the implementation of SSH Global Requests causes an FSM exception and crashes devices.
I believe I have found the issue in the salt.utils.napalm module.
I can do a PR to the SaltStack repo if that's the right place for it?
>>> import napalm.base as napalm_base
>>> from napalm_base.exceptions import ConnectionClosedException
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'napalm_base'
>>> from napalm.base.exceptions import ConnectionClosedException
It appears napalm_base is never imported (correctly?), so HAS_CONN_CLOSED_EXC_CLASS
is never set to True, which causes this reconnect clause to be ignored in the call()
function:
elif retry and HAS_CONN_CLOSED_EXC_CLASS and isinstance(error, ConnectionClosedException):
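To illustrate why that clause is skipped, here is a minimal, self-contained sketch. It does not use Salt or NAPALM: FakeDevice, the stand-in ConnectionClosedException class and the has_conn_closed_exc_class parameter are all hypothetical, and the call() below is a simplified imitation of the real function, not its actual code.

```python
# Stand-in for napalm.base.exceptions.ConnectionClosedException (hypothetical here).
class ConnectionClosedException(Exception):
    pass


class FakeDevice:
    """Hypothetical device whose session starts out closed."""

    def __init__(self):
        self.is_open = False

    def open(self):
        self.is_open = True

    def get_facts(self):
        if not self.is_open:
            raise ConnectionClosedException("session closed")
        return {"model": "C891-24X/K9"}


def call(device, method, retry=True, has_conn_closed_exc_class=True):
    """Run a device method, reconnecting once if the session was closed."""
    try:
        return {"result": True, "out": getattr(device, method)()}
    except Exception as error:
        # Same condition as the clause quoted above, with the flag as a parameter.
        if retry and has_conn_closed_exc_class and isinstance(error, ConnectionClosedException):
            device.open()
            return call(device, method, retry=False,
                        has_conn_closed_exc_class=has_conn_closed_exc_class)
        # No reconnect: the caller gets an empty result.
        return {"result": False, "out": None, "comment": str(error)}


with_flag = call(FakeDevice(), "get_facts", has_conn_closed_exc_class=True)
without_flag = call(FakeDevice(), "get_facts", has_conn_closed_exc_class=False)
print(with_flag["out"])
print(without_flag["out"])
```

With the flag false, the closed session is never reopened and the result is empty, which would match the grains coming back as 'None'.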
Tested by manually importing:
from napalm.base.exceptions import ConnectionClosedException
and it's working well now:
/run # salt -I 'realm:WSC' grains.get model
c40988df-1a86-4ff0-bf8d-e018cc6c55bd:
C891-24X/K9
-------------------------------------------
Summary
-------------------------------------------
# of minions targeted: 1
# of minions returned: 1
# of minions that did not return: 0
# of minions with errors: 0
-------------------------------------------
/run # salt -I 'realm:WSC' saltutil.refresh_grains
c40988df-1a86-4ff0-bf8d-e018cc6c55bd:
True
-------------------------------------------
Summary
-------------------------------------------
# of minions targeted: 1
# of minions returned: 1
# of minions that did not return: 0
# of minions with errors: 0
-------------------------------------------
/run # salt -I 'realm:WSC' grains.get model
c40988df-1a86-4ff0-bf8d-e018cc6c55bd:
C891-24X/K9
-------------------------------------------
Summary
-------------------------------------------
# of minions targeted: 1
# of minions returned: 1
# of minions that did not return: 0
# of minions with errors: 0
-------------------------------------------
/run #
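For reference, the manual import I tested could be written as a guarded import along these lines. This is only a sketch of one possible shape for the fix, not the code currently in salt.utils.napalm; the fallback to the older napalm_base package is my assumption about what older installs might still need.

```python
try:
    # Current NAPALM releases expose the exception under napalm.base.
    from napalm.base.exceptions import ConnectionClosedException

    HAS_CONN_CLOSED_EXC_CLASS = True
except ImportError:
    try:
        # Assumption: older installs may still ship the standalone napalm_base package.
        from napalm_base.exceptions import ConnectionClosedException

        HAS_CONN_CLOSED_EXC_CLASS = True
    except ImportError:
        # Neither package is available, so the reconnect clause stays disabled.
        ConnectionClosedException = None
        HAS_CONN_CLOSED_EXC_CLASS = False
```

Because the reconnect clause checks HAS_CONN_CLOSED_EXC_CLASS before calling isinstance(), the None placeholder is never passed to isinstance() when the import fails.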
I created a related bug in the SaltStack project, as I guess that is the right place for it: https://github.com/saltstack/salt/issues/60581 Feel free to close this one if needed; otherwise I will close it once it is resolved in the SaltStack repo.
I'm aiming to complete a PR for this.
Opening this as I believe it to be a bug/undesirable state.
When refreshing grains or running any sync activity on the proxy (saltutil.sync_all), grains populated by Napalm are reset to 'None' if the connection to the device has closed. This occurs when the proxy setting keep_alive:False is applied.
Instead, when a grains refresh is requested, the connection should be restarted and the grains re-collected.
proxy.sls:
Workflow:
Proxy log:
Please note I applied the fix for the bug noted here: https://github.com/saltstack/salt/issues/60025
Versions Report:
Note: This is a lab setup but the same is seen in our production setup with the same version but running on CentOS.
Many thanks!