napalm-automation-community / napalm-aos

NAPALM driver for Alcatel Lucent Enterprise AOS
Apache License 2.0
11 stars 13 forks source link

candidate.cfg is not sufficiently unique if napalm-aos is used in parallel #13

Closed cs-1 closed 6 years ago

cs-1 commented 6 years ago

I use napalm-aos in napalm-ansible. Normally, Ansible will run processes in parallel. However, napalm-aos always uses "candidate.cfg" as default candidate configuration in https://github.com/napalm-automation-community/napalm-aos/blob/1fdb92ee0ee35b272c9a4c0999ca1f2873f10ac3/napalm_aos/aos.py#L92 This name is used e.g. in load_merge_candidate(). When load_merge_candidate is called by two processes running in parallel, using the same "candidate.cfg" name will break the load_merge_candidate process.

I suggest making the default "candidate.cfg" more unique. You already create sufficiently unique filenames in https://github.com/napalm-automation-community/napalm-aos/blob/1fdb92ee0ee35b272c9a4c0999ca1f2873f10ac3/napalm_aos/aos.py#L234 however, since you set "candidate.cfg" in https://github.com/napalm-automation-community/napalm-aos/blob/1fdb92ee0ee35b272c9a4c0999ca1f2873f10ac3/napalm_aos/aos.py#L92 that's the one that is used. One patch would be to replace that line

self.candidate_cfg_file = optional_args.get('candidate_cfg_file', 'candidate.cfg')

with

self.candidate_cfg_file = optional_args.get('candidate_cfg_file', str(uuid.uuid4()))
cs-1 commented 6 years ago

The matter seems to be more complicated. You use self.candidate_cfg_file for different purposes, if I'm correct. You use it as a temporary filename on the host that runs napalm and you also use it as a filename when copying the candidate configuration to the switch (which is never removed -- I'll open another issue for that). I don't know if the above patch breaks the double use of the filename.

cs-1 commented 6 years ago

I've proposed a patch in pull request https://github.com/napalm-automation-community/napalm-aos/pull/20 to fix this problem, however, we can only test the code in limited scenarios so it needs thorough testing.

napalmaos commented 6 years ago

I just have merged your pull request #20 to pull request #22. Please help to test. Thanks.

cs-1 commented 6 years ago

Thanks. I just realized that we missed adding the "ReplaceConfigException". Sorry for the inconvenience.

EDIT: Strange, I do have this exception in my code, however, it seems to have disappeared somehow after the merge.

napalmaos commented 6 years ago

I use candidate_remote_cfg_file to reformat load_replace_candidate and load_merge_candidate in pull request #23. I think we should not use helper function, it will make a mismatch between ReplaceConfigException and MergeConfigException. Please help to test again. Thanks

vmuthukrishnan commented 6 years ago

Hi, Kindly close the issue if it is resolved, otherwise let us know if there any issues still present.