napalm-automation / napalm

Network Automation and Programmability Abstraction Layer with Multivendor support
Apache License 2.0
2.25k stars 553 forks source link

Proposal: A Comprehensive Continuous Integration Toolchain for NAPALM #236

Closed Mierdin closed 7 years ago

Mierdin commented 8 years ago

Proposal: A Comprehensive Continuous Integration Toolchain for NAPALM

Authors:

Created: April 2016 Last updated: April 2016

Abstract

This proposal discusses the contribution of continuous integration tools, scripts, and configuration files to the NAPALM project by the network software team at eBay. It is our desire that the community provides feedback on this proposal prior to these contributions.

As a project, NAPALM is aimed at addressing a very critical problem in network automation: building software to control heterogenous network infrastructure (all with their own unique and "interesting" programmability options) is a complicated matter. A layer of simplification on top of these diverse APIs (or worse in some cases) is very clearly needed to ensure quality in network automation software, and continued network uptime.

Such a critical piece of software needs to have tooling in place that allows any developer to know if the code they're planning to contribute is going to help or hurt the project. eBay's network software team has made use of many of the tools presently used within the OpenStack community to help ensure that their software conforms to quality standards.

Background

eBay's Involvement in OpenStack

It is well-known that eBay is heavily involved with the OpenStack community, and uses the project for numerous internal infrastructure initiatives. As a result, leveraging the available continuous integration tools and configurations present in many of the OpenStack member projects was a no-brainer when it came time to start building network automation software.

Leveraging OpenStack's CI tooling and scripts meant not spending time figuring out how to enforce things like PEP8 compliance, or unit test structure/coverage. These tools were already set up, and at most, some configuration files may need to be changed. This generates immediate value, through enforcement of quality software, without too much effort.

From this process, eBay's network software team has used these tools and configurations for all Python-based software projects (not necessarily related to OpenStack) to ensure a codebase that does not compromise on quality, even with a diverse team of developers.

Contribution of Continuous Integration Toolchain/Configuration to NAPALM

This document proposes that these tools should be leveraged by the NAPALM project, and all code (present and future) is changed where needed to conform to the rules enforced by the CI tools.

The specifics of what rules are ignored/enforced, etc. is beyond the scope of this proposal. Such a discussion can and should be had when a PR is opened, and all can see the configuration files in question.

Central to this CI toolchain is Tox. Tox is used as the central point of logic for all unit testing, tempest/integration testing, code style (including but not limited to PEP8) and more. Tox provides a mechanism to automate continuous integration efforts, all within Python virtualenv sandboxes (which is also automated).

As mentioned before, this tooling is very similar to what you would see in an OpenStack project (see Neutron for an example), but here's a brief summary of what would be added:

It is crucial that these tools are automatically used and enforced when a pull request is opened, in order to ensure they continue to provide value. So, we propose that a build step is added to the build server (TravisCI, etc) that runs the following Tox environments, and requires that they pass before a PR is considered:

All of these changes would be performed on a single NAPALM driver repository initially, as a proof of concept. Once all of the wrinkles are ironed out, the changes would be replicated across the entire project.

Contribution of Code Style Fixes to NAPALM

In integrating the aforementioned tools into NAPALM, eBay's network software team also made some modifications to the NAPALM code itself, in order to conform to the rules enforced by flake8, pylint, and more. These particular changes were not functional in nature, and affected only the aesthetics and readability of the code. All efforts were made to preserve the existing functionality of NAPALM during these changes.

Since using these CI tools would mean that the existing codebase would fail to run Tox successfully until these changes are replicated on the open source NAPALM project, this document also proposes that these style improvements are also contributed - initially to the single driver being used as a PoC, but then elsewhere, where such improvements were performed.

Summary of Proposed Changes

If this proposal is generally favored, here is a list of changes that we are willing to contribute to this effort.

jzohrab commented 8 years ago

Thanks for the detailed write up and effort. Some questions:

Thanks again, and regards.

Mierdin commented 8 years ago

Thanks @jzohrab - replies below:

Thanks for your feedback! Hopefully I answered your questions.

jzohrab commented 8 years ago

"Devs will need to make sure their code is compliant with what the tooling presents. It's really easy to do this" - to simplify onboarding etc, can you make sure that the necessary tools are present in requirements.txt and/or setup.py, and that at least one sentence is added to the developer readme? I think that David wanted to change some things with the development stream ... happy to help work out the kinks on that.

Thanks @Mierdin, will be really interesting to see this in action, a full suite of device tests would be great.

dbarrosop commented 8 years ago

@Mierdin I really like the proposal. Can't wait to see the PR. Although I am not looking forward to the process of cleaning the code so flake8 and pylint are happy ;) I think that it would be easier and less painful if people would be willing to joing for a 'hackathon' dat when we focus on putting all the tooling together and cleaning the code. The branch alternative would work as well but might lead to a lot of conflicts and it might delay putting the tooling in place and I think the sooner we deal with this the better.

@jzohrab don't worry, we will have a dev-requirements.txt or something like that so all the tools are easily installed with pip. It's better to have a separate file as you don't need those development tools in production. Once we have clear understanding of how all the new tools and everything works we can work on a 'guidelines' file for contributing to the project and an 'on-boarding' file for getting started. Something tells me you want to take care of the latter ;)

Mierdin commented 8 years ago

@jzohrab So, since we're using Tox, the dependencies are very easy (since Tox is basically virtualenv automation). We have a requirements.txt file which is automatically installed into each Tox virtualenv, but we also have a test-requirements.txt file that contains all of the test-specific dependencies. As @dbarrosop points out, this is only for dev, so you don't use these dependencies for actually running the software in production. Whenever you do PEP8 checks with tox -epep8 or unit tests with tox -epy27 Tox will automatically install dependencies from both files to ensure it has everything it needs. Since this is all within virtualenvs managed by Tox, it's fairly self-contained.

@dbarrosop We've done as much of those fixes as we were able to do given our other priorities. As much as possible, our PR will contain those fixes. We'll start with the junos driver as our POC for all this stuff and see how it goes. A hackathon might be a great idea for the rest.

dbarrosop commented 8 years ago

Hi! Any update on this?

Mierdin commented 8 years ago

Not yet. Lots of post-Interop catch up. :) Still hoping to get this done soon. Maybe this week if I can find a blank spot.

Will keep you all posted

Mierdin commented 8 years ago

Just a quick update - I am really only fighting a single issue that's standing in the way of me publishing these changes (have been at this stage for a few weeks actually).

In short, something about Tox (not even our specific Tox configuration - as I've tested with a very minimal tox configuration to produce these results as well) is causing one of the nosetests to fail in a really weird way. In this test, pyez is returning the mocked data correctly, but the list of tuples being returned is not in the right order, which really screws with the function being tested. If I run the same test directly with nose, instead of through Tox, the tuples are in the right order, and the unit test passes.

Rather than submit something that doesn't work, I've been spending a lot of time troubleshooting this, without success thus far. I have a lot of other deliverables to worry about, so I've had to withdraw my attention from this from time to time. However, at this point I am considering opening a PR anyways, to get more eyes on the issue. I will let you all know if I get to this point.

Hopefully that makes sense - if you would like to hear more details, let me know.

dbarrosop commented 8 years ago

Could you post the issue? Depending on what it is maybe I can slightly change it to fix it.