ooni / ooni.org

The ooni.org homepage and all cross organisational issues
https://ooni.org

Increase quality of OONI software through testing and quality assurance #349

Open hellais opened 4 years ago

hellais commented 4 years ago

Since the launch of the OONI Probe mobile apps, our user base has grown significantly. This has increased the pressure to quickly evolve our measurement engine and to continuously improve our user interface. This pressure has resulted in several bugs and issues, such as a memory corruption issue on Android devices and a bug where we were submitting reports to the OONI collector when we shouldn't. At the same time, the increased number of users has put additional pressure on our operations, because we need to be very careful when rolling out new versions of the backend: any small issue could lead to losing a large volume of measurements. This lack of confidence in deploying new versions of software impedes our ability to quickly evolve our software components and leads to stalling pull requests (such as our deployment of a simpler protocol for submitting measurements to the OONI collector).

We recognized that these bugs were not just random glitches, but rather symptoms that, as our user base grows, we need to organically adapt our tools and practices. The first step towards this goal has been to start consolidating our codebases around the languages and tools that we know best. This is one of several reasons behind the effort to write more Go code for running measurements and more JavaScript code for the UX. The choice of Go, in particular, was motivated by the fact that we were already using Go for several backend services, as well as for implementing the new OONI CLI.

Go and JavaScript are arguably better suited to writing unit and integration tests, offer memory safety, and are the languages our team knows best. Yet more work is still required to write higher quality code.

Regarding the measurement engine, we aim for 100% unit test coverage of the codebase, and for building an environment in which we can automatically perform quality assurance of new OONI releases. We aim to ensure that whenever a developer touches the code, they also see their code from the point of view of its first consumer: the tests. This approach is similar to test driven development, where in theory one should have 100% coverage, and is modelled on the approach used by the Google developers at Measurement Lab.
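As a rough illustration of what this looks like in practice, here is a minimal, table-driven Go unit test in the style we are aiming for; the `parseASN` helper and its package are hypothetical stand-ins, not code from the actual measurement engine. Every branch of the function is exercised, so `go test -cover` reports full coverage for it.

```go
package engine

import "testing"

// parseASN is a hypothetical helper standing in for a small piece of the
// measurement engine: it parses strings such as "AS30722" into a number.
func parseASN(s string) (uint, bool) {
	if len(s) < 3 || s[:2] != "AS" {
		return 0, false
	}
	var asn uint
	for _, c := range s[2:] {
		if c < '0' || c > '9' {
			return 0, false
		}
		asn = asn*10 + uint(c-'0')
	}
	return asn, true
}

// TestParseASN exercises every branch of parseASN, so that this file alone
// yields 100% coverage of the helper under `go test -cover`.
func TestParseASN(t *testing.T) {
	tests := []struct {
		input string
		want  uint
		ok    bool
	}{
		{"AS30722", 30722, true}, // happy path
		{"AS", 0, false},         // too short
		{"XX30722", 0, false},    // missing AS prefix
		{"AS30x22", 0, false},    // non-digit character
	}
	for _, tc := range tests {
		got, ok := parseASN(tc.input)
		if got != tc.want || ok != tc.ok {
			t.Fatalf("parseASN(%q) = %v, %v; want %v, %v",
				tc.input, got, ok, tc.want, tc.ok)
		}
	}
}
```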

The objective of performing quality assurance of our network tests is perhaps even more relevant. This objective calls for improving ooni/jafar, an environment where we can simulate specific censorship policies, so as to ensure that our experiments measure the specific censorship conditions that we aim to detect. As part of this workflow, we will also consider integrating other tools designed to confuse OONI software, such as mhinkie/ooni-detection. The overall goal of this set of activities is to be able to assert that a specific release candidate of OONI is fully delivering on its promises. It is worth noting that, as mentioned above, this is not a theoretical problem: we have already experienced several defects in our Go measurement engine (for example, we currently have missing keys in our output when running the Facebook Messenger test in some environments).
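To make the idea of asserting on a release candidate's output more concrete, here is a sketch, in Go, of the kind of check such a QA environment could run: parse the test_keys produced by a measurement and fail if promised keys are missing (the class of defect we saw with Facebook Messenger). The key names, the helper, and the inline JSON blob are illustrative assumptions, not the actual QA code.

```go
package qa

import (
	"encoding/json"
	"testing"
)

// requiredKeys lists keys that a release candidate must always emit for a
// given experiment; the names below are illustrative, not the full set
// mandated by the spec.
var requiredKeys = []string{
	"facebook_tcp_blocking",
	"facebook_dns_blocking",
}

// checkTestKeys fails the test if any required key is absent from the
// test_keys section of a measurement.
func checkTestKeys(t *testing.T, rawTestKeys []byte) {
	var keys map[string]interface{}
	if err := json.Unmarshal(rawTestKeys, &keys); err != nil {
		t.Fatalf("cannot parse test_keys: %v", err)
	}
	for _, name := range requiredKeys {
		if _, found := keys[name]; !found {
			t.Errorf("missing required key: %s", name)
		}
	}
}

func TestFacebookMessengerKeys(t *testing.T) {
	// In a real QA pipeline this blob would come from running the release
	// candidate inside the simulated-censorship environment.
	rawTestKeys := []byte(`{"facebook_tcp_blocking": false, "facebook_dns_blocking": false}`)
	checkTestKeys(t, rawTestKeys)
}
```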

Moreover, the availability of a testing environment will allow us to implement new forms of censorship and then measure how our codebase reacts (we can jokingly call this way of writing software CDD, i.e., Censorship Driven Development). Meanwhile, the 100% coverage will allow us to quickly evolve our codebase, increasing our confidence that we will be able to detect mistakes while doing so. To conclude, by combining these two improvements we are confident that we can evolve the measurement engine more quickly and more safely.
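For a flavour of what censorship driven development could look like, the following Go sketch first simulates one interference technique (an HTTP block page served by a local test server) and then asserts that a detection helper flags it; `detectBlockpage` and the fingerprint string are hypothetical, not part of the real engine.

```go
package cdd

import (
	"io/ioutil"
	"net/http"
	"net/http/httptest"
	"strings"
	"testing"
)

// detectBlockpage is a stand-in for engine logic that compares the body we
// received against known block-page fingerprints.
func detectBlockpage(body string) bool {
	return strings.Contains(body, "This website has been blocked")
}

func TestDetectsSimulatedBlockpage(t *testing.T) {
	// Simulated censor: every request receives a block page.
	censor := httptest.NewServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			w.Write([]byte("<html>This website has been blocked</html>"))
		}))
	defer censor.Close()

	resp, err := http.Get(censor.URL)
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		t.Fatalf("cannot read body: %v", err)
	}
	if !detectBlockpage(string(body)) {
		t.Fatal("simulated block page was not detected")
	}
}
```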

The same strategy highlighted above will also be used for the OONI Probe mobile and desktop apps. We will seek to increase our code coverage using unit tests, with all the benefits described above, translated to the apps domain. We will also aim to run more end-to-end tests, increasing our reliance on tools such as appium, so that we have more confidence that specific user workflows continue to work as intended as we refactor the application.

Of course, the strategy described above goes hand in hand with running continuous integration checks on every commit. We are already running these checks, mostly with Travis CI and Circle CI, and we will be running more tests as we improve our test coverage.
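By way of example, a Travis CI job of the kind described above could be as small as the following sketch (the pinned Go version is illustrative, not the one actually used by our repositories):

```yaml
language: go
go:
  - "1.14"
script:
  # Run all tests with the race detector and collect coverage data.
  - go test -race -coverprofile=coverage.txt -covermode=atomic ./...
```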

We have experienced fewer bugs in our backend services, but several codebases have become “haunted graveyards” that no savvy developer would ever touch, out of fear of introducing breaking changes. The reason for this situation is that deployment (and rollback) of backend code is complex and requires manual testing. We therefore plan to reorganise our services so that it is possible to deploy testing code on specific machines, as well as to direct a small portion of production traffic to such machines to see how they perform. This will be implemented hand in hand with continuous integration driven automatic deployments when commits are pushed to specific branches. The increased amount of testing, and easier deployment, will allow us to increase our confidence that we can quickly and safely implement changes to the OONI backends as well. Thus, we can work on smaller changesets, which are easier to review (and easier to roll back if needed).
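As a rough sketch of the traffic-splitting idea (not the actual OONI deployment code), the following Go program proxies roughly 5% of incoming requests to a canary backend running the release candidate and the rest to the stable backend; the hostnames and the percentage are illustrative assumptions.

```go
package main

import (
	"log"
	"math/rand"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		log.Fatal(err)
	}
	return u
}

func main() {
	// Hypothetical internal hostnames for the stable and canary backends.
	stable := httputil.NewSingleHostReverseProxy(mustParse("http://stable.internal:8080"))
	canary := httputil.NewSingleHostReverseProxy(mustParse("http://canary.internal:8080"))

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Send roughly 5% of requests to the canary, so that a misbehaving
		// release candidate only affects a small share of measurements and
		// can be rolled back quickly.
		if rand.Intn(100) < 5 {
			canary.ServeHTTP(w, r)
			return
		}
		stable.ServeHTTP(w, r)
	})
	log.Fatal(http.ListenAndServe(":80", nil))
}
```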

jbonisteel commented 9 months ago

@hellais can we close this? I understand this is left over from a previous grant? Or do we still need to keep this around?