Hi! I saw you were asking about unit testing on Twitter and thought it might be useful to see an example of how it can be set up, with a few test cases, so you can experiment and add some more yourself.

I'll have a go at explaining what's going on here, and feel free to ask any questions!

Run locally

This PR adds some unit tests that can be run with pytest. To run locally:

pip install pytest  # install the test runner
pip install -e .  # install xword-dl locally in "editable" mode, only need to do this once
pytest  # run the tests!
# now write more code and/or tests
pytest  # run the tests again!

CI

This PR also has config to run these tests on GitHub Actions. That means future PRs will have the tests run too, which is one of the best things about unit tests: when you fix something, you add a test for it, and then you've got a safety net so that future changes won't cause surprise regressions and break things.

It tests on all of Python 3.5-3.9 (3.4 isn't available), on Ubuntu, macOS and Windows. For example:

https://github.com/hugovk/xword-dl/actions/runs/449400063

The CI also sends something called "coverage" to a free (for open source) service called Codecov. This shows us the lines of code that were actually run when running tests, and can help us see what bits of code aren't yet tested. For example:

https://codecov.io/gh/hugovk/xword-dl/src/d2a55829efdb84963b531e7f98d7981cfd3ee188/xword_dl.py

Tests

The idea of unit tests is to test your code in isolation: you test the "units" of your code, typically functions or classes, rather than running a full end-to-end "integration test". A good thing about them is that they run fast and don't rely on, for example, any network resources. There's still a place for integration tests, but unit tests can be very useful.

File structure

Test files are in a tests/ directory and begin with test_. They could all be one big file, but I've split them up for clarity.

test_xword_dl.py is for testing any global functions like remove_invalid_chars_from_filename() and the BaseDownloader class (in a TestBaseDownloader test class)
test_amuselabs.py is for classes based on AmuseLabsDownloader
test_wsj.py is for the WSJDownloader class
Other files can be added for the other classes

Test structure

I like to keep tests very simple. We want to throw some input at a function, then check the output is as expected. I use "Arrange/Act/Assert" to explicitly group each bit of the test to make it very clear what's going on.

Arrange: here we set up any data or objects we're going to need for the test
Act: here we actually run the code under test and grab the output
Assert: finally, we usually want to check the output is something sensible

Here's one:

import xword_dl


def test_remove_invalid_chars_from_filename():
    # Arrange
    invalid_filename = r'<my> filename:"/\|?*'

    # Act
    filename = xword_dl.remove_invalid_chars_from_filename(invalid_filename)

    # Assert
    assert filename == "my filename"

These test functions should begin with test_ so that pytest can find them.

With the Python unittest module, there's a whole bunch of assert methods like assertEqual(), assertTrue() and assertFalse().

But we're using pytest, and most of the time we can just use a plain assert: assert x == y, assert x, assert not x.
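To make the difference concrete, here's a small sketch of the same check written in both styles. The add() function is made up purely for illustration and isn't part of xword-dl.

```python
import unittest


def add(x, y):
    """Hypothetical function under test (not from xword-dl)."""
    return x + y


# unittest style: tests live in a TestCase subclass and use assert methods
class TestAddUnittest(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(1, 2), 3)


# pytest style: a plain function with a plain assert statement
def test_add_pytest():
    assert add(1, 2) == 3
```

pytest will happily run both of these, so the two styles can live side by side if needed.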

I used one special one called pytest.raises() to check a base downloader method isn't implemented:

import pytest

import xword_dl


class TestBaseDownloader:
    def test_find_solver(self):
        # Arrange
        downloader = xword_dl.BaseDownloader()
        url = "https://example.com"

        # Act / Assert: the with block passes only if the error is raised
        with pytest.raises(NotImplementedError):
            downloader.find_solver(url)

You can see in this one that I also put the test in a class, to collect together the tests for a given base class (similar to Thea's guide). This isn't a must, but can help keep things grouped neatly.

Mocking

In a perfect world, we wouldn't run anyone else's imported code in unit tests at all, and would replace it with mocks or fake stubs. Pragmatically, I often don't worry about that for the stdlib, and we're also using puz here. I think that's probably okay.

One thing I'd avoid testing is anything that makes network calls, for example via requests. We don't want to make slow network calls: they can be flaky and fail if there's a temporary network problem, and it's not good to hit a third-party API/site too often.

A few options:

Don't test stuff hitting the network
Refactor non-network parts into another function; test that
Mock the network request: this involves writing a fake version of, say, requests.get(url) to return an example of what it would return in real life

I think there's enough here already, so I didn't include a mocking example in this PR, but let me know and I can make one!
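As a rough idea of what the mocking option could look like, here's a minimal sketch using unittest.mock from the stdlib. To keep it runnable without extra dependencies it patches urllib.request.urlopen rather than requests.get, but the same pattern should apply to requests. The fetch_text() helper and the URL are hypothetical and not part of xword-dl.

```python
import urllib.request
from unittest import mock


def fetch_text(url):
    """Hypothetical helper: download a URL and return its body as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def test_fetch_text():
    # Arrange: a fake response that can be used in a with block like the real one
    fake_response = mock.MagicMock()
    fake_response.__enter__.return_value.read.return_value = b"fake puzzle data"

    # Act: urlopen is swapped for a mock only inside this with block,
    # so no real network call is made
    with mock.patch(
        "urllib.request.urlopen", return_value=fake_response
    ) as fake_urlopen:
        text = fetch_text("https://example.com/puzzle")

    # Assert: we got the canned data, and the URL was requested exactly once
    assert text == "fake puzzle data"
    fake_urlopen.assert_called_once_with("https://example.com/puzzle")
```

The test stays fast and repeatable because the canned bytes stand in for whatever the real site would return.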

Run locally with coverage

One last thing: you can also check coverage when running locally, using a plugin and some switches:

pip install pytest pytest-cov  # install the test runner and its coverage plugin
pip install -e .  # as before
# run the tests:
# tell it to check coverage of production and tests code
# and show a summary in the terminal
# and generate html output
pytest --cov xword_dl --cov tests --cov-report term --cov-report html
open htmlcov/index.html  # check the results! (macOS; try xdg-open on Linux or start on Windows)

thisisparker / xword-dl

Add some unit tests and CI #26