hypothesis / lms

LTI app for integrating with learning management systems
BSD 2-Clause "Simplified" License
45 stars 14 forks source link

Canvas API code design is poor #6338

Open seanh opened 4 weeks ago

seanh commented 4 weeks ago

@robertknight requested that I write up an issue detailing the issues with the LMS app's Canvas API code. Here are my notes:

There's both a CanvasService and a CanvasAPIService

There's both lms/services.canvas.py::CanvasService ("""A high level Canvas service""") which contains only a single method public_url_for_file() and lms/services/canvas_api/client.py::CanvasAPIClient which contains all the other methods, including many methods that're just as high-level as CanvasService.public_url().

The two should be combined into a single service called CanvasAPIClient for consistency with other services like BlackboardAPIClient, D2LAPIClient, MoodleAPIClient, etc.

Duplicated code

This tends to result in poor results and extra work, as we add features to HTTPService, OAuthHTTPService, and other shared code, but forget to also port those features to Canvas, so Canvas users aren't getting the best experience. And then once we realise, we have to do extra work to backport new features to Canvas. Example: https://github.com/hypothesis/lms/pull/3301

Generic code is tightly coupled to Canvas-specific code

For example the code that handles the requests library to send HTTP requests (a generic function) is intermingled with Canvas-specific URL generation, error handling, and pagination. See BasicClient.send() and BasicClient._send_prepared().

This is true of the OAuth code as well.

Unit test fixtures are found in unexpected places

The unit tests for lms/services/canvas_api/_basic.py::BasicClient are in tests/unit/lms/services/canvas_api/_basic_test.py::TestBasicClient, as you'd expect.

Normally the basic_client fixture (which returns the BasicClient instance to be tested) would be defined in _basic_test.py alongside BasicClient's unit tests, as that file would be the only place in the unit tests where a BasicClient instance is constructed. When another file foo.py uses BasicClient, foo_test.py would test foo.py using a mock of BasicClient.

In fact the basic_client fixture is not to be found in _basic_test.py. Instead, the fixture is located in tests/unit/lms/services/canvas_api/conftest.py where it's available to all tests in tests/unit/lms/services/canvas_api/.

Similarly, the authenticated_client fixture, which is used by AuthenticatedClient's unit tests, is also located in conftest.py where it's available all tests in tests/unit/lms/services/canvas_api/.

Locating fixtures in shared conftest.py files like this is unusual in our tests and makes the tests difficult to follow by making it difficult to trace where fixtures come from. Normally fixtures would only be used in one test file and would be defined within that test file, apart from certain very generic fixtures (like pyramid_request etc) that're defined in the top-level tests/unit/conftest.py file.

Even more confusingly, some individual test classes actually override these shared basic_client and authenticated_client fixtures, replacing them with mocks, whereas other tests in the same file use the real fixtures.

Similarly, there's an http_session fixture (a mock requests.Session object) in conftest.py that's used by all the tests in tests/unit/lms/services/canvas_api/, but requests library usage should be isolated into one file and only that one file's unit tests should need to mock requests.

The reason why the Canvas API tests have located these fixtures in shared conftest.py files is that they actually have integrated tests mixed in with the unit tests...

Integrated tests are mixed in with the unit tests

Throughout our Python codebase we use isolated unit tests in the tests/unit/ folder: every file <APP>/foo/bar.py has its unit tests in a tests/unit/<APP>/foo/bar_test.py. For any file gar.py that uses bar.py, gar.py's unit tests will mock bar.py. So every file's unit tests are isolated from every other file.

This test isolation has lots of advantages, including making unit tests much more resilient, easy to understand, and easy to debug, because to understand bar_test.py you only need to understand the contents of bar_test.py and bar.py, and you can put the details of everything else that bar.py uses out of mind for now.

On top of our mountain of unit tests, we also have a small peak of functional tests in the tests/functional/ folder that test the whole app in integration to make sure that it all works together.

This approach of "lots of isolated unit tests (in tests/unit/) plus a few integrated functional tests (in tests/functional/)" is applied consistently throughout all our projects.

The Canvas API unit tests folder deviates from this approach: it contains a mixture of unit tests and integrated tests. There are tests for CanvasAPIClient (in client_test.py) that also exercise the real AuthenticatedClient and BasicClient, and there are tests for AuthenticatedClient that also exercise the real BasicClient. The tests for CanvasAPIClient rely much more heavily on these integrated tests than on isolated unit tests.

It's surprising to find integrated tests in a folder (tests/unit/) that's for unit tests. It's inconsistent with our usual approach to unit tests. And the reliance on integrated tests to do too much of the testing work (rather than relying mostly on isolated unit tests and just a few functional tests) results in difficult test maintenance problems. For an example of the kind of problems this causes, see: https://github.com/hypothesis/lms/pull/2729 (and just look at how much text it takes me to even explain what is going wrong with the integrated test: the bug in the test code is that long and twisted). https://github.com/hypothesis/via/pull/222#discussion_r469865055 is another example of a similar problem in some tests in Via.

How should it work?

BlackboardAPIClient is a good example of the right way to design a service like this and its unit tests:

seanh commented 4 weeks ago

I'm not sure how useful this is as an actually issue for planning purposes, rather than just as notes for reference. So we may just want to close the issue?