@robertknight requested that I write up an issue detailing the issues with the LMS app's Canvas API code. Here are my notes:

There's both a `CanvasService` and a `CanvasAPIService`

There's both lms/services.canvas.py::CanvasService ("""A high level Canvas service""") which contains only a single method public_url_for_file() and lms/services/canvas_api/client.py::CanvasAPIClient which contains all the other methods, including many methods that're just as high-level as CanvasService.public_url().

The two should be combined into a single service called CanvasAPIClient for consistency with other services like BlackboardAPIClient, D2LAPIClient, MoodleAPIClient, etc.

Duplicated code

It contains duplicated code for calling the requests library to send HTTP requests (handling timeouts, error responses, exceptions, validating responses, etc.) This should be using the shared HTTPService which should be the one-and-only place where requests is used. The duplicated code is split between BasicClient.send() and BasicClient._send_prepared().
It contains duplicated code for authenticating requests with OAuth 2 access tokens (calling OAuth2TokenService to get a token, and putting the token in an Authorization header), for getting access tokens and saving them in the DB, and for refreshing access tokens. This should be using the shared OAuthHTTPService. The duplicated code is in AuthenticatedClient.

This tends to result in poor results and extra work, as we add features to HTTPService, OAuthHTTPService, and other shared code, but forget to also port those features to Canvas, so Canvas users aren't getting the best experience. And then once we realise, we have to do extra work to backport new features to Canvas. Example: https://github.com/hypothesis/lms/pull/3301

Generic code is tightly coupled to Canvas-specific code

For example the code that handles the requests library to send HTTP requests (a generic function) is intermingled with Canvas-specific URL generation, error handling, and pagination. See BasicClient.send() and BasicClient._send_prepared().

This is true of the OAuth code as well.

Unit test fixtures are found in unexpected places

The unit tests for lms/services/canvas_api/_basic.py::BasicClient are in tests/unit/lms/services/canvas_api/_basic_test.py::TestBasicClient, as you'd expect.

Normally the basic_client fixture (which returns the BasicClient instance to be tested) would be defined in _basic_test.py alongside BasicClient's unit tests, as that file would be the only place in the unit tests where a BasicClient instance is constructed. When another file foo.py uses BasicClient, foo_test.py would test foo.py using a mock of BasicClient.

In fact the basic_client fixture is not to be found in _basic_test.py. Instead, the fixture is located in tests/unit/lms/services/canvas_api/conftest.py where it's available to all tests in tests/unit/lms/services/canvas_api/.

Similarly, the authenticated_client fixture, which is used by AuthenticatedClient's unit tests, is also located in conftest.py where it's available all tests in tests/unit/lms/services/canvas_api/.

Locating fixtures in shared conftest.py files like this is unusual in our tests and makes the tests difficult to follow by making it difficult to trace where fixtures come from. Normally fixtures would only be used in one test file and would be defined within that test file, apart from certain very generic fixtures (like pyramid_request etc) that're defined in the top-level tests/unit/conftest.py file.

Even more confusingly, some individual test classes actually override these shared basic_client and authenticated_client fixtures, replacing them with mocks, whereas other tests in the same file use the real fixtures.

Similarly, there's an http_session fixture (a mock requests.Session object) in conftest.py that's used by all the tests in tests/unit/lms/services/canvas_api/, but requests library usage should be isolated into one file and only that one file's unit tests should need to mock requests.

The reason why the Canvas API tests have located these fixtures in shared conftest.py files is that they actually have integrated tests mixed in with the unit tests...

Integrated tests are mixed in with the unit tests

Throughout our Python codebase we use isolated unit tests in the tests/unit/ folder: every file <APP>/foo/bar.py has its unit tests in a tests/unit/<APP>/foo/bar_test.py. For any file gar.py that uses bar.py, gar.py's unit tests will mock bar.py. So every file's unit tests are isolated from every other file.

This test isolation has lots of advantages, including making unit tests much more resilient, easy to understand, and easy to debug, because to understand bar_test.py you only need to understand the contents of bar_test.py and bar.py, and you can put the details of everything else that bar.py uses out of mind for now.

On top of our mountain of unit tests, we also have a small peak of functional tests in the tests/functional/ folder that test the whole app in integration to make sure that it all works together.

This approach of "lots of isolated unit tests (in tests/unit/) plus a few integrated functional tests (in tests/functional/)" is applied consistently throughout all our projects.

The Canvas API unit tests folder deviates from this approach: it contains a mixture of unit tests and integrated tests. There are tests for CanvasAPIClient (in client_test.py) that also exercise the real AuthenticatedClient and BasicClient, and there are tests for AuthenticatedClient that also exercise the real BasicClient. The tests for CanvasAPIClient rely much more heavily on these integrated tests than on isolated unit tests.

It's surprising to find integrated tests in a folder (tests/unit/) that's for unit tests. It's inconsistent with our usual approach to unit tests. And the reliance on integrated tests to do too much of the testing work (rather than relying mostly on isolated unit tests and just a few functional tests) results in difficult test maintenance problems. For an example of the kind of problems this causes, see: https://github.com/hypothesis/lms/pull/2729 (and just look at how much text it takes me to even explain what is going wrong with the integrated test: the bug in the test code is that long and twisted). https://github.com/hypothesis/via/pull/222#discussion_r469865055 is another example of a similar problem in some tests in Via.

How should it work?

BlackboardAPIClient is a good example of the right way to design a service like this and its unit tests:

services/blackboard_api/_basic.py::BasicClient uses the shared OAuthHTTPService to do its work.
The Blackboard code has no need to build its own AuthenticatedClient to add OAuth functionality on top of its BasicClient: the BasicClient is already getting OAuth support by using the shared OAuthHTTPService.
Validation schemas are defined in a separate file and unit-tested separately, not mixed in with the API methods and their tests.
The unit tests for BasicClient use a mock of the shared OAuthHTTPService so we're not duplicating OAuthHTTPService's own unit tests or tangling things together.
There's no mention of the requests library anywhere in the Blackboard API code or its tests (and no mock requests). requests is used by the shared OAuthHTTPService and the Blackboard API code uses that service (and mocks of that service in its tests) but is unaware and requests is being used under-the-hood.
The unit tests for the higher-level BlackboardAPIClient use a mock of the lower-level BasicClient so again we're not duplicating BasicClient's own tests or tangling things together. Similarly they also use mocks of the validation schemas.
Overall it's less code; it's simpler; there's no duplication of HTTP and OAuth code; and everything is better isolated and decoupled, which makes it easier to understand and to make changes to.
Consistency with the coding style used throughout the rest of our codebase is also a plus.

hypothesis / lms

Canvas API code design is poor #6338

There's both a `CanvasService` and a `CanvasAPIService`

Duplicated code

Generic code is tightly coupled to Canvas-specific code

Unit test fixtures are found in unexpected places

Integrated tests are mixed in with the unit tests

How should it work?

hypothesis / lms

Canvas API code design is poor #6338

There's both a CanvasService and a CanvasAPIService

Duplicated code

Generic code is tightly coupled to Canvas-specific code

Unit test fixtures are found in unexpected places

Integrated tests are mixed in with the unit tests

How should it work?

There's both a `CanvasService` and a `CanvasAPIService`