Spurious warnings about DbContext health checks failing in FwHeadless container

Describe the bug I randomly get warnings like the following in the FwHeadless logs:

fail: Microsoft.EntityFrameworkCore.Database.Connection[20004]
      An error occurred using the connection to database 'lexbox' on server 'tcp://db:5432'.
fail: Microsoft.Extensions.Diagnostics.HealthChecks.DefaultHealthCheckService[103]
      Health check LexBoxDbContext with status Unhealthy completed after 1289.0786ms with message 'A task was canceled.'
      System.Threading.Tasks.TaskCanceledException: A task was canceled.
         at Npgsql.TaskTimeoutAndCancellation.ExecuteAsync(Func`2 getTaskFunc, NpgsqlTimeout timeout, CancellationToken cancellationToken)
         ...
         at LexData.LexBoxDbContext.HeathCheck(CancellationToken cancellationToken) in /src/backend/LexData/LexBoxDbContext.cs:line 38
         at Microsoft.Extensions.Diagnostics.HealthChecks.DbContextHealthCheck`1.CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken)
info: Microsoft.Hosting.Lifetime[0]
      Application is shutting down...

Interestingly, although the logs say that the application is shutting down, it is in fact running (or another pod has just started but I'm still seeing the logs form the pod that shut down). At any rate, hitting the /sync API with a POST request works correctly, and the logs then show the response to that POST request so it's unlikely that I'm looking at logs from an older pod.

I notice that the failing LexBoxDbContext check looks like this:

    public async Task<bool> HeathCheck(CancellationToken cancellationToken)
    {
        //this will throw if we can't connect which is a valid health check response.
        await Users.CountAsync(cancellationToken);
        return true;
    }

It makes sense for most exceptions from Users.CountAsync to count as health check failures. But according to https://stackoverflow.com/questions/60474213/asp-net-core-healthchecks-randomly-fails-with-taskcanceledexception-or-operation, the health check's cancellation token will throw TaskCanceledException if the HTTP request for /api/healthz was aborted; this can happen if a second health check runs before the first one is completed. Which explains why I usually see this after I start up a fresh FwHeadless container.

I think we should be catching TaskCanceledException and returning true from the LexBoxDbContext health check if the task was canceled.

To Reproduce Can't reproduce consistently, as it's timing dependent. But I most often see this after running tilt up or task up for the first time.

Expected behavior Aborted HTTP requests to /api/healthz should not result in an unhealthy status check result: we should catch TaskCanceledException and return true or Healthy from the health-check method so that ASP.NET doesn't log spurious health check failures.

sillsdev / languageforge-lexbox

Spurious warnings about DbContext health checks failing in FwHeadless container #1211