Closed cisaacstern closed 1 year ago
A bit more brainstorming before opening a PR:

- The `deployment_status` event looks like the best option for the GitHub Workflow trigger. This way, when Heroku updates the Review App's `deployment_status.state` to `success`, we know we have a functioning instance to call this test on.
- If we rely on the `deployment_status` event alone, then we are potentially also deploying a Dataflow job at each one of these Review App builds. Gating this on a `build-review-app` label on the PR would mirror the PR-label-based developer UI for https://github.com/pangeo-forge/pangeo-forge-runner/pull/52.

So brainstorming a rough work plan for the PR that adds this test:
- A `deployment_status` trigger when the Review App is successfully built/run.
- A recipe test comment on an open PR in the `pforgetest` org, either simulated or actually posted. The latter is easy enough, and probably preferable for sake of end-to-end completeness.
- A set of apps in `pforgetest`, and then some logic to dynamically assign an unused one of these apps to the Heroku Review App on creation.
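The Review-App-triggered workflow described above might look roughly like the following GitHub Actions fragment. The workflow name, job name, and trigger script are hypothetical placeholders, not existing repo contents; only the `deployment_status` trigger and its payload fields are standard GitHub Actions features.

```yaml
# Hypothetical sketch: run the Dataflow integration test once Heroku
# reports a successful Review App deployment.
name: dataflow-integration-test

on:
  deployment_status

jobs:
  test-review-app:
    # Only proceed once Heroku marks the Review App deployment successful
    if: github.event.deployment_status.state == 'success'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Trigger recipe test against the Review App
        run: python scripts/trigger_recipe_test.py  # hypothetical script
        env:
          # The deployment_status payload carries the deployed app's URL
          REVIEW_APP_URL: ${{ github.event.deployment_status.target_url }}
```

Filtering on `deployment_status.state == 'success'` inside the job is necessary because the `deployment_status` trigger fires for every state transition (`pending`, `failure`, etc.), not just successful deploys.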
Recent experiences with failed dataflow job deployment (#220) have demonstrated the need for integration testing. Last week I added a Dataflow integration test to `pangeo-forge-runner`: https://github.com/pangeo-forge/pangeo-forge-runner/pull/52. This gives us a baseline confidence that a given release of `pangeo-forge-runner` works in the test environment used there. It's possible that our application container environment may introduce unique client/dataflow incompatibilities, and/or that the way we invoke `pangeo-forge-runner` (the literal command we use, but also the process context: subprocess, network call to external service, etc.) may introduce problems. In fact, as discussed in #220, it seems that we are facing some type/combination of these issues currently.

A dataflow integration test here could:
- Call `pangeo-forge-runner` directly, but within the app container. This would test the client/dataflow incompatibility point mentioned above, but it would not get at the way our application actually invokes `pangeo-forge-runner` in production. The brainstorm I started in #223 reflects this idea. I've come to feel that this is not sufficiently realistic, and that a better approach would be to...
- Call `pangeo-forge-runner` indirectly, by sending a payload to a running instance of the application, which triggers the actual code path used in production to deploy jobs to Dataflow. This approach is a bit more involved to set up, but it is ultimately the more confidence-inspiring and realistic test, as it gets much closer to replicating the production scenario.

Assuming we do want to pursue the latter, more realistic, option, I can envision two ways of getting an instance of the application started to test:
- Use `docker-compose` to stand up a production instance within GitHub Actions, similar to what we already do in the docker tests: https://github.com/pangeo-forge/pangeo-forge-orchestrator/blob/main/.github/workflows/test-docker.yaml. In fact, such an approach could simply build off of that same workflow. The advantage to this approach is that we control the setup of the stack. The disadvantage is that it's less reflective of the actual production environment.
- Use a Heroku Review App, which runs the application in an environment much closer to production.

I'll start a PR to address this.
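To sketch what the indirect, payload-based option above could look like: the snippet below assembles a GitHub-style webhook delivery, signed the way GitHub signs webhook payloads (an HMAC-SHA256 digest of the body in the `X-Hub-Signature-256` header). The endpoint path, event type, and payload fields are illustrative assumptions, not the orchestrator's actual schema.

```python
# Sketch of the "indirect" integration test: deliver a signed,
# GitHub-style webhook payload to a running instance of the application,
# exercising the same code path production uses to deploy Dataflow jobs.
import hashlib
import hmac
import json


def sign_payload(secret: str, body: bytes) -> str:
    """Compute a GitHub-style X-Hub-Signature-256 header value."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def build_webhook_request(base_url: str, secret: str, payload: dict) -> dict:
    """Assemble the URL, headers, and body for a simulated webhook delivery."""
    body = json.dumps(payload).encode()
    return {
        "url": f"{base_url}/github/hooks/",  # hypothetical endpoint path
        "headers": {
            "X-GitHub-Event": "issue_comment",  # illustrative event type
            "X-Hub-Signature-256": sign_payload(secret, body),
            "Content-Type": "application/json",
        },
        "body": body,
    }


if __name__ == "__main__":
    # In the real test, this request would be POSTed (e.g. with httpx or
    # requests) to the Review App or docker-compose instance under test.
    req = build_webhook_request(
        "http://localhost:8000",
        "test-secret",
        {"action": "created", "comment": {"body": "/run recipe-test"}},
    )
    print(req["headers"]["X-Hub-Signature-256"])
```

Because the payload is signed with the same shared secret the application is configured with, the request passes the handler's signature verification and exercises the production code path end to end, rather than a test-only entry point.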