AgPipeline / issues-and-projects

Repository for issues and projects
BSD 3-Clause "New" or "Revised" License

Better code testing coverage for Plotclip algorithms #356

Open Chris-Schnaufer opened 3 years ago

Chris-Schnaufer commented 3 years ago

Task to do: Improve code testing coverage

Reason: Better test coverage improves code quality

Result: Up to 80% code coverage

See https://github.com/AgPipeline/transformer-plotclip/pull/27#issuecomment-727621707 (ignore any comments that are not code-coverage specific, such as those about documentation and Mypy [Mypy is in a separate issue])

julianpistorius commented 3 years ago

Idea: Run docker tests with coverage, instead of trying to create more unit tests

Chris-Schnaufer commented 3 years ago

> Idea: Run docker tests with coverage, instead of trying to create more unit tests

Getting more complete testing coverage (with and/or without Docker) would be great!

julianpistorius commented 3 years ago

@Chris-Schnaufer I might need your help figuring out how to combine GitHub artifacts from multiple jobs. Do you have any examples of this by any chance?

Chris-Schnaufer commented 3 years ago

@julianpistorius I'm not clear on what you're trying to do. It's possible to use files and such left over from a previous job in a downstream job within the same YAML file (not across matrix jobs, I believe); having GitHub store artifacts makes them available to downstream jobs (same caveats as before), with the added benefit that GitHub users can download the artifacts; and it's possible to push artifacts to the GitHub repo, where they are available to everyone and everything.

Can you clarify what it is that you want to do?

julianpistorius commented 3 years ago

@Chris-Schnaufer Thanks for the summary. I need to combine:

A. coverage.txt file (containing XML) from the pytest job (defined in testing_checks.yaml)
B. coverage.xml file generated in the docker_testing job (defined in testing_docker.yaml) (WIP)

I can use the coverage combine command if I can get access to both files.

Then I need to post the combined coverage.xml to codecov.io
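A rough sketch of what that combining step might look like. Note that `coverage combine` merges raw `.coverage` data files rather than XML reports, so this assumes both jobs make their data files available; the file names are illustrative:

```yaml
# Sketch of a combining step; file names are illustrative, and it assumes the
# two jobs shared their raw .coverage data files rather than only the XML.
- name: Combine coverage and write one XML report
  run: |
    pip install coverage
    coverage combine .coverage.pytest .coverage.docker  # merge the two data files
    coverage xml -o coverage.xml                         # single combined report for codecov.io
```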

Chris-Schnaufer commented 3 years ago

@julianpistorius If I'm understanding correctly, you can use actions/upload-artifact@v2 as demonstrated at https://github.com/AgPipeline/transformer-plotclip/blob/932941dbbeb890dcf74d5e80bfe4ac04fdd50581/.github/workflows/testing_docker.yaml#L39 to save the artifacts for the Action run (coverage.txt and coverage.xml). These artifacts are retained for a period of time and then removed.

You can download the artifacts in another job (same YAML file) using download-artifact. See https://docs.github.com/en/actions/guides/storing-workflow-data-as-artifacts#downloading-or-deleting-artifacts for some more info.

Here's an example for uploading codecov results from GitHub actions: https://about.codecov.io/blog/how-to-set-up-codecov-with-c-and-github-actions/. This is a "C" language link, but the GitHub actions part should work for Python (ie: ignore non-GitHub Actions portions).
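To make that concrete, here is a rough sketch of the upload/download/Codecov wiring within a single workflow file; the job names, artifact names, and file paths are illustrative:

```yaml
# Sketch only; job names, artifact names, and file paths are illustrative.
jobs:
  pytest_coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # ... run pytest with coverage and write coverage.txt here ...
      - uses: actions/upload-artifact@v2
        with:
          name: pytest-coverage
          path: coverage.txt

  upload_to_codecov:
    needs: pytest_coverage          # wait for the coverage artifact to exist
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v2
        with:
          name: pytest-coverage
      - uses: codecov/codecov-action@v1
        with:
          file: coverage.txt
```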

Let me know if this is what you're looking for, or not

julianpistorius commented 3 years ago

> You can download the artifacts in another job (same YAML file) using download-artifact.

@Chris-Schnaufer Any idea how to do this between different workflows (different YAML files)?

If this is not easy/possible, what do you think about combining the jobs from testing_checks.yaml and testing_docker.yaml into one YAML file?

julianpistorius commented 3 years ago

@Chris-Schnaufer Here's my plan:

Let me know what you think.

Chris-Schnaufer commented 3 years ago

> > You can download the artifacts in another job (same YAML file) using download-artifact.
>
> @Chris-Schnaufer Any idea how to do this between different workflows (different YAML files)?
>
> If this is not easy/possible, what do you think about combining the jobs from testing_checks.yaml and testing_docker.yaml into one YAML file?

One way to do this is to put the resulting Docker image onto the GitHub Container Registry (and perhaps use other features as well). This would require some coordination between YAML files; I'm going to look into this and update this comment.
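For reference, a rough sketch of pushing a freshly built image to the GitHub Container Registry from a workflow step; the image name and tag are illustrative:

```yaml
# Sketch only; the image name and tag are illustrative.
- uses: docker/login-action@v1
  with:
    registry: ghcr.io
    username: ${{ github.actor }}
    password: ${{ secrets.GITHUB_TOKEN }}
- name: Push the image built earlier in this job
  run: |
    docker tag plotclip:test ghcr.io/agpipeline/plotclip:test
    docker push ghcr.io/agpipeline/plotclip:test
```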

julianpistorius commented 3 years ago

> One way to do this is to put the resulting Docker image onto the GitHub Container Registry

The problem isn't getting access to the docker image. It's running the docker tests while generating a code coverage report, and then combining it with the coverage report from the pytest workflow.

It seems that there is no easy way to make one workflow (YAML file) depend on the output of another workflow. Hence combining the two workflows.

Chris-Schnaufer commented 3 years ago

I see. Perhaps the solution would be to commit to a specially named branch and have a workflow that is triggered by the push of file(s) to that branch which then performs the merge?
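Something along these lines, as a sketch (the workflow name and branch name are illustrative):

```yaml
# Sketch of a workflow that only fires on pushes to a dedicated branch;
# the workflow name and branch name are illustrative.
name: combine_coverage
on:
  push:
    branches:
      - coverage-results
jobs:
  merge_reports:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # ... combine the committed coverage files and upload them here ...
```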

julianpistorius commented 3 years ago

> commit to a specially named branch and have a workflow that is triggered by the push of file(s) to that branch which then performs the merge?

That seems to introduce more complexity than my plan, but I'm open to exploring it. Do you have an example of this in action? (No pun intended! 😆 )

Question: Is there a reason why you'd prefer to keep the workflow files separate?

Chris-Schnaufer commented 3 years ago

> Question: Is there a reason why you'd prefer to keep the workflow files separate?

They run faster overall & it seemed like a logical separation. No other reasons. We should make changes as we want, as long as they make sense (a somewhat relative concept)

> That seems to introduce more complexity than my plan, but I'm open to exploring it. Do you have an example of this in action? (No pun intended! 😆 )

I haven't done that yet. Check out https://stackoverflow.com/questions/57921401/push-to-origin-from-github-action or https://github.com/marketplace/actions/github-push#usage for hints (or use the action)

I did run across the following which may help with what you're trying to do: https://docs.github.com/en/rest/reference/checks#runs. You could have a script that waits (also check https://github.com/marketplace/actions/wait-on-check).
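A sketch of what using wait-on-check might look like; the check name is illustrative, and the version pin and input names should be verified against the action's README:

```yaml
# Sketch only; check the action's README for the current version and input names.
- uses: lewagon/wait-on-check-action@v1.3.1
  with:
    ref: ${{ github.sha }}
    check-name: docker_testing          # illustrative: the check produced by the other workflow
    repo-token: ${{ secrets.GITHUB_TOKEN }}
    wait-interval: 20                   # seconds between polls
```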

julianpistorius commented 3 years ago

Thanks Chris! That wait-on-check action looks promising.

julianpistorius commented 3 years ago

@Chris-Schnaufer After sleeping on it I remembered that individual jobs within a workflow run in parallel by default:

https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobs

Here is my modified plan, based on the previous one above:

This means that all the jobs will run in parallel, except for combine_and_upload_coverage, which should be very small and fast. This modified single-workflow approach gets us the same benefits as multiple workflows, but with fewer moving parts.
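Roughly, the job wiring could look like this (job names and step contents are illustrative):

```yaml
# Sketch of the job wiring only; job names and step contents are illustrative.
jobs:
  pytest_checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # ... run pytest with coverage and upload the coverage data as an artifact ...
  docker_testing:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # ... run the Docker tests with coverage and upload the data as an artifact ...
  combine_and_upload_coverage:
    needs: [pytest_checks, docker_testing]   # runs only after both coverage jobs finish
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v2    # with no name given, downloads all artifacts
      # ... combine the coverage files and push the result to codecov.io ...
```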

What do you think about this new plan?

Chris-Schnaufer commented 3 years ago

@julianpistorius I think this looks pretty good. I do have a fundamental question, and some other questions as well.

Fundamental question: Why test code coverage in a Docker image as opposed to outside of a Docker image? If there's something special about the Docker image, is it possible to add that to the YAML script(s) so that this complexity is avoided? The code inside of the Docker image should match what's outside (or the process is very broken), so the code coverage should be the same; in that case, couldn't we just do a code comparison if we needed to confirm sameness? (Sorry, more than one question.)

Comments: I see the separation of the two YAML files as addressing separate testing concerns: with and without Docker, unit testing and application testing. There are tests that are common between the Docker and non-Docker use cases, and some that are Docker-only. I'm not sure whether this is an artificial distinction or not, given how the YAML files are implemented.

Other questions

> Combine testing_docker.yaml and testing_checks.yaml workflow files

The code currently runs outside of a Docker environment, as well as with Docker. This approach would no longer test outside of Docker (if I'm reading this right - the quoted line above and following lines). If so, we're now removing a test case that we used to support. I would prefer to build on what we have if possible: Using the wait-on-check approach, is it possible to let a Docker image be built first and then tested in testing_checks.yaml?

> Add a job called combine_and_upload_coverage

Are you thinking of using the GitHub Actions feature where jobs can be dependent upon other jobs? I assume so, but thought I'd ask. Example: https://github.com/AgPipeline/transformer-plotclip/blob/932941dbbeb890dcf74d5e80bfe4ac04fdd50581/.github/workflows/testing_docker.yaml#L73

julianpistorius commented 3 years ago

> why test code coverage in a Docker image as opposed to outside of a Docker image

Because I want to generate code coverage for realistic real-world usage (not just for unit tests). Hence running integration tests (Docker tests) with coverage enabled.

> This approach would no longer test outside of Docker (if I'm reading this right - the quoted line above and following lines).

No, it would still test code outside of Docker. Everything would keep working as it does at the moment, but we would have one workflow for all tests (inside and outside Docker).

> Are you thinking of using the GitHub Actions feature where jobs can be dependent upon other jobs?

Exactly.

Chris-Schnaufer commented 3 years ago

@julianpistorius Makes sense. Thanks!