EVerest / everest-demo

EVerest demo: Dockerized demo with software in the loop simulation
Apache License 2.0

GitHub Actions workflow for running Automated Tests #62

Open MukuFlash03 opened 2 weeks ago

MukuFlash03 commented 2 weeks ago

The eventual goal is to have minimal manual intervention when testing the latest images. Essentially, we want to run the automated tests on every rebuild of the image so we know if the images are good enough to merge.

There are two scripts involved in running the demo tests:

Another requirement is that we need to run tests on different operating systems, hence the GitHub Actions workflow must run on at least Ubuntu (for now) and macOS later on. macOS with M1 chips has faced issues in the past and could be tricky to get configured.


Initial PR by shankari. Manual testing steps highlighted in this PR.

MukuFlash03 commented 2 weeks ago

Related Commits / PRs

Listing some possibly related commits from the codebase history that could be helpful.

Note: These changes were not introduced by me.

Demo Automated Tests: PR link


“pip install pytest”: Apr 15, 2024 (1-3)

  1. Added this line
  2. Removed this line
  3. Added this line

tests/startup_tests.py: Dec 6, 2023 (1-2) ; Apr 15, 2024 (3)

  1. Added: https://github.com/EVerest/everest-demo/commit/381a3cec855339cd6332cc751c0f1e9594101276
  2. Modified: https://github.com/EVerest/everest-demo/commit/954b5d2dc44237e1bdf93eae3e5c1004fe268526
  3. Removed: https://github.com/EVerest/everest-demo/pull/39/commits/8b63180a87d8305eaf231205f2765017027ad7e0

Scripts updated to use file path “ext/source/”: commit link


Added Automated Tests related PR link

Automated test docker compose file, bash script, run-test.sh: Dec 6, 2023 (1)


Note: These changes were not introduced by me.

Some more commits made prior to this PR:

MukuFlash03 commented 2 weeks ago

Starting off with getting demo-automated-testing.sh to run


Docker setup process in Ubuntu vs. in macOS

Specific details available here in the official list of runner images.


Where does startup_tests.py come from? It isn't in this repo.

MukuFlash03 commented 2 weeks ago

Timeout causing jobs to get cancelled


Found a related commit here that highlights this issue; the main reason for using a timeout is that the docker compose scripts run without the -d flag, so the process stays active in the terminal.

So for now, I am ignoring this and will come back to it later.

MukuFlash03 commented 2 weeks ago

demo-automated-testing.sh passed for Ubuntu


Here is the successful workflow run.

This is the output seen in the workflow logs:

manager-1      | ============================== 2 passed in 18.61s ==============================

I hope this is the right indicator that the current automated tests defined by the test script have passed.

MukuFlash03 commented 2 weeks ago

Added macOS runner to run automated tests


The macOS Docker setup finally passed, using the same custom GitHub action used in this PR. I did have to change the macOS version from macos-latest to macos-13, as the GitHub action to set up Docker wasn't compatible with the latest macos-14.


Timeout increased to 60 minutes

As seen in this commit as well, when Shankari changed the timeout to 30 minutes, it was because the demo-iso15118-2-ac-plus-ocpp.sh script itself was taking close to 30 minutes.
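For reference, the relevant job settings end up looking roughly like this (macos-13, the 60-minute timeout, and the pull-and-run-tests job name are taken from this thread; the ubuntu-latest label and the overall layout are an illustrative guess, not the exact workflow file):

    jobs:
      pull-and-run-tests:
        strategy:
          matrix:
            # macos-13 is pinned because the Docker setup action is not yet
            # compatible with macos-14 (macos-latest)
            os: [ubuntu-latest, macos-13]
        runs-on: ${{ matrix.os }}
        # Image download and extraction alone can take close to 30 minutes on macOS
        timeout-minutes: 60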

MukuFlash03 commented 2 weeks ago

Observations for Workflow Runs with macOS


Test 1:

This macOS run passed: the pytest tests passed successfully. However, the download and extraction of the Docker image itself takes close to 30 minutes, so I had to increase the timeout to 60 minutes.


Test 2:

For details on triggering events on PR vs non-PR (or on push) events, see this comment.

  1. Currently a PR is open into my local branch, automate-tests-merge; hence pushes trigger the workflow for this branch as well. cicd.yaml run.

  2. But since this is a PR, the e2etest.yaml correctly skipped the jobs when triggered by a PR: e2etest.yaml run.

    jobs:
      pull-and-run-tests:
        if: github.event.inputs.parent_workflow == 'cicd.yaml' && github.event.inputs.event_name != 'pull_request'
        runs-on: ${{ matrix.os }}
  3. Then added a workflow dispatch so that e2etest.yaml is triggered whenever the cicd.yaml workflow completes, using the “needs” attribute to ensure that the dispatch job runs only when the docker job completes successfully in the cicd workflow (see the sketch after this list). cicd.yaml run.

  4. Successfully triggered via the workflow run that itself was triggered by a non-PR event (push commit). e2etest.yaml run.
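A rough sketch of how that wiring could look (the trigger-e2e job name and the exact gh invocation are my own illustration; the docker job name and the input names come from the comments and snippet above):

    # In cicd.yaml
    jobs:
      trigger-e2e:
        # Run only after the docker build job has completed successfully
        needs: docker
        runs-on: ubuntu-latest
        permissions:
          actions: write   # needed for the token to dispatch another workflow
        steps:
          - name: Dispatch e2etest.yaml
            env:
              GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
            run: |
              gh workflow run e2etest.yaml \
                --repo "$GITHUB_REPOSITORY" \
                -f parent_workflow=cicd.yaml \
                -f event_name=${{ github.event_name }}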

However in this macOS run, 1 test failed in the pytest startup_tests. Usually there are two tests and both pass.

1 test failed with error message:

manager-1      |         if status == None or len(status) == 0:
manager-1      | >           raise TimeoutError("Timeout while waiting for EVerest to start")
manager-1      | E           TimeoutError: Timeout while waiting for EVerest to start
manager-1      | 
manager-1      | /usr/lib/python3.10/site-packages/everest/testing/core_utils/everest_core.py:198: TimeoutError

This failure error has been observed before as seen here.


Test 3:

Testing again by pushing changes. A different error was observed in the Docker setup process for macOS in this run.


Test 4:

Testing again in this run, specifically to observe the macOS runner. Docker setup was successful.

However, yet again 1 test failed and 1 test passed, but the error message was different from the error seen in the previous run:

manager-1      | >       assert probe.test(20)
manager-1      | E       assert False
manager-1      | E        +  where False = <bound method ProbeModule.test of <startup_tests.ProbeModule object at 0x7ebd56d50a90>>(20)
manager-1      | E        +    where <bound method ProbeModule.test of <startup_tests.ProbeModule object at 0x7ebd56d50a90>> = <startup_tests.ProbeModule object at 0x7ebd56d50a90>.test
manager-1      | 
manager-1      | core_tests/startup_tests.py:101: AssertionError

MukuFlash03 commented 2 weeks ago

Queries before moving onto demo-iso15118-2-ac-plus-ocpp.sh


1. Which version of the bash script parameters to use?

Shankari’s command for the AC script was valid as per this README.md.

🚨 Basic and ISO 15118-2 AC Charging with OCPP 201 CSMS (MaEVe) ⚡: 
$ curl https://raw.githubusercontent.com/everest/everest-demo/main/demo-iso15118-2-ac-plus-ocpp201.sh | bash

But the latest code in the EVerest/everest-demo parent repo consolidates the demos into a single script, which was added in this commit.

The bash command takes different arguments to choose which demo to run; a hypothetical sketch of how such a flag is typically parsed follows the command below.

🚨 Basic and ISO 15118-2 AC Charging with OCPP 2.0.1 CSMS (MaEVe Security Profile 1) ⚡: 
$ curl https://raw.githubusercontent.com/everest/everest-demo/main/demo-iso15118-2-ac-plus-ocpp.sh | bash -s -- -1
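For context, here is that hypothetical sketch of how such a flag might be parsed inside a single demo script; the real demo-iso15118-2-ac-plus-ocpp.sh may use different options and variable names:

    #!/usr/bin/env bash
    # Select the demo variant from a short option such as -1 (passed through `bash -s -- -1`)
    DEMO_VARIANT=""
    while getopts "123" opt; do
      case $opt in
        1) DEMO_VARIANT="ocpp201-sp1" ;;   # OCPP 2.0.1 CSMS (MaEVe Security Profile 1)
        2) DEMO_VARIANT="ocpp201-sp2" ;;   # illustrative: other security profiles
        3) DEMO_VARIANT="ocpp201-sp3" ;;
        *) echo "Usage: $0 [-1|-2|-3]" >&2; exit 1 ;;
      esac
    done
    echo "Selected demo variant: ${DEMO_VARIANT:-none}"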

2. Should all the command versions run in the workflow?

But these scripts run Docker containers without the -d flag, which causes them to stay up and running. Even if the -d flag were supplied, they might use the same ports, so the containers would need to be stopped before executing other scripts.

MukuFlash03 commented 2 weeks ago

Concerns with the current process for image build and push


My understanding of the requirements of this PR w.r.t. automated testing

My understanding of the process for a non-PR or normal push event

What's my concern?

Summarized concern

What should happen for latest image with latest code from PR to be tested?


Observations from actions history

  1. I’ve even looked at the GitHub Actions workflow run history and can verify that for PRs, the push value is set to false in the docker/build-push-action step:

     Run docker/build-push-action@v5
       with:
         context: ./manager
         push: false
         tags: ghcr.io/everest/everest-demo/manager:0.0.16
           ghcr.io/everest/everest-demo/manager:latest

  2. This results in images being built but not pushed to the GHCR, as can be observed in the Build and Push step in the cicd.yaml workflow (one way to address this is sketched below):

     WARNING: No output specified with docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
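For reference, one common way to address that warning while still skipping the registry push on PRs is to push conditionally and otherwise load the built image into the runner's Docker daemon. This is purely illustrative; the actual cicd.yaml may be wired differently:

    - uses: docker/build-push-action@v5
      with:
        context: ./manager
        # Push to GHCR only for non-PR events
        push: ${{ github.event_name != 'pull_request' }}
        # For PR builds, load the image into the local Docker daemon so later steps can run it
        # (note: load only supports single-platform builds)
        load: ${{ github.event_name == 'pull_request' }}
        tags: ghcr.io/everest/everest-demo/manager:latest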

This build-without-push behavior is observed even for PRs that triggered the workflow multiple times, since multiple pushes were made:


shankari commented 2 weeks ago

@MukuFlash03

First, I don't want a giant portmanteau PR that is hard to review and merge. This PR should do one thing and do it well, which is enabling automated tests. The demo-iso15118-2-ac-plus-ocpp.sh script should be enabled, but in a separate PR.

High level comments:

Essentially, we want to run the automated tests on every rebuild of the image so we know if the images are good enough to merge.

This is only partially correct. We do want to test out new images, but not all changes are to the images. Some changes are to the demo scripts as well. Note that the PR that required multiple rounds of testing from @louisg1337 did not make any changes to the images.

  1. Which version of the bash script parameters to use?

Obviously the most recent one. Shankari's PR was started before the script was refactored. It wasn't changed since then. We should use the currently valid version of the script. However, we are going to focus on only the automated testing script for now, so this is irrelevant for this PR.

Found a related commit here that highlights this issue; the main reason for using a timeout is that the docker compose scripts run without the -d flag, so the process stays active in the terminal.

The challenge with using -d is that then we won't see the logs - the container runs completely in the background. So if the test fails, we won't be able to see why. It also means that the action/job will not fail if the test fails. The only way to check whether the test actually passed or failed is to look at the logs, which is not how tests are supposed to work. I would suggest using https://stackoverflow.com/questions/33799885/how-to-stop-all-containers-when-one-container-stops-with-docker-compose to ensure that the action sees the exit code from the test. Please also indicate the testing done, including a success and failure test case (you can change the entrypoint temporarily for this testing).
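For example, a minimal sketch of what that could look like, assuming the tests run in the manager service defined in docker-compose.automated-tests.yml (these are standard docker compose flags, not necessarily what the final PR will use):

    # Stop all containers when the test container exits and propagate its exit code,
    # so the GitHub Actions step fails whenever the tests fail
    docker compose -f docker-compose.automated-tests.yml up \
      --abort-on-container-exit --exit-code-from manager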

  1. Should all the command versions run in the workflow? But these scripts run Docker containers without the -d flag, which causes them to stay up and running. Even if the -d flag were supplied, they might use the same ports, so the containers would need to be stopped before executing other scripts.

I am not sure what you mean by this. Each separate run in the matrix runs in a separate VM, so I am not sure what you mean by the “same port”.

I get that the requirement is to run the automated tests on every rebuild of the images so that we know if the changes are good to run.

As I said above, this is necessary but not sufficient

However, in the cicd.yaml file, I see that the latest images are only pushed to the GHCR registry on a non-pull-request event.

This is expected, since we only want to push valid images to the repo.

So in the current scenario, for the tests to run on the latest code, the image must be pushed once and tags updated before running e2etest.yaml workflow.

That is not the only option. A better option is to test the locally built image

  1. We don't need to rebuild the image if the only change is to the demo files. In this case, it is fine to run the demo script directly.
  2. If there are changes to the non-demo files, we do need to rebuild the images. And we do. We just don't push them. We should be able to test against the locally built images from the previous jobs. Please read up on Docker image pull and cache behavior: similar to git repos, if you don't explicitly specify pull, docker will only pull the remote image if it does not exist locally. We have just built these images as part of the GitHub action, so if you run the tests in the same environment, they will use the newly built versions (see the sketch after this list).
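A hedged sketch of what testing against the locally built image could look like (the build context and tag are taken from the snippets above; whether the compose file references exactly this tag is an assumption):

    # Build the manager image locally with the tag the compose file is assumed to reference
    docker build -t ghcr.io/everest/everest-demo/manager:latest ./manager

    # docker compose reuses the local image because it already exists; no pull happens
    # unless one is explicitly requested (e.g. docker compose pull, or pull_policy: always)
    docker compose -f docker-compose.automated-tests.yml up \
      --abort-on-container-exit --exit-code-from manager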

shankari commented 2 weeks ago

Finally,

I am not sure why this change was introduced:

Removed macOS since docker not pre-installed: commit link
Switched to macOS completely; removed ubuntu: commit link

when the goal is

Another requirement is that we need to run tests on different operating systems, hence the GitHub Actions workflow must run on at least Ubuntu (for now) and macOS later on.

Note also that, given the time and resources required to run MacOS, I don't think we should run it on every pull request. The idea behind containerization/software based testing is that if the code works in one container, it should work in all containers. This is clearly not true 100% of the time, but if it works in one setup, we know that the application logic is right, and we just have to figure out the system-level library incompatibilities.

MukuFlash03 commented 2 weeks ago

Finally,

I am not sure why this change was introduced:

Removed macOS since docker not pre-installed: commit link
Switched to macOS completely; removed ubuntu: commit link

when the goal is

Another requirement is that we need to run tests on different operating systems, hence the GitHub Actions workflow must run on at least Ubuntu (for now) and macOS later on.

Just to clarify, these are not changes that I introduced. These were commits that I found in this PR by Shankari, and I had just listed the relevant commits that could help in better understanding the task at hand.

Note also that, given the time and resources required to run MacOS, I don't think we should run it on every pull request. The idea behind containerization/software based testing is that if the code works in one container, it should work in all containers. This is clearly not true 100% of the time, but if it works in one setup, we know that the application logic is right, and we just have to figure out the system-level library incompatibilities.

Alright, noted.

MukuFlash03 commented 1 week ago

Based on the previous conversations, I tried a few approaches. I am listing them below along with the reasons why I didn't use a specific approach and the one I finally stuck with.

Notes:

Finally decided to go ahead with Approach 3.


Approach 1:

Hence not moving ahead with this approach.


Approach 2:

The problem with this was that:

So adding just another build job with code duplicated from the 3rd (Build + Push) job wasn't that useful.


Approach 3:

Notes on how I am building the image locally now

Exit Code

MukuFlash03 commented 1 week ago

Testing Done in my Forked Repo

Done by Triggering Workflow Runs for a Success and Failure Test Case

I would suggest using https://stackoverflow.com/questions/33799885/how-to-stop-all-containers-when-one-container-stops-with-docker-compose to ensure that the action sees the exit code from the test. Please also indicate the testing done, including a success and failure test case (you can change the entrypoint temporarily for this testing).


Temporarily changed entrypoint in docker-compose.automated-tests.yml for the purposes of this testing

- entrypoint: "sh ./run-test.sh"
+ entrypoint: "sh ./test-exit-code.sh success" 

Similarly triggered a separate run for failure:

+ entrypoint: "sh ./test-exit-code.sh failure" 

Test script used for testing

#!/bin/bash
TEST_TYPE=$1
exit_code=0
if [ "$TEST_TYPE" == "success" ]; then
    exit_code=0
elif [ "$TEST_TYPE" == "failure" ]; then
    exit_code=1
fi

echo "Exit Code from test-exit-code.sh: $exit_code"
exit $exit_code
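For completeness, a hedged way to reproduce both cases locally before pushing, assuming the temporary entrypoint edits above and the exit-code propagation flags discussed earlier in the thread:

    # Run the automated-tests compose file and surface the test container's exit code
    docker compose -f docker-compose.automated-tests.yml up \
      --abort-on-container-exit --exit-code-from manager
    echo "Exit code seen by the caller: $?"   # expected 0 for the success case, 1 for the failure case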

Success Test Case:


Failure Test Case:

shankari commented 3 hours ago

In the demo-automated-testing.sh script (commit), I had to change the docker compose command to use the docker-compose.automated-tests.yml directly via $DEMO_COMPOSE_FILE_NAME instead of $DEMO_DIR/DEMO_COMPOSE_FILE_NAME.

@MukuFlash03 This is not correct. I am not sure why you are running the demo-XXX script in the first place. The demo scripts are designed to be "single line demos" that people can run without having to check out any code. They essentially check out the repo in a separate temporary directory and then run docker. But in the case of GitHub actions, we have already checked out the repo. It is not clear why you would run the demo script (which is a lightweight wrapper) and then edit it to remove its original function instead of just running docker compose.
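A minimal sketch of what running docker compose directly could look like in the workflow (the compose file and service name are taken from this thread; the job layout itself is illustrative):

    jobs:
      run-automated-tests:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - name: Run automated tests against the checked-out repo
            run: |
              docker compose -f docker-compose.automated-tests.yml up \
                --abort-on-container-exit --exit-code-from manager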