tatilepizs opened 2 weeks ago
@dianabarsan, I would love to hear your thoughts about this idea. Do you think that the approach that I am describing is feasible? Is there something else that I should consider?
cc: @jkuester
I think I already had a branch that did this. I'll share.
Ok, it's not a branch, because you don't need new commands. All you need to do is open `tests/cht-core-test.yml` and see the image tag. It's going to look like `image: medicmobile/cht-haproxy:4.14.0-dev.1731328820240`. Take the tag and export it as `VERSION`. In my example:

```shell
export VERSION=4.14.0-dev.1731328820240
```

Then, instead of `npm run wdio-local`, you would run `npm run ci-webdriver-default`.
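The manual copy of the tag can be scripted with plain shell parameter expansion. A minimal sketch, using the example tag from this thread (in practice you would grep the `image:` line out of `tests/cht-core-test.yml`; that extraction step is my assumption, not an existing npm script):

```shell
# Sample "image:" line as it appears in the generated compose file
line='image: medicmobile/cht-haproxy:4.14.0-dev.1731328820240'

# Strip everything up to the last colon to get the bare tag
VERSION=${line##*:}
export VERSION
echo "$VERSION"   # prints 4.14.0-dev.1731328820240
```

With `VERSION` exported this way, the `ci-webdriver-default` run reuses the already-built images.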
@dianabarsan is there a compelling reason that we need to timestamp the image version? The main benefit I can think of is that it helps ensure we don't locally run the e2e tests against a stale version of the docker image. Is there something else I am missing?
The downside of timestamping the image version is that, to re-run tests against existing images, you have to dig the timestamped tag out of the generated files and export it as the `VERSION` envar. IMHO, re-running the e2e tests without rebuilding the images should be a workflow with top-level support. As a developer I am frequently re-running tests while writing them. It would be easier to script this if the image `VERSION` was deterministic (and not time-stamped). However, even if we keep the time-stamping, I still think it would be valuable to have an npm script that would programmatically set the `VERSION` when necessary.
> re-running the e2e tests without rebuilding the images should be a workflow with top-level support
Sure, I agree.
If we neither timestamp the images NOR rebuild the images every time we run tests, then the risk of running tests over unknown images is really high. How do you even know what's in the image? Do you start the container and inspect it to see what changes are in it? Especially if you run multiple git worktrees and run tests in parallel.
This is one argument for timestamping: if one uses different worktrees (like me), then you will have different images for each, which you can reuse with the `VERSION` env var. Now, the worktrees don't necessarily need a timestamp (though this is most correct); we could use a hash of the local folder path instead to uniquely identify the reusable worktree images.
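A rough sketch of the folder-path-hash alternative (the choice of `sha1sum` and a 12-character prefix are my own; nothing in the thread prescribes them):

```shell
# Derive a short, deterministic tag from the worktree's absolute path,
# so each worktree reuses its own set of images across runs.
WORKTREE_TAG=$(pwd | sha1sum | cut -c1-12)
echo "medicmobile/cht-haproxy:dev-${WORKTREE_TAG}"
```

The same path always yields the same tag, so re-runs in a given worktree hit the same images, while parallel worktrees stay isolated from each other.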
It's a shame there is no "only build if changed" docker command :(
> I still think it would be valuable to have an npm script that would programmatically set the VERSION when necessary.
What can be done is:

```shell
export VERSION=mytest
npm run integration-all-local
npm run ci-integration-all
npm run ci-integration-all
npm run ci-integration-all
npm run ci-integration-all
npm run ci-integration-all
```

This way you don't need to fish around for timestamps in generated docker compose files.
What if .... instead of timestamps we actually used the local checked out branch name? That way the version is deterministic, we can include it in npm scripts and it will also work with different worktrees or when working on multiple branches in parallel?
```shell
VERSION=$(git branch --show-current) npm run integration-all-local
VERSION=$(git branch --show-current) npm run ci-integration-all
```

And these can be our npm commands for local runs.
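One caveat with branch names (my addition, not raised in the thread): a branch like `feature/reuse-images` contains characters that are not valid in a docker tag, so the name may need sanitizing first. A minimal sketch:

```shell
# Replace any character that is not tag-safe with a dash.
branch='feature/reuse-e2e-images'   # stand-in for $(git branch --show-current)
VERSION=$(printf '%s' "$branch" | tr -c 'a-zA-Z0-9_.-' '-')
echo "$VERSION"   # prints feature-reuse-e2e-images
```

Using `printf '%s'` rather than `echo` avoids a trailing newline being translated into a stray dash.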
> If we neither timestamp the images NOR rebuild the images every time we run tests, then the risk of running tests over unknown images is really high.
Agreed. My vote would be that the behavior of `wdio-local` (and similar commands) remains the same, where the images are rebuilt for each run. Then, we can just use the `ci-webdriver-default` commands to run the tests without rebuilding images (or add new commands if there is some reason the `ci` ones will not work).
Ultimately it is up to the developer to be aware of the current state of their repo/images, but this seems to offer a good balance of a safe default happy path where everything gets built while allowing for a shortcut when someone knows what they are doing....
> What if .... instead of timestamps we actually used the local checked out branch name?
This is my thought exactly! Specifically, my proposal is that we update the `get-version.js` script code so that the default version value is the current git branch name. Then, we update the `ci` test commands to include:

```
"ci-webdriver-default": "VERSION=${VERSION:-$(node ./scripts/build/get-version.js)} wdio run ./tests/e2e/default/wdio.conf.js",
```

This way, when the `VERSION` envar is provided to the `ci` commands (e.g. by the GitHub actions) that value will be used. But, when there is no `VERSION`, it will call the `get-version` script and use the default value (aka the branch name).
@tatilepizs what are your thoughts here?
Thank you both for this discussion, I like the idea of using the branch name for the images, that would help a lot.
But, I was trying the steps that Diana suggested here and they are not working for me. After I run `npm run ci-webdriver-default` it keeps retrying and never runs the tests; it just shows this message:

```
API check failed, trying again in 1 second
AggregateError
Checking API
```
Here is the video of the steps:
@dianabarsan is there something I am missing?
It looks like your docker is trying to pull the images and failing? Can you please post the logs @tatilepizs? They're easier to follow than a video. I wonder if this is related to Docker Desktop.
This is all I can see:
```
npm run ci-webdriver-default
> medic@4.14.0 ci-webdriver-default
> wdio run ./tests/e2e/default/wdio.conf.js
Execution of 1 workers started at 2024-11-13T22:51:30.751Z
(node:12628) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
time="2024-11-13T16:51:32-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
time="2024-11-13T16:51:32-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
time="2024-11-13T16:51:32-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
Volume cht-e2e_cht-ssl Removing
Volume cht-e2e_cht-credentials Removing
Volume cht-e2e_cht-ssl Removed
Volume cht-e2e_cht-credentials Removed
time="2024-11-13T16:51:32-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
time="2024-11-13T16:51:32-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
time="2024-11-13T16:51:32-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
nginx Pulling
api Pulling
couchdb-2.local Pulling
couchdb-3.local Pulling
couchdb-1.local Pulling
healthcheck Pulling
sentinel Pulling
haproxy Pulling
sentinel Error manifest for medicmobile/cht-sentinel:4.14.0-dev.1731538290620 not found: manifest unknown: manifest unknown
haproxy Error context canceled
nginx Error context canceled
healthcheck Error context canceled
couchdb-1.local Error context canceled
api Error context canceled
couchdb-2.local Error context canceled
couchdb-3.local Error context canceled
Error response from daemon: manifest for medicmobile/cht-sentinel:4.14.0-dev.1731538290620 not found: manifest unknown: manifest unknown
time="2024-11-13T16:51:34-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
time="2024-11-13T16:51:34-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
time="2024-11-13T16:51:34-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
Checking API
(node:12628) Warning: Setting the NODE_TLS_REJECT_UNAUTHORIZED environment variable to '0' makes TLS connections and HTTPS requests insecure by disabling certificate verification.
API check failed, trying again in 1 second
AggregateError
Checking API
API check failed, trying again in 1 second
AggregateError
Checking API
API check failed, trying again in 1 second
...
```
Oh, I think I saw what the problem was, in the video.
Your exported variable is "VERISON" not "VERSION" :)
Excerpt from the video.
Oh no π€¦ I am sorry about this. Thanks Diana
Did it work without the typo?
Yes, it didn't download the images, thank you π
I was wondering if there is a way to leave the containers up and running after the first run, so the next time it just uses the containers that are already up and doesn't need to create them again. That would be awesome because the tests would start running in just a couple of seconds.
@tatilepizs I used to leave containers running for further testing, but @jkuester pushed a commit that killed them :)
https://github.com/medic/cht-core/commit/cf4a1dde1fe2cca28f201c65f1b70bcd2dda6f6b
π
If I cancel the process, I will have the containers up, but when I run `npm run ci-webdriver-default` again it stops and removes all the containers and then creates them again:
```
npm run ci-webdriver-default
> medic@4.14.0 ci-webdriver-default
> wdio run ./tests/e2e/default/wdio.conf.js
Execution of 1 workers started at 2024-11-15T17:16:40.386Z
(node:91316) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
time="2024-11-15T11:16:42-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
time="2024-11-15T11:16:42-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
time="2024-11-15T11:16:42-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
Container cht-e2e-healthcheck-1 Stopping
Container cht-e2e-couchdb-2.local-1 Stopping
Container cht-e2e-sentinel-1 Stopping
Container cht-e2e-couchdb-3.local-1 Stopping
Container cht-e2e-couchdb-1.local-1 Stopping
Container cht-e2e-nginx-1 Stopping
Container cht-e2e-sentinel-1 Stopped
Container cht-e2e-sentinel-1 Removing
Container cht-e2e-sentinel-1 Removed
Container cht-e2e-healthcheck-1 Stopped
Container cht-e2e-healthcheck-1 Removing
Container cht-e2e-healthcheck-1 Removed
Container cht-e2e-couchdb-2.local-1 Stopped
Container cht-e2e-couchdb-2.local-1 Removing
Container cht-e2e-couchdb-2.local-1 Removed
Container cht-e2e-couchdb-1.local-1 Stopped
Container cht-e2e-couchdb-1.local-1 Removing
Container cht-e2e-couchdb-1.local-1 Removed
Container cht-e2e-couchdb-3.local-1 Stopped
Container cht-e2e-couchdb-3.local-1 Removing
Container cht-e2e-couchdb-3.local-1 Removed
Container cht-e2e-nginx-1 Stopped
Container cht-e2e-nginx-1 Removing
Container cht-e2e-nginx-1 Removed
Container cht-e2e-api-1 Stopping
Container cht-e2e-api-1 Stopped
Container cht-e2e-api-1 Removing
Container cht-e2e-api-1 Removed
Container cht-e2e-haproxy-1 Stopping
Container cht-e2e-haproxy-1 Stopped
Container cht-e2e-haproxy-1 Removing
Container cht-e2e-haproxy-1 Removed
Volume cht-e2e_cht-ssl Removing
Volume cht-e2e_cht-credentials Removing
Network cht-net-e2e Removing
Volume cht-e2e_cht-credentials Removed
Volume cht-e2e_cht-ssl Removed
Network cht-net-e2e Removed
time="2024-11-15T11:16:47-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
time="2024-11-15T11:16:47-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
time="2024-11-15T11:16:47-06:00" level=warning msg="The \"COUCHDB_UUID\" variable is not set. Defaulting to a blank string."
Network cht-net-e2e Creating
Network cht-net-e2e Created
Volume "cht-e2e_cht-ssl" Creating
Volume "cht-e2e_cht-ssl" Created
Volume "cht-e2e_cht-credentials" Creating
Volume "cht-e2e_cht-credentials" Created
Container cht-e2e-haproxy-1 Creating
Container cht-e2e-couchdb-1.local-1 Creating
Container cht-e2e-healthcheck-1 Creating
Container cht-e2e-couchdb-2.local-1 Creating
Container cht-e2e-couchdb-3.local-1 Creating
Container cht-e2e-haproxy-1 Created
Container cht-e2e-sentinel-1 Creating
Container cht-e2e-api-1 Creating
Container cht-e2e-healthcheck-1 Created
Container cht-e2e-couchdb-2.local-1 Created
Container cht-e2e-couchdb-3.local-1 Created
Container cht-e2e-couchdb-1.local-1 Created
Container cht-e2e-api-1 Created
Container cht-e2e-nginx-1 Creating
Container cht-e2e-sentinel-1 Created
Container cht-e2e-nginx-1 Created
Container cht-e2e-couchdb-2.local-1 Starting
Container cht-e2e-couchdb-1.local-1 Starting
Container cht-e2e-healthcheck-1 Starting
Container cht-e2e-couchdb-3.local-1 Starting
Container cht-e2e-haproxy-1 Starting
Container cht-e2e-healthcheck-1 Started
Container cht-e2e-couchdb-2.local-1 Started
Container cht-e2e-couchdb-1.local-1 Started
Container cht-e2e-haproxy-1 Started
Container cht-e2e-sentinel-1 Starting
Container cht-e2e-api-1 Starting
Container cht-e2e-couchdb-3.local-1 Started
Container cht-e2e-sentinel-1 Started
Container cht-e2e-api-1 Started
Container cht-e2e-nginx-1 Starting
Container cht-e2e-nginx-1 Started
...
```
> @jkuester pushed a commit that killed them :)
π π
> I was thinking if there is a way to leave the containers up and running the first time, so the next time, it just uses the containers that are already up and doesn't need to create the containers again
This is where I would draw the line and say it is not a good idea to add first-class support for this workflow! π IMHO, you need to drop all the containers and reset things between test runs to better guarantee the reproducibility of test results. The docker images are static and cannot be affected by things that happen during a test run. But the containers are not! It might be possible (or likely?) that the state of the CHT instance in the containers is slightly different after a test run. Then subsequent test runs would not be running with the exact same environment as a fresh run.
π€ hmm, you are right Josh, I didn't think about that
@tatilepizs are you happy with the:

```shell
VERSION=$(git branch --show-current) npm run integration-all-local
VERSION=$(git branch --show-current) npm run ci-integration-all
```

solution?
Yes, I think it will be a great improvement to the way that we execute the tests
**Describe the issue**
When we are creating new e2e tests or debugging existing ones (and not modifying the `cht-core` code), it is really frustrating that every time we run the e2e tests we need to download and completely build the docker images. It takes around 2 minutes for a test to start running, which is a lot of time when we are debugging, for example, and just need to try small things in each iteration.

**Describe the improvement you'd like**
We would like to create two new commands: one that builds the docker images and one that only executes the wdio e2e tests. The first command, the one that builds the images, would always be in charge of deleting the images associated with the branch (if they exist) and then creating them. If we remove the `${buildTime}` from this line we can use the same images every time we run the tests on the same branch, and by deleting the images at the beginning we would make sure that we are using the correct images, in case something changes in the cht-core code. The second command would be in charge of only executing the e2e tests, and since everything will already be built, it should take just seconds to start seeing the test execution. We will also leave the `wdio-local` command, which would build and execute, just as it is right now.

**Describe alternatives you've considered**