Open HofmeisterAn opened 8 months ago
Would it be worthwhile to refactor modules so that each image or class of images (e.g. Web/Database/... Servers that might want to share a common interface) has its own repository?
That would mean that exactly one docker image should be created for each image-specific repository, presuming that the same configuration used from multiple implementations in different languages results in an identical container being tested.
We could also describe the image-specific configuration in a common configuration format and use that to generate the language-specific code.
For example, username/password/... is often configured via environment variables, and the specific environment variable names are specific to individual images. However, the code for capturing the username/password is shared between multiple images across multiple languages, which suggests we have too much nearly identical custom-maintained code and are waiting for a refactoring pass to extract the common functionality.
Hi, I think this can help to free some more space for linux workers https://github.com/jlumbroso/free-disk-space
Hi EddĂş, we are doing this already:
Would it be worthwhile to refactor modules so that each image or class of images (e.g. Web/Database/... Servers that might want to share a common interface) has its own repository?
From past experiences with tc-java, this leads to a lot of operational overhead (e.g. releases), unless we have good automation in place.
Splitting in parallelizing the CI workflow on a per-module or module-sets level can help and it is what we essentially do in tc-java as well.
As recommended by @kiview I took a look at the way test containers Java is doing the parallel execution and would like to support in this regard whenever I got some time to work on it.
@HofmeisterAn what do you think about doing the parallel execution analogous to the Java project?
As recommended by @kiview I took a look at the way test containers Java is doing the parallel execution and would like to support in this regard whenever I got some time to work on it.
@HofmeisterAn what do you think about doing the parallel execution analogous to the Java project?
Definitely something we should try out 👍. We already have the implementation to merge coverage results from different runners (for Linux and Windows), so we probably just need to group (by core-linux, core-windows, modules), split and run the test projects accordingly. I'm not sure if we need to take a look at GitHub's usage limits. IIRC, we already had an issue with Dependabot once, where we created too many runners in a short period of time, which caused issues for other Testcontainers repos because they got degraded and couldn’t create new runners for a while.
Maybe we can also reuse the GitHub composite actions (.github/workflows/actions/tests/action.yml
) from this branch.
I began looking into this issue and prioritizing it to address the remaining module PRs afterward. So far, my initial tests look quite promising (diff). The max-parallel
configuration is helpful for adjusting the maximum number of runners to prevent degradation for the Testcontainers organization (maybe the configuration does not help here). There are a few more improvements we can make, such as building the entire project just once, but for now, I think this is already a good step forward and doesn't require too many changes.
The disadvantage is that we need to maintain a list of test projects in the GitHub workflow. Ideally, we can create this list somehow during runtime. At the very least, we need to ensure that no one forgets to add the test project here.
I was thinking about using the following PowerShell command/script to determine which runner belongs to which project. Instead of reading the file content, we can simply use its name. If the file does not exist, we can either throw an error or fall back to a default runner.
Get-ChildItem -Path 'tests' -Directory | % { @{ "name" = $_.Name; "runs-on" = [string](Get-Content -LiteralPath (Join-Path -Path $_.FullName -ChildPath ".runs-on") -ErrorAction SilentlyContinue) } }
The command create a JSON similar to:
[
{
"name": "Testcontainers.ActiveMq.Tests",
"runs-on": "ubuntu-22.04"
},
{
"name": "Testcontainers.ArangoDb.Tests",
"runs-on": "ubuntu-22.04"
},
{
"name": "Testcontainers.Azurite.Tests",
"runs-on": null
}
]
Problem
Hi đź‘‹, in the past few weeks, I've encountered several issues with the GitHub-hosted runners. Due to the increasing number of modules, pulling all the images occupies too much disk space. I've implemented a temporary fix to free up some disk space, but this won't be enough as we add more new modules.
Until we find a suitable fix and implement a mechanism to handle the growing number of modules and image sizes, I prefer not to merge pull requests that involve very simple module configurations. Which can be accomplished with just a few lines using the generic container builder API. If you believe the module is valuable for the community, please consider creating an issue first and discussing it there to avoid unnecessary work on creating a pull request. OC, I will review and try to merge the remaining and outstanding module PRs.
Solution
-
Benefit
-
Alternatives
-
Would you like to help contributing this enhancement?
Yes