Optional possibility to reuse existing containers

AnkBurov commented 6 years ago

Hi. Currently there is a case when Testcontainers isn't that great. I talk about complicated integration testing environment where there are many containers that are launched during tests. Current architecture of Testcontainers creates needed containers, starts them and destroys them after tests. It's perfectly OK when your tests use one, two, maybe three containers but after that number time to run single test on developer's computer just gets too high.

One of projects I work on has rather complicated integration testing environment: two Cassandra containers, Maria container, Splunk container and several containers with some applications. With Testcontainers creating and destroying all these containers around test execution (one test or suite, doesn't matter) takes ages to execute. Currently the project uses some Groovy scripts trying to find existing containers and use them instead of always creating new ones, but overall experience with this solution isn't good and I'd like to bring Testcontainers into that project. Which has aforementioned time consuming problems with current create-and-destroy approach in complex environments.

So I thought - why don't bring to Testcontainers optional behavior of reusing existing containers (stopped or even running)? In complicated environments it will greatly boost test startup time to seconds - Testcontainers would only need to scan existing containers and if one is found and running - only apply initialScript from PRs https://github.com/testcontainers/testcontainers-java/pull/776/files#diff-5b39417ccdbbe90d4dc26c69fba8fa8eR114 and https://github.com/testcontainers/testcontainers-java/pull/575 and container is good to go. If no suitable containers found, then simply create the one - existing Testcontainers behavior.

What do you think about this idea?

bsideup commented 6 years ago

Hi @AnkBurov

reusing existing containers

How do you detect previously started containers? We thought about it but there is no reliable way so far :)

bsideup commented 6 years ago

The code you linked is only about the initialization of the containers. That's not a problem.

What makes it hard is:

[ ] knowing that the container you discover is exactly the same as from the previous test. Something like hashing
[ ] differentiating them by project, so that one project will not accidentally reuse other project's containers
[ ] eventually stopping them to avoid dangling containers
[ ] code changes in the project (think adding or removing containers or changing their configuration)

One might say "hey, Docker Compose is doing it" but that's not 100% the same. Also, they identify the containers by file system path (directory's name) and it has some reliability issues.

There is an alternative long term approach - use Docker's freeze/unfreeze, so that we always start a new container but it is super fast because it is already initialized

MFomichev commented 6 years ago

Hi @bsideup Could you say a few words about "long term approach of using freeze/unfreeze"? Do you have any plans about it? Maybe there is any issue / thread which I can read (and I will be glad to provide some contribution)?

bsideup commented 6 years ago

Hi @MFomichev,

See this document: https://github.com/docker/cli/blob/010340e304bdb11ca393606b6d50e7b6bd5e7010/experimental/checkpoint-restore.md

Unfortunately it is not stable yet and not widely available, but we're keeping our eye on it and most probably will integrate into Testcontainers as soon as the feature is released

AnkBurov commented 6 years ago

Hi @bsideup

How do you detect previously started containers? We thought about it but there is no reliable way so far :)

By using method com.github.dockerjava.api.DockerClient.listContainersCmd().withShowAll(true).exec() and filter out container with the wanted name. Each Container has getStatus() method that returns String. Just parse it whether it starts with Up or not and you'll know if container is running or not.

Another, more reliable approach - filter out all containers by name and if one is found, then do search without withShowAll(true) builder method and compare results - same you'll know if container is running or not. Intellij Idea's evaluator shows that this approach works. I'm at work now, so I can't provide the code snippet, but I think I described ways to detect previously started containers quite well.

The code you linked is only about the initialization of the containers.

Yes, I know. By posting those links I meant that state of already running container (started by previous Testcontainers invocation or by user) isn't an issue because with those features container that has state (database container) can easily be reinitialized.

What makes it hard is:

Yes, I understand these concerns. What I meant is to provide a user little more control. What I propose is to user in Container builder configuration can specify name of the container that Testcontainers will try to use and not killing it after all the tests.

Something like: genericContainer.withReuseExistingContainerConfiguration(Builder.withName("nameOfContainer").withEnabled(true/false)...)

That way Testcontainers will try to find container with name "nameOfContainer" and check if it can be used. If no container found - create new one with this name and not kill it after tests. If container is found but the image is unexpected - throw exception. If container is found - optionally reinitialize it and use it.

eventually stopping them to avoid dangling containers

It shouldn't stop with this option. Main purpose of this mode is to use it on developer's computer and once needed containers created and started only reinitialization by applying initScript is needed in the following tests invocation. In any case there will only be one container with name "nameOfContainer", so there is no OOM possibility.

AnkBurov commented 6 years ago

@MFomichev hi. Nice office in the headquarters btw :) but we didn't come to agreement on salary terms.

AnkBurov commented 6 years ago

@bsideup

How do you detect previously started containers? We thought about it but there is no reliable way so far :)

Here is more consistent approach:

dockerClient.listContainersCmd()
                        .withShowAll(true)
                        .exec()
                        .stream()
                        .filter(container -> Arrays.asList(container.getNames()).contains("containerName"))
                        .findFirst()
                        .map(container -> {
                            val isRunning = dockerClient.inspectContainerCmd(container.getId()).exec().getState().getRunning();
                            if (!isRunning) {
                                dockerClient.startContainerCmd(container.getId()).exec();
                            }
                            return container;
                        })
                        .orElseGet(() -> /*createContainer("containerName")*/)

Checking that image of found container is expected is omitted here for brevity sake.

bsideup commented 6 years ago

@AnkBurov Well, this code just lists the containers. I doesn't check that container actually matches the desired configuration or something.

Or maybe I'm missing something, than we would be more than happy to see a PR with your idea where we also can discuss it on the code level :)

bountin commented 6 years ago

IMO testcontainers would have to set custom container labels to identify reusable containers and not just rely on the container name.

I actually like the idea for local testing because the containers' IPs won't change with each test run and I'd have a persisted connection to e.g. an ES cluster or a database. OTOH, I wouldn't want to use it in the CI environment where parallel test runs should spawn separate container instances for isolated integration testing.

AnkBurov commented 6 years ago

@bsideup I've updated the post a little bit later to clarify that found container has desired image. Ok, I meant something like this:

val createdContainer = dockerClient.listContainersCmd()
                                               .withShowAll(true)
                                               .exec()
                                               .stream()
                                               .filter(container -> Arrays.asList(container.getNames()).contains("containerName"))
                                               .findFirst()
                                               .orElseGet(() -> /*createContainer("containerName")*/ );
            if (!createdContainer.getImage().equals("expectedImage/latest")) {
                logger().error("Found existing container with name {} has unexpected image {}", Arrays.toString(createdContainer.getNames()), createdContainer.getImage());
                throw new ContainerLaunchException("Unexpected image");
            }

            val isRunning = dockerClient.inspectContainerCmd(createdContainer.getId()).exec().getState().getRunning();
            if (!isRunning) {
                dockerClient.startContainerCmd(createdContainer.getId());
            }

            val startedContainer = createdContainer;

bsideup commented 6 years ago

@AnkBurov image is just one of tens of different parameters of the container we need to check to assume that it was started by TC, by the same project as the current one, with the same configuration, etc etc. I was doing a research about it, it's not that simple, trust me :)

AnkBurov commented 6 years ago

@bsideup I'm not trying to convince you that you're wrong, I'm trying myself to understand the situation out :)

that it was started by TC, by the same project as the current one, with the same configuration, etc etc

My point that TC shouldn't do this in this reuse existing container mode. It's up to developer to track that specified container is correct. TC should only check formal symptoms that found container is correct - check its image and optionally reinit it by applying developer written init script.

Btw, nice solution with Ryke and Death Note. Something like borderland between Death Note and Dead Hand system. :)

@bountin yes, I'm having the same point - this mode is only for developer computers to speed tests. On CI there should be current TC approach. Thats why I mentioned ability to disable reusing of existing containers - to control it by system variable in build script:

genericContainer.withReuseExistingContainerConfiguration(Builder.withName("nameOfContainer").withEnabled(true/false)...)

bsideup commented 6 years ago

@AnkBurov sorry if I sound aggressive / defensive, I didn't mean to :)

The think is that I we were thinking about such mode for past year already, but there are some many moving parts making it incredibly hard to implement in a way that will work for everyone.

TC should only check formal symptoms that found container is correct - check its image and optionally reinit it by applying developer written init script.

While it might (or not) work for some, that will defeat the message we promise to our users - consistent testing environments. If we just detect containers and use them instead, it will break many usages. To make it really work, one has to hash the CreateContainerCmd payload, store as a label, attach to the container and then, before starting a new container, look for an already started one with the same hash. But if the hash differs, we cannot use that container anymore (and it also has to be destroyed after some period of time) Post-start initializations also has to be stored in the container's state so that we know that the container was initialized and not just started.

And that's only a tip of the iceberg :) But happy to share more thoughts if you want :)

bsideup commented 6 years ago

@AnkBurov

Btw, nice solution with Ryke and Death Note

Thanks :) Something barely notable by the end user but what makes Testcontainers one of the best libs out there (talking as an end user ;) )

bsideup commented 6 years ago

.withEnabled(true/false)...)

Very environment specific, better to use ~/.testcontainers.properties and something like containers.reuse=true

AnkBurov commented 6 years ago

@bsideup OK, I understood you point. Currently I think about this optional mode of as an agreement with developer (TC user) 'OK, I understand the consequences, let me just use this container for testing'.

OK, I think I'll write some code in spare time and see will it work from my POV and if it does I'll do the PR to discuss things on the code.

Very environment specific, better to use ~/.testcontainers.properties and something like containers.reuse=true

It was just a concept - to clarify that mode is disable-able-ish (not a word though :smile: ) for different testing environments.

rnorth commented 6 years ago

I agree with @bsideup, especially with this bit:

While it might (or not) work for some, that will defeat the message we promise to our users - consistent testing environments.

I really think this is important to preserve... I don't want Testcontainers to just be a nicer docker client: I'd like to keep it as a model and set of tools for doing container-based testing well (with consistent environments being one of the most important principles).

Personally in the past when I've thought about doing something about the test duration problem, the two ideas I've felt the most happy with are:

checkpoint/restore (CRIU) or 'freeze/unfreeze'. The really frustrating thing is that this has been in the mainstream docker daemon for ages, but the kernel/libs support is not present in the boot2docker/moby OS. This seems very sad to me. If I had infinite time I'd pour effort into trying to change this, because CRIU looks so promising.
a 'pipelining' model, where new containers are spun up in advance of being needed (e.g. container N+1 is being spun up while test N runs). This has obvious resource usage downsides, but has some attractiveness in that it's reasonably simple.

Does this kinda make sense?

AnkBurov commented 6 years ago

@rnorth thanks for the explanation. Yes, this does make sense - you want TC as a reliable self-contained solution. That has time consuming problems in complex test environments.

checkpoint/restore (CRIU) or 'freeze/unfreeze'

Nice feature, but it's as you well said it's not available yet and I don't know when it will be available. And the problem I'm talking about is here and now for all complex testing environments using TC.

I think I'll write some code and make PR if I feel this feature is in OK state. If it won't fit into your view of TC architecture - I'll just continue to use fork as I do with not merged features (like Cassandra container until recent time and JdbcContainer.withInitScript feature) - no biggie.

Thanks for the discussion and explanations. :+1:

kiview commented 6 years ago

@AnkBurov Have you though about managing the container lifecycle manually with Testcontainers (meaning just calling start/stop manually)? We are doing it at work for our Acceptance-Test/System-Test setup and it works fine and allows drastically reduce the test duration if you can just keep the containers running and share them between tests. Would this work for your use case or do you really need new container instances?

@rnorth I was also just recently discussing this Testcontainers-Pool idea, could be a nice thing, but also very resource hungry.

About CRIU, I wasn't aware that we don't have this on Windows/Mac, sad 😞

AnkBurov commented 6 years ago

@kiview I do aware of this feature, I use it in some projects, but unfortunately this is not the case in this case. :smiley_cat:

The testing environment I talk about uses 7 containers and they all are needed. So running of single test in Intellij Idea starts 7 containers. After single test they will be stopped and destroyed. All these operations consume too much time :smile: With using reusable containers it would be matter of seconds minus time to reapply init scripts to stateful containers (Cassandras and Maria).

Hope I cleared the situation.

kiview commented 6 years ago

Oh, so you are even meaning surviving JVM process exit? This would be nice indeed, but super tricky to do right and transparently enough to be user friendly. But it's indeed something that could go into testcontainers.properties.

Eugeny Karpov notifications@github.com schrieb am Fr., 13. Juli 2018, 13:59:

@kiview https://github.com/kiview I do aware of this feature, I use it in some projects, but unfortunately this is not the case in this case. 😺

The testing environment I talk about uses 7 containers and they all are needed. So running of single test in Intellij Idea starts 7 containers. After single test they will be stopped and destroyed. All these operations consume too much time 😄 With using reusable containers it would be matter of seconds minus time to reapply init scripts to stateful containers (Cassandras and Maria).

Hope I cleared the situation.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/testcontainers/testcontainers-java/issues/781#issuecomment-404812584, or mute the thread https://github.com/notifications/unsubscribe-auth/AE2jaMEIP4agu42v7JkvsgeekoMWxGzJks5uGIuogaJpZM4VMoll .

AnkBurov commented 6 years ago

@kiwiev

Oh, so you are even meaning surviving JVM process exit?

Exactly.

AnkBurov commented 6 years ago

@rnorth @bsideup @kiview I wrote some code to back up the idea of reusing existing containers. Please take a look in spare time. Thanks.

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you believe this is a mistake, please reply to this comment to keep it open. If there isn't one already, a PR to fix or at least reproduce the problem in a test case will always help us get back on track to tackle this.

rnorth commented 5 years ago

This remains a painful topic, in part because container startup time is intrinsically annoying, but also because I think a ~good~ perfect solution is just slightly out of our reach.

Re the current state of the options as I see them:

Leaving containers running between tests (as a feature of Testcontainers) is something I'm deeply uncomfortable with. One of my goals with Testcontainers was to make it easy to have ephemeral data stores for testing, specifically to avoid dirty state problems. I can't see any way to safely leave containers running without breaking this guarantee.
Checkpoint/restore (CRIU) looks extremely promising but is currently not getting any love from Docker (e.g. docker/for-mac#1059). I've even attempted some nasty hacks/experiments involving a CRIU-capable DinD daemon - this is actually tantalizingly close to working, but networking is unfortunately broken. Ultimately I fear Docker are just going to allow this potentially awesome feature to wither and die (🤞 they don't, though).
Maybe there's a halfway approach we could take - perhaps a filesystem-only snapshot, which would save time wrt. test data initialisation at least. If people would like this then I'd very happily implement it.
@kiview is currently using https://github.com/fastmonkeys/stellar on a project, which sounds like a very promising tool for SQL DB-based scenarios

aguibert commented 5 years ago

@rnorth Reuse-able containers is something I am very much interested in. I think the easiest solution to implement is reusing containers by letting them continue running in between tests. Regarding your concerns on the approach:

Leaving containers running between tests (as a feature of Testcontainers) is something I'm deeply uncomfortable with. One of my goals with Testcontainers was to make it easy to have ephemeral data stores for testing, specifically to avoid dirty state problems. I can't see any way to safely leave containers running without breaking this guarantee.

I agree, but we can solve this by making "leave it running" an opt-in approach. Maybe via GenericContainer.allowReuse(boolean). This achieves two things:

Lets the user decide if they want to reuse containers for the sake of performance. Maybe they are willing to write their own custom cleanup logic, or maybe they write tests in a way where dirty state on containers does not matter.
For containers where state doesn't matter, such as the Mockserver module, we can make the decision for the user on what the default reuse policy should be in the ctor. Of course, the user could always override this on their end too.

bsideup commented 5 years ago

@aguibert it seems that you just missed the announcement from 2 weeks ago :D

We do have a concrete plan of how to implement it now and are working on the prototype, you can even see some details on this sneak peak: https://twitter.com/bsideup/status/1127150207014264832

Thanks for your input tho, it's pretty aligned with what we plan to achieve, keep watching on the announcements, we plan to ship an experimental implementation in a near future (although I can't give any dates yet, sorry. The feature is much harder than how it sounds)

AnkBurov commented 5 years ago

@bsideup when the feature will be released could you post an update itt? Maybe it'll worth to sneak some code to my fork (or jump it off completely) :)

The feature is much harder than how it sounds)

One year with the implemented reusing feature and it still works great for stateful and stateless containers (although init scripts for stateful containers must be idempotent thus written carefully). The only thing I had to add is the customizable logic when a found container is having different image or imageId than expected from GenericContainer properties.

aguibert commented 5 years ago

@bsideup ah, I did see your tweet from 2 weeks ago, but it was pretty cryptic to me and so I went looking for the Github issue representing the work and landed here 😄

Glad that you are working on it. If you think that more hands would help the work go faster, let me know! I'm happy to chip in where I can on this feature.

If you're at the point where you think it's stable enough that users can kick the tires, would you consider cutting a X.BETA release on maven central perhaps?

bsideup commented 5 years ago

@aguibert we're huge fans of Jitpack, once we have a PR, everyone will be able to try it with Jitpack :)

bsideup commented 5 years ago

@AnkBurov

when the feature will be released could you post an update it

Will do

One year with the implemented reusing feature

Sorry, but I would like to correct you here: the feature is not implemented yet. I assume you were talking about your PR #786 which has covered a very small subset of the functionality.

The only thing I had to add is the customizable logic when a found container is having different image or imageId than expected from GenericContainer properties.

And that's just the tip of an iceberg :)

The current prototype:

1) hashes the containers and detects changes 2) runs a sidecar and destroys containers after TLL (unlike Docker Compose), so that you only run containers that you reuse often (think ITDD where I stands for "Integration") 3) implements a global locking, so that two processes will not reuse the same container 4) enforces 3 levels of "protection" - reusability must be defined in the container class (false by default), needs to be explicitly enabled by the consumer (with a flag) and globally turned on per environment (via ~/.testcontainers.properties) to avoid using it on CI

mdindoffer commented 5 years ago

@bsideup Hi, thanks for the hard work. Are you still working on the feature? Do you have any ETA? I'd love to do some beta testing when ready.

bsideup commented 5 years ago

Hi @mdindoffer,

Yes, we're still working on this feature, but, given the complexity, we cannot ship it as fast as even we want :)

Some guesstimates from my head:

PR submitted end of summer (that's the point when beta testing will start making sense, with Jitpack) Experimental release in ~ September-October GA end of 2019

These are the pessimistic ones, but we have to be pessimistic about it because the development is being done on our free time, best-efforts basis :)

However, you can help today by thinking about the following questions, because are kinda the prerequisites:

Check whether your code can use Singleton containers or at least ClassRule-based. Reusable containers are by design singleton containers for your app, because they will be created once and last even longer than the JVM's lifespan. If you start/stop containers during the tests (e.g. with @Rule, or even @ClassRule), they may not fit you
Are your tests isolated enough? Will they fail if you re-run them over the same instance of the DB, for instance?
Do you randomize table's and db's names, kafka topics, etc? Or, are you prepared for cleaning them before the tests?
If you have multiple tests running in parallel, should they use the same container (when it matches the "hash", and the uniqueness of DBs/Tables/Topics as per point number 3) or each run should have an exclusive instance of a container?

guenhter commented 5 years ago

@bsideup I'd really like to see this feature. Can I/we somehow support you here? Maybe implementing a few areas under your guidance?

bsideup commented 5 years ago

@guenhter thanks for offering help! At this stage it is a big hard to split the work, but we will consider splitting it after the base parts are implemented 👍

guenhter commented 5 years ago

Understood. When it's possible to help out here, just say a word...

AnkBurov commented 4 years ago

Finally we have an upstream implementation of reusing containers! Its time to jump off from the fork.

AnkBurov commented 4 years ago

@bsideup thanks! Closing the issue now.

vilkg commented 4 years ago

So glad you fixed this issue! May I ask when is it going to be released?

bsideup commented 4 years ago

@vilkg the first iteration is released in 1.12.3, see #1781

testcontainers / testcontainers-java

Optional possibility to reuse existing containers #781