Open amumurst opened 3 years ago
I'm afraid this cannot be changed that easily, because it's intended behavior. If multiple tests run in a sequence, it is desirable to start only a single database cluster (postgres instance) for all of them. Typically, most of the tests share the same preparer configuration, so as a result there should be only one, or a very limited number of, background threads.
If your experience is different, please double-check which implementation of DatabasePreparer you are using and whether it has the equals and hashCode methods implemented properly. The default providers io.zonky.test.db.postgres.embedded.FlywayPreparer and io.zonky.test.db.postgres.embedded.LiquibasePreparer are implemented correctly, so you can use them as a reference.
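To illustrate why equals and hashCode matter here, a custom preparer should implement value-based equality so that two preparers with identical configuration map to the same cached cluster. A minimal sketch, where MyPreparer is a hypothetical class name and the actual prepare logic is omitted:

```java
import java.util.Objects;

// Hypothetical custom preparer: equality is based solely on the configuration,
// so two instances pointing at the same migration location are considered
// equal and will share a single cached database cluster.
public class MyPreparer /* implements DatabasePreparer */ {

    private final String migrationLocation;

    public MyPreparer(String migrationLocation) {
        this.migrationLocation = migrationLocation;
    }

    // public void prepare(DataSource ds) { ... run migrations here ... }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof MyPreparer)) return false;
        return Objects.equals(migrationLocation, ((MyPreparer) o).migrationLocation);
    }

    @Override
    public int hashCode() {
        return Objects.hash(migrationLocation);
    }
}
```

If equals/hashCode fall back to identity (the Object defaults), every preparer instance looks unique to the cache and a new cluster is started each time.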
To make it clearer: if I have the following test class and run it several times, it should always refer to the same database cluster. Only if I change the migration location parameter (the preparer configuration) will it use a new database cluster.
public class FlywayPreparerTest {

    @Rule
    public PreparedDbRule db = EmbeddedPostgresRules.preparedDatabase(
            FlywayPreparer.forClasspathLocation("db/testing"));

    @Test
    public void test() throws Exception {
        ...
    }
}
Okay so perhaps I can clarify further.
I added the basics for running exactly the LiquibasePreparerTest already found in this repo from sbt, here: https://github.com/amumurst/embeddedpostgres-postgres-background-threads
As you can see, sbt here runs as a long-lived program rather than a one-shot script (as mvn or gradle typically would), so it does not start a new JVM for every test run. Running ps aux | grep '[p]ostgres' -c to count the number of postgres processes between test runs, we can see that the count increases by 7 on every run.
(There is an option to make sbt fork a new JVM for each test run, but that would make testing much slower.)
Seven each time might not sound like much, but if I run ~test, i.e. incremental testing on code change, it quickly adds up and typically ends in an OOM.
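For comparison, the same count can be taken from inside the JVM with the Java 9+ ProcessHandle API, which is handy when checking before and after a test run. Note that this matches on the executable name rather than the full command line, so the numbers may differ slightly from the ps pipeline:

```java
public class PostgresCount {

    // Count live OS processes whose executable name mentions "postgres"
    // (requires Java 9+ for the ProcessHandle API).
    public static long countPostgresProcesses() {
        return ProcessHandle.allProcesses()
                .filter(ph -> ph.info().command().orElse("").contains("postgres"))
                .count();
    }

    public static void main(String[] args) {
        System.out.println("postgres processes: " + countPostgresProcesses());
    }
}
```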
Right now the thread inside PrepPipeline keeps creating databases in the nextDatabase queue until the JVM shuts down, and there is no way to stop it. (I am not sure why it creates exactly 7 processes every time, though.)
So what I am basically looking for is some way of shutting these processes down, and I thought killing the internal thread in PrepPipeline would be the way to do it.
Okay, so I attempted this with a hack (finding all threads whose name contains cluster in the running JVM and interrupting them), but it does not remove the postgres processes.
What I found is that these postgres processes are held by PrepPipeline::pg. This field is of type EmbeddedPostgres, which implements Closeable, but since PrepPipeline is never closed, the pg instance is never shut down either.
If I apply an even worse hack and manually find and run the shutdown hooks added by EmbeddedPostgres::newCloserThread in EmbeddedPostgres::startPostmaster, the postgres processes are removed.
import scala.collection.JavaConverters._

// Reflectively obtain the JVM's registered shutdown hooks
val field = Class.forName("java.lang.ApplicationShutdownHooks").getDeclaredField("hooks")
field.setAccessible(true)
val hooks = field.get(null)
val threads: List[Thread] = hooks.asInstanceOf[java.util.Map[Thread, Thread]].asScala.keySet.toList

// EmbeddedPostgres registers its closer hooks with names containing "postgres" and "closer"
val postgresThreads = threads.filter(thread => thread.getName.contains("postgres") && thread.getName.contains("closer"))

// Deregister the hooks and run them manually to stop the clusters immediately
postgresThreads.foreach(thread => Runtime.getRuntime.removeShutdownHook(thread))
postgresThreads.foreach(thread => thread.run())
@amumurst Thanks a lot for the reproducer, it made it a lot easier for me to investigate the problem.
I have confirmed that the SBT server is indeed using only a single JVM process. However, the problem is that the server uses a custom class loader, which recreates the same classes over and over again each time the project is run. So within the same JVM process there are duplicate classes that have no relation to each other. This prevents the caching mechanism, which stores instances of the embedded databases, from working properly: the cache depends on a static variable, and in this case the static variable refers to a new object every time the project is run.
So to solve this problem, we need to figure out how to implement a cache that persists data across multiple runs of a project executed via the SBT server. Alternatively, we could implement a special shutdown hook that is called when the project shuts down, rather than when the SBT server (the JVM process) shuts down.
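The second option can be sketched as follows. This is a simplified stand-in model, not the library's actual code: a daemon thread keeps a bounded queue of "prepared databases" topped up (like PrepPipeline's nextDatabase), and an explicit close() interrupts that thread and closes the underlying Closeable resource (standing in for EmbeddedPostgres):

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Simplified model (hypothetical names): the background thread tops up the
// queue, and close() stops both the thread and the cluster resource.
public class Pipeline implements AutoCloseable {

    private final Closeable pg; // stands in for the EmbeddedPostgres instance
    private final BlockingQueue<String> nextDatabase = new LinkedBlockingQueue<>(3);
    private final Thread worker;

    public Pipeline(Closeable pg) {
        this.pg = pg;
        this.worker = new Thread(() -> {
            try {
                int n = 0;
                while (!Thread.currentThread().isInterrupted()) {
                    nextDatabase.put("db_" + (n++)); // blocks when the queue is full
                }
            } catch (InterruptedException e) {
                // interrupted by close(): exit the loop
            }
        }, "cluster-worker");
        worker.setDaemon(true);
        worker.start();
    }

    public String takeDatabase() throws InterruptedException {
        return nextDatabase.take();
    }

    @Override
    public void close() throws IOException {
        worker.interrupt();     // stop the background preparer thread
        try {
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pg.close();             // shut down the postgres cluster
    }
}
```

The key point is that close() is callable at any time (e.g. at the end of one sbt test run), not only from a JVM shutdown hook.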
Make PreparedDbProvider AutoCloseable such that the internal background thread started in PrepPipeline can be shut down without needing to shut down the entire JVM.
This is useful with build tools that run in interactive mode, e.g. Scala's sbt. There you typically start sbt once and then run test -> test -> test, i.e. running the tests multiple times without shutting down the underlying JVM in between. The current implementation then keeps multiple versions of preparers running in the background, since the daemon thread is never killed. I am not too familiar with the internals here, so there might be more to clean up, but I think this would work.
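From the caller's side, the proposed change would allow a try-with-resources pattern per test run. The class below is only an illustrative stand-in for what an AutoCloseable PreparedDbProvider might look like; the real provider's API may differ:

```java
// Hypothetical stand-in for an AutoCloseable PreparedDbProvider: with
// try-with-resources, background resources are released at the end of each
// test run instead of living until JVM shutdown.
public class CloseableProvider implements AutoCloseable {

    private boolean closed = false;

    public String createDatabase() {
        if (closed) throw new IllegalStateException("provider is closed");
        return "jdbc:postgresql://localhost/test"; // placeholder connection URL
    }

    @Override
    public void close() {
        // In the real change this would stop PrepPipeline's background thread
        // and close the underlying EmbeddedPostgres instance.
        closed = true;
    }

    public boolean isClosed() {
        return closed;
    }
}
```

Usage under an interactive build tool would then look like: try (CloseableProvider p = new CloseableProvider()) { /* run tests against p.createDatabase() */ }, guaranteeing cleanup even when the JVM keeps running.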