spring-projects / spring-framework

Spring Framework
https://spring.io/projects/spring-framework
Apache License 2.0
56.47k stars 38.09k forks source link

Error with checkpoint on refresh when using a remote data source #32378

Closed tjwatson closed 7 months ago

tjwatson commented 7 months ago

Apologize for the length of this write up.

Maybe related to #32299 but I believe I have the options all set correctly to do the checkpoint without having hibernate contact the remote database before checkpoint as desribed in https://github.com/spring-projects/spring-lifecycle-smoke-tests/tree/main/data/data-jpa.

Reproduction repo: https://github.com/tjwatson/crac-db-issue/tree/checkpointIssue

This is a modified version of https://github.com/OlgaMaciaszek/crac-db-sample

All scripts in here default to podman but you can pass docker in if you want to use Docker instead

First build the Spring Boot application with: ./mvnbuild.sh docker

A docker compose file docker-compose-checkpoint.yml sets up the run for doing the checkpoint in container. This docker compose file sets the following to attempt to avoid connecting to the remote database before checkpoint by setting the environment values:

      SPRING_JPA_PROPERTIES_HIBERNATE_TEMP_USE_JDBC_METADATA_DEFAULTS: 'false'
      SPRING_JPA_HIBERNATE_DDLAUTO: none
      SPRING_SQL_INIT_MODE: never

Initially this file has the config FLAG: '-r' commented out. This flag is used to enable doing checkpoint on refresh. This docker compose file also has the db service commented out to demonstrate what happens at container image build when there is no database. Run the following to do the checkpoint (if using podman you must run as root or with sudo):

./checkpoint.sh docker

The following output should show the spring logs from the checkpoint:

Doing checkpoint after app has started for 30 seconds

  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::                (v3.2.2)

2024-03-05T19:31:09.519Z  INFO 129 --- [context-demo] [           main] c.e.c.CracDbSampleApplication            : Starting CracDbSampleApplication v0.0.1-SNAPSHOT using Java 17.0.10 with PID 129 (/opt/app/crac-db-sample-0.0.1-SNAPSHOT.jar started by root in /)
2024-03-05T19:31:09.528Z  INFO 129 --- [context-demo] [           main] c.e.c.CracDbSampleApplication            : No active profile set, falling back to 1 default profile: "default"
2024-03-05T19:31:13.609Z  INFO 129 --- [context-demo] [           main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2024-03-05T19:31:14.121Z  INFO 129 --- [context-demo] [           main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 487 ms. Found 1 JPA repository interface.
2024-03-05T19:31:14.733Z  INFO 129 --- [context-demo] [           main] o.s.cloud.context.scope.GenericScope     : BeanFactory id=d9cd0c48-d530-30d4-bc90-e87b806e4370
2024-03-05T19:31:16.878Z  INFO 129 --- [context-demo] [           main] o.hibernate.jpa.internal.util.LogHelper  : HHH000204: Processing PersistenceUnitInfo [name: default]
2024-03-05T19:31:17.086Z  INFO 129 --- [context-demo] [           main] org.hibernate.Version                    : HHH000412: Hibernate ORM core version 6.4.1.Final
2024-03-05T19:31:17.257Z  INFO 129 --- [context-demo] [           main] o.h.c.internal.RegionFactoryInitiator    : HHH000026: Second-level cache disabled
2024-03-05T19:31:18.145Z  INFO 129 --- [context-demo] [           main] o.s.o.j.p.SpringPersistenceUnitInfo      : No LoadTimeWeaver setup: ignoring JPA class transformer
2024-03-05T19:31:18.243Z  INFO 129 --- [context-demo] [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Starting...
2024-03-05T19:31:19.452Z ERROR 129 --- [context-demo] [           main] com.zaxxer.hikari.pool.HikariPool        : HikariPool-1 - Exception during pool initialization.

org.postgresql.util.PSQLException: The connection attempt failed.
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:354) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:54) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:263) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.Driver.makeConnection(Driver.java:443) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.Driver.connect(Driver.java:297) ~[postgresql-42.6.0.jar!/:42.6.0]
...
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1343) ~[spring-boot-3.2.2.jar!/:3.2.2]
    at com.example.cracdbsample.CracDbSampleApplication.main(CracDbSampleApplication.java:16) ~[!/:0.0.1-SNAPSHOT]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[na:na]
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
    at java.base/java.lang.reflect.Method.invoke(Method.java:568) ~[na:na]
    at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:91) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
    at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:53) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
    at org.springframework.boot.loader.launch.JarLauncher.main(JarLauncher.java:58) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
Caused by: java.net.UnknownHostException: db
    at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:578) ~[na:na]
    at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327) ~[na:na]
    at java.base/java.net.Socket.connect(Socket.java:641) ~[na:na]
    at org.postgresql.core.PGStream.createSocket(PGStream.java:243) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.core.PGStream.<init>(PGStream.java:98) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:132) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:258) ~[postgresql-42.6.0.jar!/:42.6.0]
    ... 71 common frames omitted

2024-03-05T19:31:19.485Z  WARN 129 --- [context-demo] [           main] o.h.e.j.e.i.JdbcEnvironmentInitiator     : HHH000342: Could not obtain connection to query metadata

java.lang.NullPointerException: Cannot invoke "org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(java.sql.SQLException, String)" because the return value of "org.hibernate.resource.transaction.backend.jdbc.internal.JdbcIsolationDelegate.sqlExceptionHelper()" is null
    at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcIsolationDelegate.delegateWork(JdbcIsolationDelegate.java:116) ~[hibernate-core-6.4.1.Final.jar!/:6.4.1.Final]
    at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.getJdbcEnvironmentUsingJdbcMetadata(JdbcEnvironmentInitiator.java:276) ~[hibernate-core-6.4.1.Final.jar!/:6.4.1.Final]
    at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.initiateService(JdbcEnvironmentInitiator.java:107) ~[hibernate-core-6.4.1.Final.jar!/:6.4.1.Final]
    at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.initiateService(JdbcEnvironmentInitiator.java:68) ~[hibernate-core-6.4.1.Final.jar!/:6.4.1.Final]
...
    at com.example.cracdbsample.CracDbSampleApplication.main(CracDbSampleApplication.java:16) ~[!/:0.0.1-SNAPSHOT]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[na:na]
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
    at java.base/java.lang.reflect.Method.invoke(Method.java:568) ~[na:na]
    at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:91) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
    at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:53) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
    at org.springframework.boot.loader.launch.JarLauncher.main(JarLauncher.java:58) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]

2024-03-05T19:31:19.552Z  WARN 129 --- [context-demo] [           main] org.hibernate.orm.deprecation            : HHH90000025: PostgreSQLDialect does not need to be specified explicitly using 'hibernate.dialect' (remove the property setting and it will be selected by default)
2024-03-05T19:31:22.144Z  INFO 129 --- [context-demo] [           main] o.h.e.t.j.p.i.JtaPlatformInitiator       : HHH000489: No JTA platform available (set 'hibernate.transaction.jta.platform' to enable JTA platform integration)
2024-03-05T19:31:22.157Z  INFO 129 --- [context-demo] [           main] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'
2024-03-05T19:31:24.659Z  INFO 129 --- [context-demo] [           main] o.s.b.a.e.web.EndpointLinksResolver      : Exposing 2 endpoint(s) beneath base path '/actuator'
2024-03-05T19:31:26.204Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Registering JVM checkpoint/restore callback for Spring-managed lifecycle beans
2024-03-05T19:31:26.746Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase -2147483647
2024-03-05T19:31:26.748Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Successfully started bean 'springBootLoggingLifecycle'
2024-03-05T19:31:26.749Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase 2147481599
2024-03-05T19:31:27.145Z  INFO 129 --- [context-demo] [           main] o.s.b.web.embedded.netty.NettyWebServer  : Netty started on port 8080
2024-03-05T19:31:27.155Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Successfully started bean 'webServerStartStop'
2024-03-05T19:31:27.155Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase 2147482623
2024-03-05T19:31:27.156Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Successfully started bean 'webServerGracefulShutdown'
2024-03-05T19:31:27.156Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase 2147483647
2024-03-05T19:31:27.194Z  INFO 129 --- [context-demo] [           main] c.e.c.CracDbSampleApplication            : Started CracDbSampleApplication in 19.647 seconds (process running for 21.461)
Not creating users.
129:
2024-03-05T19:31:36.336Z  INFO 129 --- [context-demo] [Attach Listener] jdk.crac                                 : Starting checkpoint
2024-03-05T19:31:36.345Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Stopping Spring-managed lifecycle beans before JVM checkpoint
2024-03-05T19:31:36.351Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Stopping beans in phase 2147483647
2024-03-05T19:31:36.353Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Bean 'applicationTaskExecutor' completed its stop procedure
2024-03-05T19:31:36.354Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Stopping beans in phase 2147482623
2024-03-05T19:31:36.357Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Bean 'webServerGracefulShutdown' completed its stop procedure
2024-03-05T19:31:36.357Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Stopping beans in phase 2147481599
2024-03-05T19:31:36.367Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Bean 'webServerStartStop' completed its stop procedure
2024-03-05T19:31:36.369Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Stopping beans in phase 0
2024-03-05T19:31:38.412Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Successfully stopped bean 'reactorResourceFactory'
2024-03-05T19:31:38.414Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Successfully stopped bean 'refreshScopeLifecycle'
2024-03-05T19:31:38.415Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Stopping beans in phase -2147483647
2024-03-05T19:31:38.416Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Bean 'springBootLoggingLifecycle' completed its stop procedure
CR: Checkpoint ...

Notice that there are errors trying to connect to db because there is no database running in the deployment during checkpoint. But the application proceeds to start anyway because the application.yml has set:

spring:
  datasource:
    continue-on-error: true

This does appear to restore fine if you run the ./restore-start.sh docker command. This uses a separate docker compose file docker-compose-restore.yml which does enable the db service for the application to connect to. Running this script you see output like this:

+ '[' 1 -lt 1 ']'
+ CONTAINER_TOOL=docker
+ echo 'Using docker'
Using docker
+ docker-compose -f docker-compose-restore.yml up -d
Creating network "crac-db-issue_default" with the default driver
Creating crac-db-sample-restore ... done
Creating db                     ... done
+ docker logs -f --tail -1 crac-db-sample-restore
2024-03-05T19:52:09.164Z  INFO 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Restarting Spring-managed lifecycle beans after JVM restore
2024-03-05T19:52:09.195Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase -2147483647
2024-03-05T19:52:09.197Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Successfully started bean 'springBootLoggingLifecycle'
2024-03-05T19:52:09.197Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase 0
2024-03-05T19:52:09.200Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Successfully started bean 'reactorResourceFactory'
2024-03-05T19:52:09.202Z  INFO 129 --- [context-demo] [Attach Listener] o.s.c.c.refresh.RefreshScopeLifecycle    : Refreshing context on restart.
2024-03-05T19:52:10.070Z  INFO 129 --- [context-demo] [Attach Listener] o.s.c.c.refresh.RefreshScopeLifecycle    : Refreshed keys: [spring.cloud.client.hostname, spring.cloud.client.ip-address]
2024-03-05T19:52:10.072Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Successfully started bean 'refreshScopeLifecycle'
2024-03-05T19:52:10.074Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase 2147481599
2024-03-05T19:52:10.100Z  INFO 129 --- [context-demo] [Attach Listener] o.s.b.web.embedded.netty.NettyWebServer  : Netty started on port 8080
2024-03-05T19:52:10.115Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Successfully started bean 'webServerStartStop'
2024-03-05T19:52:10.115Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase 2147482623
2024-03-05T19:52:10.116Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Successfully started bean 'webServerGracefulShutdown'
2024-03-05T19:52:10.116Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase 2147483647
2024-03-05T19:52:10.118Z DEBUG 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Successfully started bean 'applicationTaskExecutor'
2024-03-05T19:52:10.130Z  INFO 129 --- [context-demo] [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Spring-managed lifecycle restart completed (restored JVM running for 1253 ms)

And this does appear to successfully connect to the db service if you browse http://localhost:8080/users (although the db service doesn't have an initialized database so you get errors like ERROR: relation "users" does not exist but that is expected. Be sure to run ./restore-stop.sh docker to stop the services for the restore run.

To try and avoid the connection to the database before the checkpoint there is the FLAG option that can be enabled in docker-compose-checkpoint.yml by uncommenting the line with FLAG: '-r'. If you then run ./checkpoint.sh docker again you will see the following:

Doing checkpoint onRefresh

  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::                (v3.2.2)

2024-03-05T20:03:39.490Z  INFO 129 --- [context-demo] [           main] c.e.c.CracDbSampleApplication            : Starting CracDbSampleApplication v0.0.1-SNAPSHOT using Java 17.0.10 with PID 129 (/opt/app/crac-db-sample-0.0.1-SNAPSHOT.jar started by root in /)
2024-03-05T20:03:39.501Z  INFO 129 --- [context-demo] [           main] c.e.c.CracDbSampleApplication            : No active profile set, falling back to 1 default profile: "default"
2024-03-05T20:03:43.911Z  INFO 129 --- [context-demo] [           main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2024-03-05T20:03:44.514Z  INFO 129 --- [context-demo] [           main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 558 ms. Found 1 JPA repository interface.
2024-03-05T20:03:45.300Z  INFO 129 --- [context-demo] [           main] o.s.cloud.context.scope.GenericScope     : BeanFactory id=d9cd0c48-d530-30d4-bc90-e87b806e4370
2024-03-05T20:03:47.418Z  INFO 129 --- [context-demo] [           main] o.hibernate.jpa.internal.util.LogHelper  : HHH000204: Processing PersistenceUnitInfo [name: default]
2024-03-05T20:03:47.588Z  INFO 129 --- [context-demo] [           main] org.hibernate.Version                    : HHH000412: Hibernate ORM core version 6.4.1.Final
2024-03-05T20:03:47.711Z  INFO 129 --- [context-demo] [           main] o.h.c.internal.RegionFactoryInitiator    : HHH000026: Second-level cache disabled
2024-03-05T20:03:48.495Z  INFO 129 --- [context-demo] [           main] o.s.o.j.p.SpringPersistenceUnitInfo      : No LoadTimeWeaver setup: ignoring JPA class transformer
2024-03-05T20:03:48.564Z  INFO 129 --- [context-demo] [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Starting...
2024-03-05T20:03:49.736Z ERROR 129 --- [context-demo] [           main] com.zaxxer.hikari.pool.HikariPool        : HikariPool-1 - Exception during pool initialization.

org.postgresql.util.PSQLException: The connection attempt failed.
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:354) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:54) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:263) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.Driver.makeConnection(Driver.java:443) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.Driver.connect(Driver.java:297) ~[postgresql-42.6.0.jar!/:42.6.0]
...
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1343) ~[spring-boot-3.2.2.jar!/:3.2.2]
    at com.example.cracdbsample.CracDbSampleApplication.main(CracDbSampleApplication.java:16) ~[!/:0.0.1-SNAPSHOT]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[na:na]
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
    at java.base/java.lang.reflect.Method.invoke(Method.java:568) ~[na:na]
    at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:91) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
    at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:53) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
    at org.springframework.boot.loader.launch.JarLauncher.main(JarLauncher.java:58) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
Caused by: java.net.UnknownHostException: db
    at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:578) ~[na:na]
    at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327) ~[na:na]
    at java.base/java.net.Socket.connect(Socket.java:641) ~[na:na]
    at org.postgresql.core.PGStream.createSocket(PGStream.java:243) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.core.PGStream.<init>(PGStream.java:98) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:132) ~[postgresql-42.6.0.jar!/:42.6.0]
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:258) ~[postgresql-42.6.0.jar!/:42.6.0]
    ... 71 common frames omitted

2024-03-05T20:03:49.780Z  WARN 129 --- [context-demo] [           main] o.h.e.j.e.i.JdbcEnvironmentInitiator     : HHH000342: Could not obtain connection to query metadata

java.lang.NullPointerException: Cannot invoke "org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(java.sql.SQLException, String)" because the return value of "org.hibernate.resource.transaction.backend.jdbc.internal.JdbcIsolationDelegate.sqlExceptionHelper()" is null
    at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcIsolationDelegate.delegateWork(JdbcIsolationDelegate.java:116) ~[hibernate-core-6.4.1.Final.jar!/:6.4.1.Final]
    at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.getJdbcEnvironmentUsingJdbcMetadata(JdbcEnvironmentInitiator.java:276) ~[hibernate-core-6.4.1.Final.jar!/:6.4.1.Final]
    at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.initiateService(JdbcEnvironmentInitiator.java:107) ~[hibernate-core-6.4.1.Final.jar!/:6.4.1.Final]
    at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.initiateService(JdbcEnvironmentInitiator.java:68) ~[hibernate-core-6.4.1.Final.jar!/:6.4.1.Final]
...
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1343) ~[spring-boot-3.2.2.jar!/:3.2.2]
    at com.example.cracdbsample.CracDbSampleApplication.main(CracDbSampleApplication.java:16) ~[!/:0.0.1-SNAPSHOT]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[na:na]
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
    at java.base/java.lang.reflect.Method.invoke(Method.java:568) ~[na:na]
    at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:91) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
    at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:53) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
    at org.springframework.boot.loader.launch.JarLauncher.main(JarLauncher.java:58) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]

2024-03-05T20:03:49.846Z  WARN 129 --- [context-demo] [           main] org.hibernate.orm.deprecation            : HHH90000025: PostgreSQLDialect does not need to be specified explicitly using 'hibernate.dialect' (remove the property setting and it will be selected by default)
2024-03-05T20:03:51.902Z  INFO 129 --- [context-demo] [           main] o.h.e.t.j.p.i.JtaPlatformInitiator       : HHH000489: No JTA platform available (set 'hibernate.transaction.jta.platform' to enable JTA platform integration)
2024-03-05T20:03:51.915Z  INFO 129 --- [context-demo] [           main] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'
2024-03-05T20:03:54.285Z  INFO 129 --- [context-demo] [           main] o.s.b.a.e.web.EndpointLinksResolver      : Exposing 2 endpoint(s) beneath base path '/actuator'
2024-03-05T20:03:55.788Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Registering JVM checkpoint/restore callback for Spring-managed lifecycle beans
2024-03-05T20:03:56.494Z  INFO 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Triggering JVM checkpoint/restore
2024-03-05T20:03:56.499Z  INFO 129 --- [context-demo] [           main] jdk.crac                                 : Starting checkpoint
2024-03-05T20:03:56.507Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Stopping Spring-managed lifecycle beans before JVM checkpoint
2024-03-05T20:03:56.511Z  INFO 129 --- [context-demo] [           main] jdk.crac                                 : /opt/app/crac-db-sample-0.0.1-SNAPSHOT.jar is recorded as always available on restore
CR: Checkpoint ...

Notice it is still trying to connect to the database even when doing checkpoint on refresh. Then run ./restore-start.sh docker to restore from this checkpoint:

./restore-start.sh docker
+ '[' 1 -lt 1 ']'
+ CONTAINER_TOOL=docker
+ echo 'Using docker'
Using docker
+ docker-compose -f docker-compose-restore.yml up -d
Creating network "crac-db-issue_default" with the default driver
Creating crac-db-sample-restore ... done
Creating db                     ... done
+ docker logs -f --tail -1 crac-db-sample-restore
2024-03-05T20:05:33.858Z  INFO 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Restarting Spring-managed lifecycle beans after JVM restore
2024-03-05T20:05:33.978Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase -2147483647
2024-03-05T20:05:33.982Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Successfully started bean 'springBootLoggingLifecycle'
2024-03-05T20:05:33.983Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase 2147481599
2024-03-05T20:05:34.589Z  INFO 129 --- [context-demo] [           main] o.s.b.web.embedded.netty.NettyWebServer  : Netty started on port 8080
2024-03-05T20:05:34.643Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Successfully started bean 'webServerStartStop'
2024-03-05T20:05:34.644Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase 2147482623
2024-03-05T20:05:34.644Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Successfully started bean 'webServerGracefulShutdown'
2024-03-05T20:05:34.644Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase 2147483647
2024-03-05T20:05:34.715Z  INFO 129 --- [context-demo] [           main] c.e.c.CracDbSampleApplication            : Restored CracDbSampleApplication in 1.133 seconds (process running for 1.141)
Not creating users.

This does appear to restore, but I think the error that happened at checkpoint time prevents the restore from properly configuring the datasource because browsing http://localhost:8080/users now gets: org.postgresql.util.PSQLException: FATAL: password authentication failed for user "checkpointuser" which indicates it is using the user credentials from the checkpoint and not the restore environment.

Finally, uncomment the db service in docker-compose-checkpoint.yml and run ./checkpoint.sh docker again. You will see this error:

Doing checkpoint onRefresh

  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::                (v3.2.2)

2024-03-05T20:12:31.742Z  INFO 129 --- [context-demo] [           main] c.e.c.CracDbSampleApplication            : Starting CracDbSampleApplication v0.0.1-SNAPSHOT using Java 17.0.10 with PID 129 (/opt/app/crac-db-sample-0.0.1-SNAPSHOT.jar started by root in /)
2024-03-05T20:12:31.761Z  INFO 129 --- [context-demo] [           main] c.e.c.CracDbSampleApplication            : No active profile set, falling back to 1 default profile: "default"
2024-03-05T20:12:36.188Z  INFO 129 --- [context-demo] [           main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2024-03-05T20:12:36.843Z  INFO 129 --- [context-demo] [           main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 626 ms. Found 1 JPA repository interface.
2024-03-05T20:12:37.541Z  INFO 129 --- [context-demo] [           main] o.s.cloud.context.scope.GenericScope     : BeanFactory id=d9cd0c48-d530-30d4-bc90-e87b806e4370
2024-03-05T20:12:39.496Z  INFO 129 --- [context-demo] [           main] o.hibernate.jpa.internal.util.LogHelper  : HHH000204: Processing PersistenceUnitInfo [name: default]
2024-03-05T20:12:39.645Z  INFO 129 --- [context-demo] [           main] org.hibernate.Version                    : HHH000412: Hibernate ORM core version 6.4.1.Final
2024-03-05T20:12:39.729Z  INFO 129 --- [context-demo] [           main] o.h.c.internal.RegionFactoryInitiator    : HHH000026: Second-level cache disabled
2024-03-05T20:12:40.334Z  INFO 129 --- [context-demo] [           main] o.s.o.j.p.SpringPersistenceUnitInfo      : No LoadTimeWeaver setup: ignoring JPA class transformer
2024-03-05T20:12:40.407Z  INFO 129 --- [context-demo] [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Starting...
2024-03-05T20:12:41.058Z  INFO 129 --- [context-demo] [           main] com.zaxxer.hikari.pool.HikariPool        : HikariPool-1 - Added connection org.postgresql.jdbc.PgConnection@46894dc5
2024-03-05T20:12:41.066Z  INFO 129 --- [context-demo] [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Start completed.
2024-03-05T20:12:41.158Z  WARN 129 --- [context-demo] [           main] org.hibernate.orm.deprecation            : HHH90000025: PostgreSQLDialect does not need to be specified explicitly using 'hibernate.dialect' (remove the property setting and it will be selected by default)
2024-03-05T20:12:43.320Z  INFO 129 --- [context-demo] [           main] o.h.e.t.j.p.i.JtaPlatformInitiator       : HHH000489: No JTA platform available (set 'hibernate.transaction.jta.platform' to enable JTA platform integration)
2024-03-05T20:12:43.330Z  INFO 129 --- [context-demo] [           main] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'
2024-03-05T20:12:45.352Z  INFO 129 --- [context-demo] [           main] o.s.b.a.e.web.EndpointLinksResolver      : Exposing 2 endpoint(s) beneath base path '/actuator'
2024-03-05T20:12:46.517Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Registering JVM checkpoint/restore callback for Spring-managed lifecycle beans
2024-03-05T20:12:47.018Z  INFO 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Triggering JVM checkpoint/restore
2024-03-05T20:12:47.024Z  INFO 129 --- [context-demo] [           main] jdk.crac                                 : Starting checkpoint
2024-03-05T20:12:47.032Z DEBUG 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Stopping Spring-managed lifecycle beans before JVM checkpoint
2024-03-05T20:12:47.035Z  INFO 129 --- [context-demo] [           main] jdk.crac                                 : /opt/app/crac-db-sample-0.0.1-SNAPSHOT.jar is recorded as always available on restore
2024-03-05T20:12:47.182Z  INFO 129 --- [context-demo] [           main] jdk.crac                                 : Use -Djdk.crac.collect-fd-stacktraces=true to find the source.
2024-03-05T20:12:47.344Z  INFO 129 --- [context-demo] [           main] o.s.c.support.DefaultLifecycleProcessor  : Restarting Spring-managed lifecycle beans after JVM restore
2024-03-05T20:12:47.346Z  WARN 129 --- [context-demo] [           main] onfigReactiveWebServerApplicationContext : Exception encountered during context initialization - cancelling refresh attempt: org.springframework.context.ApplicationContextException: Failed to take CRaC checkpoint on refresh
2024-03-05T20:12:47.353Z  INFO 129 --- [context-demo] [           main] j.LocalContainerEntityManagerFactoryBean : Closing JPA EntityManagerFactory for persistence unit 'default'
2024-03-05T20:12:47.389Z  INFO 129 --- [context-demo] [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Shutdown initiated...
2024-03-05T20:12:47.401Z  INFO 129 --- [context-demo] [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Shutdown completed.
2024-03-05T20:12:47.462Z  INFO 129 --- [context-demo] [           main] .s.b.a.l.ConditionEvaluationReportLogger : 

Error starting ApplicationContext. To display the condition evaluation report re-run your application with 'debug' enabled.
2024-03-05T20:12:47.522Z ERROR 129 --- [context-demo] [           main] o.s.boot.SpringApplication               : Application run failed

org.springframework.context.ApplicationContextException: Failed to take CRaC checkpoint on refresh
    at org.springframework.context.support.DefaultLifecycleProcessor$CracDelegate.checkpointRestore(DefaultLifecycleProcessor.java:534) ~[spring-context-6.1.3.jar!/:6.1.3]
    at org.springframework.context.support.DefaultLifecycleProcessor.onRefresh(DefaultLifecycleProcessor.java:193) ~[spring-context-6.1.3.jar!/:6.1.3]
    at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:978) ~[spring-context-6.1.3.jar!/:6.1.3]
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:627) ~[spring-context-6.1.3.jar!/:6.1.3]
    at org.springframework.boot.web.reactive.context.ReactiveWebServerApplicationContext.refresh(ReactiveWebServerApplicationContext.java:66) ~[spring-boot-3.2.2.jar!/:3.2.2]
    at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:754) ~[spring-boot-3.2.2.jar!/:3.2.2]
    at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:456) ~[spring-boot-3.2.2.jar!/:3.2.2]
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:334) ~[spring-boot-3.2.2.jar!/:3.2.2]
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1354) ~[spring-boot-3.2.2.jar!/:3.2.2]
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1343) ~[spring-boot-3.2.2.jar!/:3.2.2]
    at com.example.cracdbsample.CracDbSampleApplication.main(CracDbSampleApplication.java:16) ~[!/:0.0.1-SNAPSHOT]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[na:na]
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
    at java.base/java.lang.reflect.Method.invoke(Method.java:568) ~[na:na]
    at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:91) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
    at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:53) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
    at org.springframework.boot.loader.launch.JarLauncher.main(JarLauncher.java:58) ~[crac-db-sample-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
Caused by: org.crac.CheckpointException: null
    at org.crac.Core$Compat.checkpointRestore(Core.java:144) ~[crac-1.4.0.jar!/:na]
    at org.crac.Core.checkpointRestore(Core.java:237) ~[crac-1.4.0.jar!/:na]
    at org.springframework.context.support.DefaultLifecycleProcessor$CracDelegate.checkpointRestore(DefaultLifecycleProcessor.java:528) ~[spring-context-6.1.3.jar!/:6.1.3]
    ... 17 common frames omitted
    Suppressed: jdk.internal.crac.mirror.impl.CheckpointOpenSocketException: Socket[addr=db/192.168.144.2,port=5432,localport=44302]
        at java.base/jdk.internal.crac.JDKSocketResourceBase.lambda$beforeCheckpoint$0(JDKSocketResourceBase.java:68) ~[na:na]
        at java.base/jdk.internal.crac.mirror.Core.checkpointRestore1(Core.java:169) ~[na:na]
        at java.base/jdk.internal.crac.mirror.Core.checkpointRestore(Core.java:286) ~[na:na]
        at java.base/jdk.internal.crac.mirror.Core.checkpointRestore(Core.java:265) ~[na:na]
        at jdk.crac/jdk.crac.Core.checkpointRestore(Core.java:72) ~[jdk.crac:na]
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[na:na]
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
        at java.base/java.lang.reflect.Method.invoke(Method.java:568) ~[na:na]
        at org.crac.Core$Compat.checkpointRestore(Core.java:141) ~[crac-1.4.0.jar!/:na]
        ... 19 common frames omitted
...

That indicates that we have open connections to the db service when trying to checkpoint.

tjwatson commented 7 months ago

I had to move the env config for SPRING_JPA_PROPERTIES_HIBERNATE_TEMP_USE_JDBC_METADATA_DEFAULTS to the application.yml to get correct behavior when not using the onRefresh checkpoint:

hibernate.temp.use_jdbc_metadata_defaults: false

I updated the branch with the changes. It is unclear why the SPRING_JPA_PROPERTIES_HIBERNATE_TEMP_USE_JDBC_METADATA_DEFAULTS env configuration isn't working. Perhaps the _ of the original property key causes issues when mapping to env keys?

This makes it work when not using onRefresh checkpoint. But if you change to using onRefresh checkpoint then the checkpoint appears to succeed without any connection to the data source. And the application does appear to restore successfully. But then when trying to connect to the /users endpoint we still see org.postgresql.util.PSQLException: FATAL: password authentication failed for user "checkpointuser" which indicates it is using the user credentials from the checkpoint and not the restore environment.

sdeleuze commented 7 months ago

Hi, thanks for this repro, but I am not sure there is an actionable item for us on Spring Framework side.

After fixing a typo (double __ here), the following steps seems to work correctly thanks to your fix on the hibernate property:

./mvnbuild.sh docker
./checkpoint.sh docker
./restore-start.sh docker

I am not sure how and if hibernate.temp.use_jdbc_metadata_defaults binding is supported to work as you seems to have followed the documented rules. May be worth an issue on Spring Boot side as this is a double special case (underscore in the property and coming from Hibernate directly).

The Canonical repository that @OlgaMaciaszek created for this use case is https://github.com/OlgaMaciaszek/crac-db-refresh-scope, you may be interested by the related documentation. And if there is an issue, that looks like more on Spring Cloud than Spring Framework.

As a consequence, I close this issue, but feel free to comment if you identify something we should fix and dig deeper on Spring Framework side.

tjwatson commented 6 months ago

@sdeleuze I needed to follow up with the expected behavior for when using the onRefresh checkpoint. When using onRefresh option I do not see the data source configuration options getting the values as configured from the resource side. It seems to keep using the config values as they were on the checkpoint side.

This seems unexpected. But I am not sure if that is an issue with Spring Cloud or something else in the spring framework that is expected to reflect the latest config from the restore side when using onRefresh so that components can get the right config when the context is refreshed on restore.

sdeleuze commented 6 months ago

It seems to keep using the config values as they were on the checkpoint side

From the Spring Framework perspective, this is expected even if I can undertsand you are surprised. Any post checkpoint configuration update needs to be handled differently, potentially using Spring Cloud.

tjwatson commented 6 months ago

Thanks. I ask because I am trying to understand the scenarios where onRefresh would be advised vs. doing a checkpoint after the application has fully started and letting the lifecycle beans and the other components from cloud refresh things like data sources as in the example at https://github.com/OlgaMaciaszek/crac-db-refresh-scope shows. But this is turning into more of a question answer session and likely should be discussed elsewhere. Thanks for your help.

davysoul commented 4 months ago

I got this erreur 👍 java.lang.NullPointerException: Cannot invoke "org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(java.sql.SQLException, String)" because the return value of "org.hibernate.resource.transaction.backend.jdbc.internal.JdbcIsolationDelegate.sqlExceptionHelper()" is null

yourengineerbro commented 3 months ago

@davysoul have you figured out some way to solve this

vitalioo commented 3 months ago

"java.lang.NullPointerException: Cannot invoke "org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(java.sql.SQLException, String)" ...... bla bla bla"

I had the same problem with this docker-compose.yml:

version: "3.9"

services: db: image: postgres:alpine restart: always env_file:

volumes: minio_storage: {}

I just replaced localhost in line ".. _URL" on db and everything worked: environment:

mryan43 commented 2 months ago

Also experiencing this... only when using oracle though, works fine with postgres

githubanhduong commented 2 months ago

i be this error because url or username or password wrong