spring-projects / spring-boot

Spring Boot
https://spring.io/projects/spring-boot
Apache License 2.0
73.56k stars 40.31k forks source link

Simplify JPA DDL properties and auto-configuration #40177

Open nateha1984 opened 2 months ago

nateha1984 commented 2 months ago

In Spring Boot 2, having thespring.jpa.generate-ddl property set to true did not perform any updates as long as the corresponding hibernate properties (hibernate.hbm2ddl.auto) were not explicitly set. When updating to Spring Boot 3, the same does not apply; the schema update does take place and the Hibernate sql statement is printed to the logs (show-sql=true). Here's a git repo where the issue can be demonstrated. Steps to duplicate are in the README.

Hibernate Java Docs also indicate that if no value is provided, the default for hibernate.hbm2ddl.auto is none

https://github.com/nateha1984/ubiquitous-spork

quaff commented 2 months ago

https://docs.spring.io/spring-boot/docs/current/reference/html/data.html#data.sql.jpa-and-spring-data.creating-and-dropping

There is also a spring.jpa.generate-ddl flag, but it is not used if Hibernate auto-configuration is active, because the ddl-auto settings are more fine-grained.

quaff commented 2 months ago

https://docs.spring.io/spring-boot/docs/current/reference/html/data.html#data.sql.jpa-and-spring-data.creating-and-dropping

There is also a spring.jpa.generate-ddl flag, but it is not used if Hibernate auto-configuration is active, because the ddl-auto settings are more fine-grained.

I think the current behavior should be improved, I've created https://github.com/spring-projects/spring-boot/pull/40185

quaff commented 2 months ago

https://docs.spring.io/spring-boot/docs/current/reference/html/data.html#data.sql.jpa-and-spring-data.creating-and-dropping

There is also a spring.jpa.generate-ddl flag, but it is not used if Hibernate auto-configuration is active, because the ddl-auto settings are more fine-grained.

I think the current behavior should be improved, I've created #40185

After further investigation, I realize the behavior is correct but the document is wrong.

quaff commented 2 months ago

In Spring Boot 2, having thespring.jpa.generate-ddl property set to true did not perform any updates as long as the corresponding hibernate properties (hibernate.hbm2ddl.auto) were not explicitly set.

@nateha1984 I think it's a bug of Spring Boot 2, and Spring Boot 3 fixed it.

nateha1984 commented 2 months ago

That may be technically correct, but this "bug" has been there for years, with behavior that matched the documentation, and it can now cause unexpected data manipulation if there is a difference in the db shema and entity mappings.

For example, a client had their id field marked as an int type in the entity but their db column in Postgres was a bigint. This mapping inconsistency was there for several years, and they had left the generate-ddl set to true the whole time as well with no issues. When we upgraded to Spring Boot 3, we suddenly saw the datatype change. Granted, this flag shouldn't have set in production and the type was incorrect; however, the behavior was matching documentation so there was no reason to suspect this migration would happen. With a large table, this type migration can lock up a table for a long period of time, and bring incoming transactions to a halt.

I feel like the fix here should be made to match Spring Boot 2 behavior, not just update the documentation. In addition, there is another section it explicitly says with Hibernate, none is the default if the datasource is not embedded (Postgres is not embedded, of course):

You can set spring.jpa.hibernate.ddl-auto explicitly to one of the standard Hibernate property values which are none, validate, update, create, and create-drop. Spring Boot chooses a default value for you based on whether it thinks your database is embedded. It defaults to create-drop if no schema manager has been detected or none in all other cases.

https://docs.spring.io/spring-boot/docs/current/reference/html/howto.html#howto.data-initialization

Debugging showed that the strategy was set to update, not create-drop as well.

quaff commented 2 months ago

For example, a client had their id field marked as an int type in the entity but their db column in Postgres was a bigint. This mapping inconsistency was there for several years, and they had left the generate-ddl set to true the whole time as well with no issues. When we upgraded to Spring Boot 3, we suddenly saw the datatype change.

Hibernate will only add new columns if the strategy is update, it doesn't alter and delete columns.

nateha1984 commented 2 months ago

For example, a client had their id field marked as an int type in the entity but their db column in Postgres was a bigint. This mapping inconsistency was there for several years, and they had left the generate-ddl set to true the whole time as well with no issues. When we upgraded to Spring Boot 3, we suddenly saw the datatype change.

Hibernate will only add new columns if the strategy is update, it doesn't alter and delete columns.

I'm sorry, but that's simply not correct. Debugger shows Hibernate config setting this property to update:

Debug in org.hibernate.tool.schema.spi.SchemaManagementToolCoordinator, line 81. Check databaseAction HashMap and configurationValues HashMap:

Screenshot 2024-04-09 at 12 58 01 PM

Screenshot 2024-04-09 at 12 55 56 PM

Logging output:

Screenshot 2024-04-09 at 1 00 54 PM

Please check the provided sample project. This happens every single time

quaff commented 2 months ago

Hibernate will only add new columns if the strategy is update, it doesn't alter and delete columns.

Sorry, Hibernate will alter column datatype since 6.2.0, see https://hibernate.atlassian.net/issues/HHH-15870.

nateha1984 commented 2 months ago

Regardless, this is potentially destructive functionality that has essentially been defaulted to off for at least 10 years and across two major versions before suddenly changing without warning. It has always defaulted to no action on non-embedded databases, and matched your documentation, since at least 1.0.0.RELEASE: https://docs.spring.io/spring-boot/docs/1.0.0.RELEASE/reference/html/howto-database-initialization.html#howto-initialize-a-database-using-hibernate

Having this functionality change to be defaulted to on, without warning, intentional or not, leaves users open to the possibility of all sorts of unintended consequences. If you want to change the functionality on this, that's fine, but there should be ample warning for such a change and time for users to adjust their settings and prevent any potential issues.

Calling this a documentation defect is at best disingenuous. Please reconsider this "fix" and revert the behavior back to previous functionality.

wilkinsona commented 2 months ago

@nateha1984 please be aware that @quaff isn't a member of the core Spring Boot team. They're an active member of the community and regular contributor that's trying to help you. Describing their help as "at best disingenuous" is unwelcome. By all means disagree with someone's opinion, but please do not bring their intentions and sincerity into question when you do so.

Someone on the core team will look at this in detail when we have time. In the meantime, thank you for your patience and thank you, @quaff, for your input thus far.

nateha1984 commented 2 months ago

@quaff I apologize, this has been a frustrating issue for us to deal with us but I should have chosen my words more carefully. I appreciate you quickly taking up the issue to look into

wilkinsona commented 2 months ago

I've spent some time in the debugger where I have learned the following:

With Spring Boot 2.7.8, an org.hibernate.jpa.boot.internal.EntityManagerFactoryBuilderImpl is created with the following properties:

hibernate.transaction.jta.platform=org.hibernate.engine.transaction.jta.platform.internal.NoJtaPlatform@706eab5d
hibernate.hbm2ddl.auto=update
hibernate.id.new_generator_mappings=true
hibernate.physical_naming_strategy=org.hibernate.boot.model.naming.CamelCaseToUnderscoresNamingStrategy
hibernate.resource.beans.container=org.springframework.orm.hibernate5.SpringBeanContainer@72725ee1
hibernate.connection.handling_mode=DELAYED_ACQUISITION_AND_HOLD
hibernate.implicit_naming_strategy=org.springframework.boot.orm.jpa.hibernate.SpringImplicitNamingStrategy
hibernate.archive.scanner=org.hibernate.boot.archive.scan.internal.DisabledScanner
hibernate.show_sql=true

The persistence unit is an instance of org.springframework.orm.jpa.vendor.SpringHibernateJpaPersistenceProvider$1.

After merging the properties with those of the persistence unit, the same nine properties are present:

hibernate.transaction.jta.platform=org.hibernate.engine.transaction.jta.platform.internal.NoJtaPlatform@706eab5d
hibernate.hbm2ddl.auto=update
hibernate.id.new_generator_mappings=true
hibernate.physical_naming_strategy=org.hibernate.boot.model.naming.CamelCaseToUnderscoresNamingStrategy
hibernate.resource.beans.container=org.springframework.orm.hibernate5.SpringBeanContainer@72725ee1
hibernate.connection.handling_mode=DELAYED_ACQUISITION_AND_HOLD
hibernate.implicit_naming_strategy=org.springframework.boot.orm.jpa.hibernate.SpringImplicitNamingStrategy
hibernate.archive.scanner=org.hibernate.boot.archive.scan.internal.DisabledScanner
hibernate.show_sql=true

SchemaManagementToolCoordinator is called and table alteration is attempted but the class that's involved, Table.sqlAlterStrings is limited to adding columns and will not alter a table to change the type of an existing column. In this case that means that no change is made. This can be seen by running with logging.level.org.hibernate=DEBUG:

2024-04-23 09:46:20.918 DEBUG 48934 --- [           main] org.hibernate.mapping.Table              : No alter strings for table : my_table

On the other hand, if we update MyEntity to add a String column named anotherName, we can see from the logs that the table is altered to add the missing column:

2024-04-23 09:50:52.113 DEBUG 51881 --- [           main] org.hibernate.SQL                        : alter table if exists my_table add column another_name varchar(255)
Hibernate: alter table if exists my_table add column another_name varchar(255)

If we recreate the database (docker compose down && docker compose up) and upgrade to Spring Boot 3.2.4 we can observe a difference in behavior. SchemaManagementToolCoordinator is called and table alteration is attempted as before. This time, however, alteration of the existing column is performed. This is due to an enhancement made in Hibernate 6.2. The following is logged:

2024-04-23T10:04:22.398+01:00 DEBUG 53534 --- [           main] org.hibernate.SQL                        : alter table if exists my_table alter column id set data type bigint
Hibernate: alter table if exists my_table alter column id set data type bigint
wilkinsona commented 2 months ago

In Spring Boot 2, having the spring.jpa.generate-ddl property set to true did not perform any updates as long as the corresponding hibernate properties (hibernate.hbm2ddl.auto) were not explicitly set.

My findings above show that this isn't accurate. Updates are performed as long as Hibernate supports them. It will create tables and add columns but it won't alter existing columns unless you're using Hibernate 6.2 and later.

The 2.7.x docs contain the following statement about the behaviour or spring.jpa.generate-ddl:

There is also a spring.jpa.generate-ddl flag, but it is not used if Hibernate auto-configuration is active, because the ddl-auto settings are more fine-grained.

My findings above show that this isn't accurate. The supplied sample sets spring.jpa.generate-ddl to true, uses the Hibernate auto-configuration, and will perform database updates on start up if Hibernate supports them (for example, it will add new columns).

As far as I can tell, Spring Boot's behavior hasn't changed here. The documentation is inaccurate, but that inaccuracy exists in 2.7.x and 3.x. This inaccuracy combined with a new feature in Hibernate 6.2 means that upgrading to Spring Boot 3.1.x or later can result in schema updates that would not have happened previously.

The release notes for Spring Boot 3.1 contain a small section about Hibernate 6.2:

Spring Boot 3.1 upgrades to Hibernate 6.2. Please refer to the Hibernate 6.2 migration guide to learn about how this may affect your application.

Searching the linked migration guide for alter, it would appear that it makes no mention of the changes made for https://hibernate.atlassian.net/issues/HHH-15870 which is unfortunate particularly as a comment on the issue notes that "we need to have a discussion about whether this is really something that should be turned on by default. (It’s a big change to existing behaviour!)".

Some possible actions to take:

  1. Update the documentation as @quaff has proposed in https://github.com/spring-projects/spring-boot/pull/40185
  2. Update our release notes to highlight the change in Hibernate's behavior
  3. Open a Hibernate issue suggesting an update to their 6.2 migration guide

Beyond these three steps, I don't think there's anything that we could or should do here. As far as I can tell, the behaviour of Spring Boot itself has remained consistent in terms of the configuration that's passed into Hibernate. The change that we have seen has come from a new feature in Hibernate 6.2 that's enabled by default and cannot be disabled with a configuration setting.

@nateha1984 something I have learned while stepping through this is that you could disable Hibernate 6.2's new feature using a custom dialect sub-class. If you extend org.hibernate.dialect.PostgreSQLDialect, override supportsAlterColumnType() to return false and configure Hibernate to use this custom dialect you should see the same schema update behavior as you saw with Hibernate 6.1 and earlier.

nateha1984 commented 2 months ago

Ok, thanks for the info @wilkinsona, I guess there's nothing to be done. Very disappointed that this wasn't called out by Hibernate and it's not something Spring can easily disable given how destructive this can be. Apologies again to @quaff for my rash comments prior

wilkinsona commented 2 months ago

Looking at this some more, it's not as simple as a documentation fix. We also have a problem with the handling of spring.jpa.hibernate.ddl-auto. Setting it to none when spring.jpa.generate-ddl is also set to true results in updates to the schema being made. This happens because a value of none results in the hibernate.hbm2ddl.auto entry being removed from a map of properties:

https://github.com/spring-projects/spring-boot/blob/6e37567a9d9932fdecd8ace18bb28a570e0827a9/spring-boot-project/spring-boot-autoconfigure/src/main/java/org/springframework/boot/autoconfigure/orm/jpa/HibernateProperties.java#L88-L93

This is done before HibernateJpaVendorAdapter maps the true value of the generate-ddl flag to setting hibernate.hbm2ddl.auto to update so we end up with update instead of none. The problem does not occur if spring.jpa.properties.hibernate.hbm2ddl.auto is used instead of spring.jpa.hibernate.ddl-auto.

We're paying a heavy price for having three different ways of configuring the same thing.

We used to set it to none but that was changed way back in 1.1 as it results in a warning that none was an unrecognised value. That warning no longer appears to be a problem. It appears to have been addressed in Hibernate 5.1.

wilkinsona commented 2 months ago

We've opened #40503 for some documentation improvements. We're going to hold off on any behavior changes until 3.4. We'll use this issue for those.