GoogleCloudPlatform / spring-cloud-gcp

New home for Spring Cloud GCP development starting with version 2.0.
Apache License 2.0
Support for foreign key/joins in Spring Data Cloud Spanner #737

Open bowenjin opened 2 years ago

bowenjin commented 2 years ago

Is your feature request related to a problem? Please describe. The documentation for Spring Data for Cloud Spanner claims that foreign key constraints are not natively supported in Cloud Spanner. Yet according to this documentation, they are supported. While interleaving seems to be the ideal solution for one-to-many relationships, foreign keys are necessary for many-to-many relationships and sometimes for one-to-one. However, it seems the ORM mapping annotations for Spanner only support @Interleaved. What's the current suggested solution for mapping @OneToOne and @ManyToMany relationships?

Describe the solution you'd like. Adding @OneToOne, @OneToMany, @ManyToOne, and @ManyToMany mapping annotations to the com.google.cloud.spring.data.spanner.core.mapping package for cases where we want to use foreign keys rather than interleaving.

elefeint commented 2 years ago

Support for the full JPA specification is out of scope for the Spring Data Spanner module, but with Spanner introducing support for general-purpose foreign keys in March 2020, it may make sense to support a subset such as OneToOne and OneToMany.

I will leave this feature request open to assess community interest.

Current workarounds:

meltsufin commented 2 years ago

You're absolutely right that our documentation is wrong about Spanner not supporting foreign keys. It was written before foreign key support was added and has not been updated. We should fix it.

That being said, there is nothing preventing you from using foreign key constraints in your Spanner schema while still using Spring Data Spanner. One place we could add foreign key support is in our schema generation, but I'm not sure it would make sense without actually introducing new supported annotations like @OneToMany.
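To illustrate the point about declaring foreign keys directly in the schema, here is a minimal sketch of a many-to-many relationship modeled with a junction table. The table, column, and constraint names (Students, Courses, Enrollments) are hypothetical and do not come from this thread; the DDL is held in plain Java strings only so the example is self-contained, and how the statements are applied to a database is left out.

```java
import java.util.List;

// Sketch only: hypothetical tables showing a many-to-many relationship
// expressed with foreign key constraints in Cloud Spanner (GoogleSQL) DDL.
public class ManyToManySchemaSketch {

    static final String CREATE_STUDENTS = String.join("\n",
        "CREATE TABLE Students (",
        "  StudentId INT64 NOT NULL,",
        "  Name STRING(MAX)",
        ") PRIMARY KEY (StudentId)");

    static final String CREATE_COURSES = String.join("\n",
        "CREATE TABLE Courses (",
        "  CourseId INT64 NOT NULL,",
        "  Title STRING(MAX)",
        ") PRIMARY KEY (CourseId)");

    // Junction table: each row links one student to one course.
    // Both columns are constrained by a foreign key to their own table.
    static final String CREATE_ENROLLMENTS = String.join("\n",
        "CREATE TABLE Enrollments (",
        "  StudentId INT64 NOT NULL,",
        "  CourseId INT64 NOT NULL,",
        "  CONSTRAINT FK_Enrollments_Students FOREIGN KEY (StudentId) REFERENCES Students (StudentId),",
        "  CONSTRAINT FK_Enrollments_Courses FOREIGN KEY (CourseId) REFERENCES Courses (CourseId)",
        ") PRIMARY KEY (StudentId, CourseId)");

    // Referenced tables come first so the constraints can resolve.
    static List<String> ddlInOrder() {
        return List.of(CREATE_STUDENTS, CREATE_COURSES, CREATE_ENROLLMENTS);
    }

    public static void main(String[] args) {
        for (String statement : ddlInOrder()) {
            System.out.println(statement + ";\n");
        }
    }
}
```

With a schema like this, each of the three tables could still be mapped as an ordinary entity; the relationship itself would have to be resolved in application code, since no annotation describes it.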

As @elefeint says above, if you're looking for a JPA-like ORM, there is Hibernate support that allows you to use all of these annotations with Spring Data JPA + Spanner.

bowenjin commented 2 years ago

Ah thank you guys, the Hibernate dialect + Spring Data JPA seems like a suitable alternative. Though it would be awesome if Cloud Spanner Spring Data supported some annotations for foreign keys. Hope this feature request gets some interest!

mpeddada1 commented 2 years ago

Thanks for the responses and for filing the feature request! We will be leaving this issue open. Please feel free to give it a thumbs up if you would like to see this implemented.

2021H1030044G commented 2 years ago

Can I contribute to this issue?

elefeint commented 2 years ago

@2021H1030044G #874 is a better fit for a new contributor.

akstatic commented 1 year ago

I have a question related to the OneToOne relationship. The Spring Cloud documentation says: "While one-to-one and many-to-many relationships can be implemented in Cloud Spanner and Spring Data Cloud Spanner using constructs of interleaved parent-child tables, only the parent-child relationship is natively supported." Does that mean Spring Data Cloud Spanner advises using interleaved relationships for one-to-one relationships? Essentially a one-to-many collection, but with the child having only one entry?

meltsufin commented 1 year ago

I think the idea is that accessing interleaved tables is more efficient because they're collocated in the database.

cc: @olavloite

olavloite commented 1 year ago

> I think the idea is that accessing interleaved tables is more efficient because they're collocated in the database.

Correct:

  1. Rows in interleaved tables are physically co-located with their parent row.
  2. Accessing a child, or the children, of a parent row together with the parent row is more efficient than if the tables are not interleaved. Fetching all the children of a specific parent is also more efficient, as these rows are all co-located.

So you should prefer an interleaved table if you have a many-to-one or one-to-one relationship where you mostly access both tables together, for example by joining them in your queries, or if you always select all the children of a given parent row. The alternative is to use two non-interleaved tables and define a foreign key constraint on the child table. That also works, and can be a valid solution if you don't always fetch data from both tables in the same query, or don't mostly select all children of a given parent.
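The two options described above can be sketched side by side. The Singers and Albums tables and the helper childDdl are hypothetical names chosen for illustration, and the boolean is only a rough stand-in for the access-pattern rule of thumb in this comment, not an official recommendation.

```java
// Sketch only: the same parent/child relationship declared two ways
// in Cloud Spanner (GoogleSQL) DDL, using hypothetical tables.
public class InterleaveVsForeignKeySketch {

    static final String CREATE_PARENT = String.join("\n",
        "CREATE TABLE Singers (",
        "  SingerId INT64 NOT NULL,",
        "  Name STRING(MAX)",
        ") PRIMARY KEY (SingerId)");

    // Option 1: interleaved child. The child's primary key starts with the
    // parent's key, and child rows are stored together with the parent row.
    static final String CREATE_CHILD_INTERLEAVED = String.join("\n",
        "CREATE TABLE Albums (",
        "  SingerId INT64 NOT NULL,",
        "  AlbumId INT64 NOT NULL,",
        "  Title STRING(MAX)",
        ") PRIMARY KEY (SingerId, AlbumId),",
        "  INTERLEAVE IN PARENT Singers ON DELETE CASCADE");

    // Option 2: independent child table with a foreign key constraint.
    // Rows are not co-located with the parent.
    static final String CREATE_CHILD_FOREIGN_KEY = String.join("\n",
        "CREATE TABLE Albums (",
        "  AlbumId INT64 NOT NULL,",
        "  SingerId INT64 NOT NULL,",
        "  Title STRING(MAX),",
        "  CONSTRAINT FK_Albums_Singers FOREIGN KEY (SingerId) REFERENCES Singers (SingerId)",
        ") PRIMARY KEY (AlbumId)");

    // Hypothetical helper encoding the rule of thumb: interleave when parent
    // and child are mostly read together, otherwise use a foreign key.
    static String childDdl(boolean mostlyAccessedWithParent) {
        return mostlyAccessedWithParent ? CREATE_CHILD_INTERLEAVED : CREATE_CHILD_FOREIGN_KEY;
    }

    public static void main(String[] args) {
        System.out.println(CREATE_PARENT + ";\n");
        System.out.println(childDdl(true) + ";\n");
    }
}
```

Only the interleaved variant corresponds to what the @Interleaved annotation mentioned earlier in this thread models; the foreign key variant would be mapped as two independent entities.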

akstatic commented 1 year ago

@meltsufin @olavloite Thanks for your inputs.