SeaQL / sea-orm

🐚 An async & dynamic ORM for Rust
https://www.sea-ql.org/SeaORM/
Apache License 2.0
6.56k stars 459 forks source link

feat: Allow `nest`ing of structs deriving `FromQueryResult` (and `DerivePartialModel`) #2179

Open jreppnow opened 3 months ago

jreppnow commented 3 months ago

PR Info

Hi, this is a separate implementation of #1716 , which seems to have somewhat stalled. Normally I would not cut in from the side like this, but this feature would allow us to cut down on code duplication massively and avoid some quite annoying bugs from re-occurring in our code bases, so we would appreciate if could be merged in the near future.

New Features

Breaking Changes

Changes

Goodjooy commented 3 months ago

The nest feature you implements seems great, should I keep the PR I opened opening?

jreppnow commented 3 months ago

@tyt2y3 Hi, thanks for the comments! I'll get on to addressing them now.

Two things:

1) I will write an integration test, which should also serve as a good example for why this is so useful. What it allows you do is basically have different structs for subqueries of a larger join query, and re-use this in various places. Say I am selecting over table A, left-joining it with table B. In order to only get the fields I need, until now I would have to add all the columns of B into the struct I define for A (rendering it non-reusable in the process), and make them all optional (due to left-join), even if the column is NOT NULL in B.

2) There is potentially a different way to implement this, without needing to actually specify the nested attribute. However, this would most likely involve doing something like impl<T: TryGet> FromQueryResult for T { ... }, which would potentially be a lot more intrusive and definitely a breaking change (as both traits are pub and not sealed..).

jreppnow commented 3 months ago

Ah, also, I would appreciate if we could also get #2167 in the same release (however, that one is a lot more minor and has an easy workaround).

jreppnow commented 3 months ago

I added a couple of test cases, but stumbled over a really annoying issue - when you nest a struct into another which both refer to columns of the same name, but from different tables, sqlx just overwrites the value of the one deserialized first with the second one..

The user can do a workaround (renaming the fields), but this is honestly really ugly and above all, very error-prone (as values actually get overwritten without warning in the "good" case). A better solution would be to do what other ORMs do and rename the columns in the query based on some random, unique string per FromQueryResult struct, but that would also involve changing the non-partial Model logic.

Possible solutions:

  1. Keep DerivePartialModel and FromQueryResult together
    1. My solution would probably be to generate a compile-time random identifier per struct that implements FromQueryResult, and ensure that that is prefixed before every AS in the response query.
    2. Involves changing a lot of other stuff.
  2. Introduce a new DerivePartialModel derive macro (maybe just PartialModel?), which would then produce an independent implementation of FromQueryResult and be incompatible with deriving FromQueryResult. Easiest solution, but involves some duplication + the old version stays in the API for a while, users would have to switch to the new version.
  3. Remove/deprecate DerivePartialModel and incorporate its functionality into FromQueryResult, which would then behave differently based on some flag.
jreppnow commented 3 months ago

Think I figured out a workaround, need a little bit more time though.. I will notify you when I have something.

jreppnow commented 3 months ago

Okay, I think I found a solution which will change slightly how some queries are generated, but should not affect the API otherwise (possible via patch release I would think). Basically, in order to combat the issue of the same id showing up twice in the same query, I use AS directives and and prefix the fields with the field they belong to in the parent struct, ONLY if they are nested (i.e., the behavior is the same if nesting is not used..).

I also added some more tests which hopefully show how useful this is. Two usecases come to mind (and are the reason we wanted this in the first place):

  1. Left-joining tables
    1. Before, you had to specify all the fields of the joined table in the base struct. Also, even though they were not optional in the joined table, you had to attach an Option, since all the columns from that table might just not be there.
    2. Now, you can have a separate struct for the joined table, where the fields Optionality is precisely the one they have in the original table.
  2. Different degrees of details required from the same table(s)
    1. We have a lot of usecases where we are querying the same tables with varying amount of detail required. With this, we can have a base struct with only the essential data in it and then write structs that embed that one (using nested) with additional required columns, without duplicating the entire struct.

For a practical example, using the test cases added:

Before, we were forced to write something along the lines of:

#[derive(FromQueryResult, DerivePartialModel)]
#[sea_orm(entity = "cake::Entity")]
struct Cake {
    id: i32,
    name: String,
    #[sea_orm(from_expr = "bakery::Column::Id")
    bakery_id: Option<i32>,
    #[sea_orm(from_expr = "bakery::Column::Name")
    bakery_title: Option<String>,
}

What's particularly annoying about this is that both bakery_id and bakery_title become separate Options, even though the non-NULLness of one necessarily implies the existence of the other. With #[sea_orm(nested)], you can write the following instead:

#[derive(FromQueryResult, DerivePartialModel)]
#[sea_orm(entity = "cake::Entity")]
struct Cake {
    id: i32,
    name: String,
    #[sea_orm(nested)]
    bakery: Option<Bakery>,
}

#[derive(FromQueryResult, DerivePartialModel)]
#[sea_orm(entity = "bakery::Entity")]
struct Bakery {
     id: i32,
     #[sea_orm(from_col = "Name")]
     title: String,
}

Notice how the existence of the row in the bakeries table is tracked as a single Option, as it should be. Especially with larger tables, this reduces code duplication and error-proneness massively.

In our code base, we generally associate the queries to obtain a struct with the struct (as in, as a function), and we could also use this to model varying degrees of details (and join partners) for larger queries.

jreppnow commented 3 months ago

One caveat: I decided to also do an AS for the members of the lowest in the hierarchy. This changes existing queries, although in what I believe is a compatible way, since the rename is just to name that is expected anyway.

Regarding naming: I went with nested here, which I think is decent, but I could also imagine embed or something along those lines to be nice.

As I mentioned, there are some breaking changes that I would like to make to some traits. I can make a separate PR for them to be included in 1.0.

tyt2y3 commented 2 months ago

Thanks for the massive update! I am going through them

jreppnow commented 1 month ago

@tyt2y3 Just an FYI, this PR (specifically a 0.12.x backport) has been in production use with us for about a month now, without any problems and without breaking any existing queries.