vapor / fluent-kit

Swift ORM (queries, models, and relations) for NoSQL and SQL databases
MIT License
217 stars 116 forks source link

Insert queries error when models have relations defined #564

Closed rausnitz closed 1 year ago

rausnitz commented 1 year ago

I saw this in my test logs, after updating to fluent-kit version 1.42.2:

server: column "account_id" specified more than once (checkInsertTargets)

I believe the insert query is specifying account_id multiple times because the model has both a @Field and a @Parent defined:

@Field(key: "account_id")
public var accountID: Int

@Parent(key: "account_id")
public var account: Account

To Reproduce

Define a model with a @Field property and a @Parent property that refer to the same database column. Then initialize an instance and run the create(on:) method.

(I haven't tried this with relations other than parent.)

Expected behavior

The column shouldn't be repeated in the insert statement.

Environment

I tested this on macOS and Linux. My database is Postgres.

rausnitz commented 1 year ago

A similar error occurs for update queries:

server: multiple assignments to same column "account_id"
rausnitz commented 1 year ago

Looking at the docs, I'm wondering if I implemented relations in a way that they're not designed to be used, i.e. in representing a column as both a @Field and a @Parent. In any case, that approach worked before and doesn't work on 1.42.2.

0xTim commented 1 year ago

@rausnitz so you have two properties in your Swift model that refer to the same column?

rausnitz commented 1 year ago

Yes. I'm not sure why we used that approach for the relations in the first place. Someone must have thought it was correct to do so, and it worked.

gwynne commented 1 year ago

This has never been supported by Fluent; it only worked before due to a rather odd quirk of how SQLQueryConverter was originally implemented. In #557, when I revamped the relevant logic to generate queries in a more deterministic fashion, one of the side effects was changing how the list of columns being inserted or updated is computed. (See below, if you're so inclined, for some extremely pedantic details of what changed and why.)

I'm leaving this issue open for now instead of closing it as wontfix because the fact that Fluent generates an invalid query instead of throwing an error for this case is a bug. (For that matter, I'd be within my rights to throw an assertion failure or even a fatalError(), given that it's programmer error and wouldn't be recoverable at runtime.)


Extremely pedantic details follow. It's okay to skip this part 🙂

Previously, we would take the list of columns given by the set of data values being inserted/updated, which comes from the AnyDatabaseProperty.input(to: any DatabaseInput) methods of the various property types, and the relevant DatabaseInput type is implemented with a Dictionary - the result being that if a column was specified multiple times by a model, the last value specified would be the one used.

The updated logic, however, gets the list of columns by taking the list of fields specified by a query - for insert and update queries this list almost always ultimately comes from iterating the keys of a model's properties, which will yield the set of all properly-specified fields (i.e. any properties using @Field or any of the other Fluent property wrappers). Fields that occur in the field list but not in the set of values are filtered out.

Thus, for a query which contained (for example) fields = ["id", "account_id", "account_id", "created_at"], values = [["id": 1, "account_id": 2, "created_at": Date(), "updated_at": Date()]], the original code would look only at the keys of the first element of the values array and your set of columns would be some arbitrarily-ordered variation of ["account_id", "created_at", "id", "updated_at"], but the new code looks at the fields array and filters it by the value keys, which yields ["id", "account_id", "account_id", "created_at"].

So the change had two effects - multiply-specified columns now break, and manually-specified field lists now constrain the set of inserted/updated values. Both of these are fully in keeping with Fluent's intended behavior and thus could be considered bug fixes, despite that the intent of the change was determinism.

gwynne commented 1 year ago

It's worth noting that thanks to design ~flaws~ *ahem* choices in Fluent (this is an evil eye and it's looking right at you, @tanner0101...), actually checking for duplicate fields while serializing a query is surprisingly difficult, especially if you want to be able to tell the poor soul seeing the error which model made the mistake, or even which query; it's not the quick fix it really should be. I'll keep it in mind as something to deal with.

Frizlab commented 1 year ago

So if I understand correctly, having two properties referring to the same column is officially unsupported by Fluent.

This makes sense, but how are we supposed to deal with the use case of an entity whose ID is also an @Parent? Here’s an example:

final class DbUser : Model {
   @ID var id: UUID?
}
final class DbClientProfile : Model {
   @ID(custom: FieldKeys.userID, generatedBy: .user) var id: UUID?
   @Parent(key: FieldKeys.userID) var user: DbUser
}

For now I’ve pinned fluent-kit to 1.42.1, but of course this is not a viable solution long-term.


Possible solution, is it legal?

While writing this response, I thought about @CompositeID. Is this a valid solution for my use-case?

final class DbUser : Model {
   @ID var id: UUID?
}
final class DbClientProfile : Model {
   final class IDValue : Fields, Hashable {
      @Parent(key: FieldKeys.userID) var user: DbUser
   }
   @CompositeID() var id: IDValue?
   var user: DbUser? {id?.user}
}

Note: I have omitted the boilerplate in DbClientProfile.IDValue for brevity.

Thanks


I tested it

It works. The boilerplate is non-negligible, but it works.

It still wonder whether it’s the best solution, but at least I have a workaround and can drop my version pinning.

0xTim commented 1 year ago

@gwynne can correct me if I'm wrong but I believe Fluent considers multiple columns with the same keys a programmer error. You can use computed properties if needed for most use cases but the ID as the duplicated column will cause problems.

The main issue is when writing to the DB. Let's say you have

final class DbClientProfile : Model {
   @ID(custom: FieldKeys.userID, generatedBy: .user) var id: UUID?
   @Parent(key: FieldKeys.userID) var user: DbUser
}

You then do

let profile = DbClientProfile(id: uuid1, userID: uuid2)
try await profile.save(on: req.db)

What should be the expected value in the DB for the userID column? That's the main thing Fluent is protecting against

gwynne commented 1 year ago

@0xTim You are correct. See https://github.com/vapor/fluent-kit/issues/564#issuecomment-1535559546 (earlier in this issue) for the gory details.

@Frizlab Using a single-property @CompositeID for this case is currently the correct way to do this; I must admit it's one of the reasons I went to a bunch of extra effort to make sure the feature works correctly in that configuration. I use the same solution in production myself. I also use this trick to simplify accessing the parent property - but be warned that it does use a technically unsupported compiler feature that isn't guaranteed to keep working indefinitely (although it can be replaced by macros once Swift 5.9 is released). Macros will also be able to cut down on the composite ID boilerplate (not to mention they'll allow Fluent 5 to dispense with all this messy property wrapper nonsense).