vapor / fluent

Vapor ORM (queries, models, and relations) for NoSQL and SQL databases
https://docs.vapor.codes/4.0/fluent/overview/
MIT License
1.3k stars 171 forks source link

Selecting specific .fields in query with .join/.with results in FATAL ERROR #733

Open m-y-n-o-n-a opened 2 years ago

m-y-n-o-n-a commented 2 years ago

Issue

Problem:

1) Situation: Applying a JOIN operation via .join or .with 2) Next, if you specify to retrieve only specific fields (instead of returning the whole set) you get a FATAL ERROR: "Cannot access field before it is initialized or fetched"

Expected behavior:

Even when I apply a JOIN operation I am able to define what fields should be included in the result set.

Why is this important?

Optimize data exchange between client and server.

The second reason is security related: We should not expose all fields in case some contain non public information.

Versions

Vapor: 4.51.0 Fluent: 4.4.0 Leaf: 4.1.3 Database driver: Fluent-mysql-driver 4.0.1 Operating system: MacOS Monterey MySQL: 8.0.27

Concrete example

PARENT – CHILD join SIBLING join

struct HomepageViewControllerLeaf: Encodable {

   struct IndexContext: Encodable {
      let title: String
      let articles: [Article]
   }

   ///      =============================================================
   ///      Database queries
   ///      =============================================================

   func getArticles(_ req: Request) throws ->EventLoopFuture<[Article]> {

      let result: EventLoopFuture<[Article]> =  Article
         .query(on: req.db)
         .with(\.$author) // include fields from parent in result set
         .with(\.$tags) // include fields from sibling in result set
         .sort(\.$date, .descending) // sort articles by date
         .all()

      // FIELD SELECTION TRIGGERS FATAL ERRORS IN QUERY – WHY?
      //.field(\.$title)
      //.field(Author.self, \.$first_name)

      //.join(Author.self, on: \Article.$author.$id == \Author.$id, method: .inner)
      // joins but does not include fields from parent in result set

      return result
   }

   ///      =============================================================
   ///      Views
   ///      =============================================================

   func indexHandler(req: Request) throws -> EventLoopFuture<View>  {

      // Database query
      let articles = try getArticles(req)

      // Leaf renderer view
      return articles.flatMap { articles in
         let context = IndexContext(title: "Home",
                                    articles: articles)
         return req.view.render("Home/homeLEAF", context)
      }
   }

}

This code results in the correct result:

["[title: "Headline", author: "[first_name: "FirstName", id: "8C5AFB78-3B44-11EC-8AA0-9C18A4EEBEEB", last_name: "LastName"]", status: "DRAFT", friendly_url: "bla", image_link: "", id: "1D24175C-3B45-11EC-8AA0-9C18A4EEBEEB", content: "Hier steht der Text", tags: "["[id: "B5B65D0A-3F8B-11EC-8D56-3271A1D549EB", tag: "Service Design"]", "[id: "BC8AAEB0-3F8B-11EC-8D56-3271A1D549EB", tag: "User Value"]"]", date: "1635724800.0", excerpt: "Kurzzusammenfassung dieses Artikels"]"]

But as soon as the fields are specified the result is a fatal error:

FluentKit/Field.swift:23: Fatal error: Cannot access field before it is initialized or fetched

If it would be done via a raw SQL query this is the desired outcome:

SELECT 
    a.id AS id,
    title,
    image_link,
    excerpt,
    friendly_url,
    DATE_FORMAT(date, '%d %b %Y') AS date,
    CONCAT(b.first_name, ' ', b.last_name) AS author,
    JSON_ARRAYAGG(t.tag) AS tags
FROM
               article a
    INNER JOIN author b
            ON a.author_id = b.id
    INNER JOIN article_tag_map bridge
            ON a.id = bridge.article_id
    INNER JOIN tag t 
            ON t.id = bridge.tag_id
WHERE
    status = 'DRAFT'
GROUP BY
    a.id
ORDER BY 
    a.date DESC;

   func getSQL(_ req: Request) throws ->EventLoopFuture<[Row]> {

      let sql_query: SQLQueryString = "SELECT      a.id AS id,     title,     image_link,     excerpt,     friendly_url,     DATE_FORMAT(date, '%d %b %Y') AS date,     CONCAT(b.first_name, ' ', b.last_name) AS author,         JSON_ARRAYAGG(t.tag) AS tags FROM                article a     INNER JOIN author b             ON a.author_id = b.id     INNER JOIN article_tag_map bridge             ON a.id = bridge.article_id  INNER JOIN tag t              ON t.id = bridge.tag_id WHERE  status = 'DRAFT' GROUP BY     a.id ORDER BY   a.date DESC LIMIT 0, 10000"

      let result: EventLoopFuture<[Row]> = (req.db as! SQLDatabase)
         .raw(sql_query)
         .all(decoding:Row.self)

      return result
   }

Result of the raw SQL query:

["[date: "01 Nov 2021", id: "1D24175C-3B45-11EC-8AA0-9C18A4EEBEEB", excerpt: "Kurzzusammenfassung dieses Artikels", tags: "["Service Design", "User Value"]", title: "Headline", friendly_url: "bla", author: "FirstName LastName", image_link: ""]"]

0xTim commented 2 years ago

@mynonaGithub you're mixing eager loading and joining. The crash you're getting is because you're eager loading the relation but then telling it to only select a subset of fields. That won't work because the eager loader needs to fully construct the model.

What you should do is JOIN and then get the joined model out as described in the docs. That should work when specifying the fields

m-y-n-o-n-a commented 2 years ago

can you provide a code example of a join with only a subset of the fields?

i tried joins as you can see in the provided code example but still all fields were part of the result set.

when i researched this topic i found out that this issue is quite a often discussed problem.

the way i solved it was to create additional (public) models with the reduced subset of fields for the queries but this is a really bad (redundant) approach.

0xTim commented 2 years ago
let result: EventLoopFuture<[Article]> =  Article
         .query(on: req.db)
         .join(Author.self, on: \Article.$author.$id == \Author.$id, method: .inner)
         .sort(\.$date, .descending) // sort articles by date
         .field(\.$title)
         .field(Author.self, \.$first_name)
         .all()

Note that using a join will not fill in the joined models in a Content response - you'll either need to manually add those in with article.$author.value = try article.joined(Author.self) or use a DTO as it looks like you have done