groue / GRDB.swift

A toolkit for SQLite databases, with a focus on application development
MIT License
6.78k stars 702 forks source link

Can't run recursive CTE to get back an Int64 #1555

Closed glytch closed 3 months ago

glytch commented 3 months ago

What did you do?

I wanted to find random records in my database. And SELECT * FROM table ORDER BY RANDOM() is slow. So instead I made a CTE that gets the max value for primary keys in the table, then generate a random number between 0 and that max, repeat until it finds a number that is a valid ID for a record in table.

What did you expect to happen?

I expected to get back rows with a single column containing Ints. And to be able to take one of them to use. When I run this directly in SQLite command line it works as expected and returns a valid record id as its last row.

What happened instead?

Fatal error: fetching random state id with CTE: SQLite error 1: no such column: states - while executing `WITH RECURSIVE "state_ids"("random_state_id") AS (    SELECT
        ABS(RANDOM() % seq) AS random_state_id
    FROM sqlite_sequence
    WHERE
      sqlite_sequence.name = "states"

    UNION ALL

    SELECT
        ABS(RANDOM() % seq) AS random_state_id
    FROM state_ids, sqlite_sequence
    LEFT OUTER JOIN states ON (states.id = random_state_id)
    WHERE
      sqlite_sequence.name = "states"
    AND states.id IS NULL
    LIMIT 5000) SELECT * FROM "state_ids"`

Environment

GRDB flavor(s): Regular GRDB version: Master Installation method: Added project to Xcode Xcode version: Version 15.4 (15F31d)

Swift version:

swift-driver version: 1.90.11.1 Apple Swift version 5.10 (swiftlang-5.10.0.13 clang-1500.3.9.4) Target: x86_64-apple-macosx14.0

Platform(s) running GRDB: macOS

macOS version running Xcode: Sonoma 14.5

Demo Project

GRDB Recursive CTE Test.zip

import Foundation
import GRDB
import os.log

let sqlLogger = OSLog(subsystem: "bugs", category: "sql")

let dbPool:DatabasePool

do {
  let fileManager = FileManager.default

  let rootDirectoryURL = try fileManager.url(
    for: .downloadsDirectory,
    in: .userDomainMask,
    appropriateFor: nil,
    create: true
  )

  try fileManager.createDirectory(at: rootDirectoryURL, withIntermediateDirectories: true)
  let databaseURL = rootDirectoryURL.appendingPathComponent("recursive_random_cte.sqlite3")

  NSLog("Database stored at \(databaseURL.path)")

  var configuration = Configuration()

  configuration.prepareDatabase { db in
    db.trace(options: [.profile]) {
      os_log("%{public}@", log: sqlLogger, type: .debug, String(describing: $0))
    }
  }
  configuration.publicStatementArguments = true

  dbPool = try DatabasePool(
    path: databaseURL.path,
    configuration: configuration
  )

  var migrator = DatabaseMigrator()

  migrator.registerMigration("001_create_states") { db in
    try db.create(table: "states") { t in
      t.autoIncrementedPrimaryKey("id").notNull()
      t.column("name", .text).notNull()
    }
  }

  try migrator.migrate(dbPool)
} catch {

  fatalError("Unresolved error \(error)")
}

let names:[String] = [
  "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut",
    "Delaware", "Florida", "Georgia", "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", "Kansas",
    "Kentucky", "Louisiana", "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota",
    "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey",
    "New Mexico", "New York", "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
    "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota", "Tennessee", "Texas", "Utah",
    "Vermont", "Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming"
]

do {
  for name in names {
    try dbPool.write { db in
      try db.execute(sql: "INSERT INTO states (name) values (?)", arguments: [name])
    }
  }
}
catch {
  fatalError("Inserting states: \(error)")
}

do {
  try dbPool.read { db in
    if let row:Row = try Row.fetchOne(db, sql: "SELECT * FROM states ORDER BY RANDOM() LIMIT 1") {

      print("Fetched random row: \(row)")
    }
  }
} catch {
  fatalError("fetching random state with RANDOM(): \(error)")
}

do {
  let cte = CommonTableExpression(
    recursive: true,
    named: "state_ids",
    columns: ["random_state_id"],
    sql: """
            SELECT
                ABS(RANDOM() % seq) AS random_state_id
            FROM sqlite_sequence
            WHERE
              sqlite_sequence.name = "states"

            UNION ALL

            SELECT
                ABS(RANDOM() % seq) AS random_state_id
            FROM state_ids, sqlite_sequence
            LEFT OUTER JOIN states ON (states.id = random_state_id)
            WHERE
              sqlite_sequence.name = "states"
            AND states.id IS NULL
            LIMIT 5000
        """)

  try dbPool.read { db in
    let rows:[Row] = try cte.all().with(cte).fetchAll(db)

    print("rows: \(rows)")
  }
}
catch {
  fatalError("fetching random state id with CTE: \(error)")
}
glytch commented 3 months ago

I was originally trying to use the output of this and join it to one of my GRDB models. But it fails for the same reason at the CTE level. Running it directly in sqlite either alone to return a state ID, or joined with the states to return the random state works.

    let association = State(
        to: cte,
        on: { left, right in
          left[Column("id")] == right[Column("random_state_id")]
        })

try State.select(literal: "states.*").with(cte).including(required: association).limit(1).fetchOne(db)
glytch commented 3 months ago

Incase it helps, this fails with the same error. Works directly in sqlite console though

do {
  let sql =  """
        WITH state_ids AS (
            SELECT
                ABS(RANDOM() % seq) AS random_state_id
            FROM sqlite_sequence
            WHERE
              sqlite_sequence.name = "states"

            UNION ALL

            SELECT
                ABS(RANDOM() % seq) AS random_state_id
            FROM state_ids, sqlite_sequence
            LEFT OUTER JOIN states ON (states.id = random_state_id)
            WHERE
              sqlite_sequence.name = "states"
            AND states.id IS NULL
            LIMIT 5000
        )
        SELECT states.id
        FROM state_ids
        JOIN states ON (states.id = state_ids.random_state_id)
        LIMIT 1
        """

  try dbPool.read { db in
    let stated:Int64 = try Int64.fetchOne(db, sql: sql)

    print("rows: \(stateId)")
  }
}
catch {
  fatalError("fetching random state id with sql cte: \(error)")
}
groue commented 3 months ago

Hello @glytch.

You're running into an SQLite error because your app tries to run an invalid SQL query: "no such column: states".

This is not an issue about GRDB, and this repository is not the correct place to ask support about SQL. Maybe ask for help on the SQLite forums or StackOverflow?

glytch commented 3 months ago

Hello @glytch.

You're running into an SQLite error because your app tries to run an invalid SQL query: "no such column: states".

This is not an issue about GRDB, and this repository is not the correct place to ask support about SQL. Maybe ask for help on the SQLite forums or StackOverflow?

It is a valid query.

It works in the sqlite3 command line client, and the SQLiteFlow Mac OS desktop app. Both sqlite3 clients. Using the same database that the example project above generated. And running the same query that my swift code in the example project uses. I'm not sure why it wouldn't work in GRDB. It returns rows.

$ sqlite3 ~/Downloads/recursive_random_cte.sqlite3
SQLite version 3.43.2 2023-10-10 13:08:14
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE grdb_migrations (identifier TEXT NOT NULL PRIMARY KEY);
CREATE TABLE IF NOT EXISTS "states" ("id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "name" TEXT NOT NULL);
CREATE TABLE sqlite_sequence(name,seq);
sqlite> select * from states limit 2;
1|Alabama
2|Alaska
sqlite> WITH state_ids AS (
(x1...>             SELECT
(x1...>                 ABS(RANDOM() % seq) AS random_state_id
(x1...>             FROM sqlite_sequence
(x1...>             WHERE
(x1...>               sqlite_sequence.name = "states"
(x1...>             
(x1...>             UNION ALL
(x1...>             
(x1...>             SELECT
(x1...>                 ABS(RANDOM() % seq) AS random_state_id
(x1...>             FROM state_ids, sqlite_sequence
(x1...>             LEFT OUTER JOIN states ON (states.id = random_state_id)
(x1...>             WHERE
(x1...>               sqlite_sequence.name = "states"
(x1...>             AND states.id IS NULL
(x1...>             LIMIT 5000
(x1...>         )
   ...>         SELECT states.*
   ...>         FROM state_ids
   ...>         JOIN states ON (states.id = state_ids.random_state_id)
   ...>         LIMIT 1;
113|Illinois
groue commented 3 months ago

Oh, now I see it. The problem comes from the "states", wrapped in double quotes, in the SQL. This is not standard SQL, and GRDB has instructed SQLite to reject it.

Replace them with 'states' (wrapped in single quotes). The SQL query will conform to the SQL standard, and the error will be fixed.

There is a FAQ about this SQLite error: SQLite error 1 "no such column".

glytch commented 3 months ago

That was it. Thanks @groue :)

groue commented 3 months ago

You're welcome! Happy GRDB!