dyedgreen / deno-sqlite

Deno SQLite module
https://deno.land/x/sqlite
MIT License
409 stars 36 forks source link

Add SQL`` tagged template strings with bound values #104

Closed jeremyBanks closed 3 years ago

jeremyBanks commented 3 years ago

This PR adds support for SQL tagged template strings with interpolated bound values, as a secondary alternative to the standard db.query(sql, [...values]) method of binding values. This is similar to the functionality of the sql-template-strings NPM module, which is supported supported by the mysql, pg, and sequelize NPM modules.

import * as sqlite from 'https://deno.land/x/sqlite/mod.ts';
export { SQL } from "https://deno.land/x/sqlite/mod.ts";

db.query(SQL`CREATE TABLE Records (id INTEGER PRIMARY KEY, name TEXT)`)
== db.query("CREATE TABLE Records (id INTEGER PRIMARY KEY, name TEXT)");

const name = "Dino";
db.query(SQL`INSERT INTO Records (name) VALUES (${name)`)
== db.query("INSERT INTO Records (name) VALUES $", [name]);

It is possible to compose such expressions. Internally they'll all be flattened down to the single sql string and flat array of bound values, so the current query() logic is otherwise unchanged.

const value = SQL`${name}`;
const columnsClause = SQL`(name)`;
const valuesClause = SQL`VALUES (${value})`;
const sql = SQL`INSERT INTO Records ${columnsClause} ${valuesClause}`;
db.query(sql)
== db.query("INSERT INTO Records (name) VALUES $", [name]);

Above is what should be relevant for most users of deno-sqlite.

In order to support other library or framework developers building on top of the deno-sqlite, this PR also provides the ability for classes to declare how they can be interpreted when interpolated in a SQL string:

class ApiResponse {
  constructor(readonly data: any) {}

  [sqlite.toSQL](): string {
    return JSON.stringify(this.data);
  }
}

const response = new ApiResponse(await (await fetch(...)).json());
db.query(SQL`INSERT INTO Responses (data) VALUES (${response})`)
== db.query("INSERT INTO Response (data) VALUES (?)", JSON.stringify(response.data));

And also provides a utility function for safely encoding strings for use as SQLite identifiers (based on this old Q&A pair I wrote on Stack Overflow, and having taken a deeper look at SQLite's parser recently):

const myTableName = sqlite.encodeIdentifier("a very silly table name!");
db.query(SQL`CREATE TABLE ${myTableName} (id INTEGER PRIMARY KEY)`)
== db.query("CREATE TABLE [a very silly table name!] (id INTEGER PRIMARY KEY)");

Table names are the place I most often see people unsafely concatenating SQL these days (😱), so I think it could valuable to the ecosystem to place this safer alternative at-hand.

dyedgreen commented 3 years ago

Hey @jeremyBanks, thank you for the PR. (I've been quite busy recently, apologies for taking so long to get back to you.)

I'm generally into this idea, but I feel like this could be even better served as a separate package. I think what might work really well (for the deno database ecosystem in general), would be to have a single package which provides an sql tag.

Specifically, I was thinking of something like this:

// then modify Db.query such that class DB { query(sql: string | QueryInterface, values?: object | QueryParam[]): Rows { /.../ } }


- A separate SQL package provides an SQL tag, which allows you to interpolate values and produces valid SQL. This tag can now work with multiple DB backends, by conforming to a similar interface they might define. This also allows the project to be bigger in scope. Specifically, I think it would be really neat for something like this to do a full parse of the query, such that it can escape bound identifiers automatically (instead of having to remember the encode identifier function).
- Further an ORM could implement this type for it's query builder

Actually, thinking further... Maybe we could get together with the other SQL database packages and define a standard way to interact with the databases. (Such that an ORM or similar can easily work with all the available backends.)
dyedgreen commented 3 years ago

(This is all just me thinking random thoughts currently, do let me know if you think something like this is sensible 😅)

jeremyBanks commented 3 years ago

I thought about the separate package approach a bit, but not as fully as what you proposed, which is a very interesting idea.

The reason I leaned in the direction of including it directly in deno-sqlite is because, even though they're very similar, different SQL dialects aren't entirely the same, and it may be misleading to conflate them into a single type. In this case, the syntax that the encodeIdentifier function uses to escape the identifier (square brackets or backticks) is chosen because SQLite unambiguously recognizes it as an identifier. However, TSQL doesn't recognize the backticks, and MySQL doesn't recognize the square brackets. It assumes UTF-8, and other engines may allow other encodings. Interpolating a value that has been escaped for SQLite into a SQL`...` string that is going to be used for MySQL might be able to create a subtle security vulnerability due to encoding differences. I would like this to be as fool-proof as practical.

"Make it hard to do a potentially wrong thing, make it easy to do the right thing." Probably with a not-obvious escape hatch for people who really need it. I like to hope that your library will be among the most frequently-used SQL packages anywhere in some number of years, and promoting a harder-to-misuse pattern like this may help a lot of people avoid security mistakes in the future.

My impression is that use dynamic identifiers, though a relatively common source of error, are still somewhat absolutely infrequent, and when they are used, it's only a handful of places in a code base. The friction/verbosity of needing to call a specific method for them may be appropriate on those grounds, but the idea of detecting it automatically would also be cool if it is practical.

I'll think about this a bit more. 🙂

jeremyBanks commented 3 years ago

I haven't fully thought this out, but one idea:

With the generic capabilities in modern TypeScript, it might be possible to implement such a package in a way that prevents using strings with the wrong database engine when they're known to be incompatible, without getting in the way when they're unknown.

SQLExpressions instance (or SQLStrings, maybe) could have an optional Dialect generic type parameter. Dialects would be types extending...

export type SQLDialect = {
  name: string,
  bindableTypes: unknown,
};

...identifying the database engine syntax and the driver's compatible types for bound values.

By default, a SQL-tagged string would produce a SQLString<UnknownDialect>, which allows any type of bound value. However, if a SQLString is interpolated that is already marked as a different dialect, such as SQLString<sqlite.SQLiteDialect>, the resulting string would be typed with the SQLite dialect. Database drivers like deno-sqlite could accept SQLStrings that are of either <UnknownDialect>, or of their expected Dialect. So if no dialects are specified, the user won't be bothered, but if one is specified (or a dialect-specific function like sqlite.encodeIdentifier() is called), they'll be prevented from using it somewhere that we know is wrong.

import SQL, {UnknownDialect} from 'https://deno.land/x/hypothetical-package-name/mod.ts';
import sqlite, {SQLiteDialect} from 'https://deno.land/x/sqlite/mod.ts';
import mysql, {MySQLDialect} from 'https://deno.land/x/hypothetical-mysql-package-name/mod.ts';

const tableUnknown = SQL`Users`;                         // is a SQLString<UnknownDialect>
const queryUnknown = SQL`SELECT * FROM ${tableUnknown}`; // is a SQLString<UnknownDialect>

const tableSqlite = SQL<SQLiteDialect>`Users`;          // is a SQLString<SQLiteDialect>
const tableSqlite2 = sqlite.encodeIdentifier('Users');  // is a SQLString<SQLiteDialect>
const querySqlite = SQL`SELECT * FROM ${tableSqlite}`;  // is a SQLString<SQLiteDialect> (inferred from interpolated value)

const tableMySQL = SQL<MySQLDialect>`Users`;            // is a SQLString<MySQLDialect>
const queryMySQL = SQL`SELECT * FROM ${tableMySQL}`;    // is a SQLString<MySQLDialect>

// Mixing known-different dialects should produce a SQLString<never> type, or an error, or something.
const queryIncoherent = SQL`SELECT * FROM ${tableMySQL} m JOIN ${tableSqlite} l ON m.Id = l.UserId`;
jeremyBanks commented 3 years ago

I am working on a more fleshed-out design proposal.

jeremyBanks commented 3 years ago

closing as this is to be subsumed into https://github.com/jeremyBanks/database/pull/3