Database capability type supporting multiple databases and tables

nhweston commented 3 years ago

This PR extends #54 by allowing multiple databases and tables to be used in a single program.

Overview

Suppose we have a database with the following schema:

CREATE TABLE foo (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);

CREATE TABLE bar (
    id INTEGER PRIMARY KEY,
    a TEXT NOT NULL,
    b TEXT NOT NULL
);

If we wanted to write a program that is able to access this database, we could specify the following top-level argument:

db : Database({
  foo : Table({
    id : String,
    name : String
  }),
  bar : Table({
    id : String,
    a : String,
    b : String
  })
})

The Database constructor takes a record describing the tables accessible in the database. The name of each field is the name of a table, and the type of the field is Table applied to a record describing the respective table schema.

In this example, the main program could use db.foo.all() to obtain all rows in foo, and db.bar.all() for bar.

The user no longer needs to specify the tables in the command-line argument. The command-line argument is only the database URI (i.e. the file path in the case of SQLite). The accessible tables are specified by the programmer in the top-level type.

Implementation

A significant challenge in the implementation is the discrepancy between what is known at compile time and what is known at runtime. The expected database schema is encoded as a type, so it is known at compile time but not at runtime, since types are erased. For validation (i.e. ensuring the expected and actual schemas are compatible) to occur at runtime, the compiler must somehow make this type information available to the interpreter.

The solution is for the compiler to encode the expected schema in the name of the capability (Compiler.scala:167, Database.scala:108). In the example above, the capability name would be:

DatabaseClient::0::foo:id,name::bar:id,a,b

The 0 is used to identify the database (as there may be multiple databases used by the same program), in this case specifying that the database corresponds to the first command-line argument. This is followed by the tables (separated by ::), each with their name (followed by :) and columns (separated by commas).
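As an illustration of this format, here is a minimal sketch of an encoder and decoder for such names. It is not the code in Compiler.scala or Database.scala: the object name `CapabilityName`, the `TableSchema` case class, and both function signatures are hypothetical, and the sketch assumes that table and column names never contain `:` or `,`.

```scala
// Hypothetical sketch; names and signatures are illustrative, not the PR's code.
object CapabilityName {

  // One table of the expected schema: its name and its column names in order.
  final case class TableSchema(name: String, columns: Vector[String])

  val prefix = "DatabaseClient"

  // Build a name such as "DatabaseClient::0::foo:id,name::bar:id,a,b".
  // `index` is the position of the database among the command-line arguments.
  def encode(index: Int, tables: Vector[TableSchema]): String = {
    val parts = tables.map(t => s"${t.name}:${t.columns.mkString(",")}")
    (Vector(prefix, index.toString) ++ parts).mkString("::")
  }

  // Inverse of encode. Returns None if the name does not have the expected shape.
  // Assumes names contain neither ':' nor ','.
  def decode(name: String): Option[(Int, Vector[TableSchema])] =
    name.split("::").toVector match {
      case `prefix` +: index +: parts =>
        // Each part must look like "table:col1,col2,...".
        val fields = parts.map(_.split(":", 2))
        for {
          i <- scala.util.Try(index.toInt).toOption
          if fields.forall(_.length == 2)
        } yield {
          val tables = fields.map { f =>
            TableSchema(f(0), f(1).split(",").toVector.filter(_.nonEmpty))
          }
          (i, tables)
        }
      case _ => None
    }
}
```

With the schema from the overview, `CapabilityName.encode(0, ...)` applied to `foo(id, name)` and `bar(id, a, b)` yields the name shown above, and `decode` recovers the same index and tables.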

At runtime, when the capability is instantiated, the interpreter decodes the schema from the capability name (Primitives.scala:262, Database.scala:113). This is then used to validate the schema (Database.scala:21), store the names of tables and columns (Database.scala:63), and construct the capability object (Primitives.scala:266).
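The validation step could look roughly like the sketch below. This is an assumption about what "compatible" means, not a description of Database.scala: here a schema is compatible if every expected table exists and has every expected column, and extra tables or columns in the actual database are ignored. The names `SchemaValidation`, `Schema`, and `validate` are hypothetical, and obtaining the actual schema from the database is left out.

```scala
// Hypothetical sketch; the compatibility rule here is an assumption.
object SchemaValidation {

  // Table name -> column names. Used for both the expected schema (decoded from
  // the capability name) and the actual schema (queried from the database).
  type Schema = Map[String, Set[String]]

  // Collect every mismatch instead of stopping at the first one, so the user
  // sees all problems in a single run.
  def validate(expected: Schema, actual: Schema): Either[List[String], Unit] = {
    val errors = expected.toList.sortBy(_._1).flatMap {
      case (table, columns) =>
        actual.get(table) match {
          case None =>
            List(s"table '$table' does not exist")
          case Some(present) =>
            // Columns the program expects but the database lacks.
            (columns -- present).toList.sorted
              .map(c => s"table '$table' has no column '$c'")
        }
    }
    if (errors.isEmpty) Right(()) else Left(errors)
  }
}
```

Returning all errors at once is a design choice in this sketch; an implementation that fails on the first mismatch would be equally consistent with the description above.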

Testing

See DatabaseClientTests.scala for the test suite. Source files and database files are located in src/test/resources/capability/db. The SQL used to create the database files is included for reference (but is not used by the tests).

inkytonik commented 3 years ago

Thx, Nicholas. I will try to have a look before tomorrow's meeting, but things are busy at the moment, so it might have to wait until the weekend.

inkytonik commented 3 years ago

Sorry for the delay. This looks good to be merged.

inkytonik / cooma