Truebase-com / TruthStack

Monorepo for the Truth technology stack.

TruthTalk Proposal #12

Open paul-go opened 5 years ago

paul-go commented 5 years ago

TruthTalk is a protocol for exchanging data between application-level logic and a persistence layer, which need not be on the same machine. TruthTalk operates on a query/response basis. Queries are constructed with the TruthTalk API, which is the focus of this document. Responses are simply a stream of data that conforms to the Truth Data specification.

TruthTalk is not like SQL. Instead, it is designed around a more functional style. Conceptually, TruthTalk queries are similar to the multiple-cursors feature found in many text editors, such as Visual Studio Code, Sublime Text, and Atom. A query starts with a "cursor" positioned at every top-level type in a block of Truth data. Operations are then chained together sequentially, moving these cursors through the data set in order to capture a particular result set.
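The cursor model can be sketched in a few lines of TypeScript. This is an illustrative toy, not the TruthTalk implementation: `TruthNode`, `Operation`, `run`, `ofBase`, and `contents` are hypothetical names, and the data is built by hand rather than parsed from Truth.

```typescript
// Hypothetical in-memory stand-in for a block of Truth data.
interface TruthNode {
    name: string;
    bases: string[];      // names of the types this node is annotated with
    alias?: string;       // literal value such as "Bob" or 1000, kept as text
    contents: TruthNode[];
}

// An operation maps one cursor set to the next cursor set.
type Operation = (cursors: TruthNode[]) => TruthNode[];

// A query starts with one cursor on every top-level node,
// then applies each operation in order.
function run(data: TruthNode[], ...ops: Operation[]): TruthNode[] {
    return ops.reduce((cursors, op) => op(cursors), data.slice());
}

const leaf = (name: string, base: string, alias: string): TruthNode =>
    ({ name, bases: [base], alias, contents: [] });

const data: TruthNode[] = [
    { name: "E1", bases: ["Employee"], contents: [leaf("Name", "String", "Bob"), leaf("Salary", "Number", "1000")] },
    { name: "E2", bases: ["Employee"], contents: [leaf("Name", "String", "Alice"), leaf("Salary", "Number", "1200")] },
];

// Two example operations: keep cursors with a given base, then move to their contents.
const ofBase = (base: string): Operation => cs => cs.filter(c => c.bases.indexOf(base) !== -1);
const contents: Operation = cs => cs.reduce<TruthNode[]>((all, c) => all.concat(c.contents), []);

const names = run(data, ofBase("Employee"), contents).map(c => c.name);
console.log(names.join(", ")); // Name, Salary, Name, Salary
```

Treating every operation as a function from cursor set to cursor set is what makes chaining trivial here; whether the real API is built this way is not stated in the proposal.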

TruthTalk Queries

TruthTalk query construction begins with the tt(...) function. (For now, the tt() function is globally accessible.)

const query = tt(
    // Query primitives go here
);

Examples

The examples below assume the following Truth code:

String
/".+" : String

Number
/\d+ : Number

Boolean
true, false : Boolean

Employee
    Name : String
    Salary : Number

They also assume the following Truth data:

E1 : Employee
    Name : "Bob"
    Salary : 1000

E2 : Employee
    Name : "Alice"
    Salary : 1200

E3 : Employee
    Name : "Joe"
    Salary : 1400

The simplest TruthTalk query is the one that selects all top-level objects.

const query = tt();
// Returns E1, E2, E3

The next query finds all Employee objects, starting at the root. The result set is therefore the same as above. Queries can take constructor function PLAs emitted by Backer as parameters. When these are passed into the query, the result set is limited to objects that are instances of the passed type.

const query = tt(
    Employee
);

This query finds any object, starting at the root, that has at least one content type that is a String. It returns all 3 Employees from the data set, because each has at least one content type whose base type is String.

const query = tt(
    tt.has(String)
);

This query finds objects that have a specific value:

const query = tt(
    tt.has(
        tt.equals("Bob")
    )
);

The .has() function accepts multiple parameters, all of which must be satisfied by a single content type. The query below therefore returns a single object, the employee "Joe" (who has the only salary > 1300).

const query = tt(
    tt.has(
        Employee.Salary,
        tt.greaterThan(1300)
    )
);

If a query needs to check the value of two separate content types, this is done by using two separate .has() operations:

const query = tt(
    tt.has(
        Employee.Salary,
        tt.greaterThan(1000)
    ),
    tt.has(
        Employee.Name,
        tt.equals("Joe")
    )
);
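The difference between one .has() with several parameters and several separate .has() operations can be illustrated with a small sketch. The names `Item`, `Obj`, `has`, `named`, and `equals` below are hypothetical stand-ins written for this example, and aliases are stored as plain text for simplicity.

```typescript
interface Item { name: string; alias: string; }
interface Obj { name: string; contents: Item[]; }
type Test = (item: Item) => boolean;

// has(...tests) passes when ONE content item satisfies ALL of the tests.
const has = (...tests: Test[]) => (obj: Obj): boolean =>
    obj.contents.some(item => tests.every(t => t(item)));

const named = (n: string): Test => item => item.name === n;
const equals = (v: string): Test => item => item.alias === v;

const employees: Obj[] = [
    { name: "E1", contents: [{ name: "Name", alias: "Bob" }, { name: "Salary", alias: "1000" }] },
    { name: "E2", contents: [{ name: "Name", alias: "Joe" }, { name: "Salary", alias: "1400" }] },
];

// One has(): a single content item must be named Salary AND equal "Joe".
const single = employees.filter(has(named("Salary"), equals("Joe")));

// Two has(): some item is named Salary, and some item (possibly another) equals "Joe".
const separate = employees.filter(has(named("Salary"))).filter(has(equals("Joe")));

console.log(single.length);                          // 0
console.log(separate.map(e => e.name).join(", "));   // E2
```

No Salary item equals "Joe", so the single-has() form matches nothing, while the two-has() form matches the object whose Name is "Joe".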

By default, when multiple operations are chained together, it's assumed that they all relate to each other via logical AND. For example, the query above means "Objects that have a Salary greater than 1000, and a Name equaling Joe". However, instead of AND, we could get the OR behavior by wrapping the operations in a tt.or(...) block. For example:

const query = tt(
    tt.or(
        tt.has(
            Employee.Salary,
            tt.greaterThan(1000)
        ),
        tt.has(
            Employee.Name,
            tt.equals("Bob")
        )
    )
);

Similar to logical OR, logical NOT (or negation) is also possible. For example, the following query would return Bob and Alice, but not Joe:

const query = tt(
    tt.has(
        Employee.Salary,
        tt.greaterThan(900)
    ),
    tt.not(
        tt.has(
            Employee.Name,
            tt.equals("Joe")
        )
    )
);
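The AND, OR, and NOT behavior of the three queries above can be mimicked with ordinary predicate combinators. This is only a sketch: `and`, `or`, `not`, `salaryOver`, and `nameIs` are hypothetical helpers, and salaries are compared numerically here as a simplifying assumption.

```typescript
interface Employee { name: string; salary: number; }
type Filter = (e: Employee) => boolean;

// Chained operations default to AND; or() and not() change the grouping.
const and = (...fs: Filter[]): Filter => e => fs.every(f => f(e));
const or = (...fs: Filter[]): Filter => e => fs.some(f => f(e));
const not = (f: Filter): Filter => e => !f(e);

const salaryOver = (n: number): Filter => e => e.salary > n;
const nameIs = (s: string): Filter => e => e.name === s;

const staff: Employee[] = [
    { name: "Bob", salary: 1000 },
    { name: "Alice", salary: 1200 },
    { name: "Joe", salary: 1400 },
];
const pick = (f: Filter): string => staff.filter(f).map(e => e.name).join(", ");

console.log(pick(and(salaryOver(1000), nameIs("Joe"))));     // Joe
console.log(pick(or(salaryOver(1000), nameIs("Bob"))));      // Bob, Alice, Joe
console.log(pick(and(salaryOver(900), not(nameIs("Joe"))))); // Bob, Alice
```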

Full Specification

tt(...) - Starts a new query.

tt.at(0, 1, 2, ...N) - Used at the beginning of a query to identify the nesting levels at which to position the cursors. If not specified, cursors are placed on all top-level objects. Can also be used mid-query to keep only the cursors that point to types at the specified levels of nesting.

tt.is(Type) - Reduces the set of cursors to only include those that point to types that exactly match the type specified (without considering covariance or contravariance).

tt.has(...) - Reduces the set of cursors to only include those that point to types that match the specified set of operations. A has() operation matches one single content type.
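As a rough illustration of positioning cursors by nesting level, the sketch below collects nodes at the requested depths. `TreeNode` and this standalone `at` function are hypothetical, and the sketch assumes that level 0 means top-level and level 1 means the contents of a top-level object; the proposal does not spell out the numbering.

```typescript
interface TreeNode { name: string; contents: TreeNode[]; }

// Collects every node whose nesting depth is one of `levels`, in document order.
// Assumption: level 0 is top-level, level 1 is its contents, and so on.
function at(roots: TreeNode[], ...levels: number[]): TreeNode[] {
    const found: TreeNode[] = [];
    const walk = (nodes: TreeNode[], depth: number): void => {
        for (const node of nodes) {
            if (levels.indexOf(depth) !== -1) found.push(node);
            walk(node.contents, depth + 1);
        }
    };
    walk(roots, 0);
    return found;
}

const roots: TreeNode[] = [
    { name: "E1", contents: [{ name: "Name", contents: [] }, { name: "Salary", contents: [] }] },
];

console.log(at(roots, 0).map(n => n.name).join(", "));    // E1
console.log(at(roots, 1).map(n => n.name).join(", "));    // Name, Salary
console.log(at(roots, 0, 1).map(n => n.name).join(", ")); // E1, Name, Salary
```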

Groupings

tt.not(...) - Negation.

tt.or(...) - Creates a logical OR operation between all passed operations.

Predicates

tt.equals(value: string | number | boolean) - Filter cursors by those whose aliases are equal to the specified value. Inequality is achieved by wrapping this function in a tt.not() call.

tt.greaterThan(value: string | number) - Filter cursors by those whose aliases are greater than the specified value, using a lexicographical comparison. "Less than or equal" is achieved by wrapping this function in a tt.not() call.

tt.lessThan(value: string | number) - Filter cursors by those whose aliases are less than the specified value, using a lexicographical comparison. "Greater than or equal" is achieved by wrapping this function in a tt.not() call.

tt.startsWith(value: string | number) - Filter cursors by those whose aliases start with the specified string or numeric text. Works on properties of all types (not just strings).

tt.endsWith(value: string | number) - Filter cursors by those whose aliases end with the specified string or numeric text. Works on properties of all types (not just strings).

tt.matches(value: string) - Perform a fuzzy match on the cursors whose aliases roughly match the specified value. This operation uses the full-text engine.
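A few of these predicates can be sketched as functions over alias text. Everything below is a hypothetical stand-in: it assumes aliases are available as strings and that "lexicographical comparison" means plain string ordering.

```typescript
type Predicate = (alias: string) => boolean;

const not = (p: Predicate): Predicate => a => !p(a);

// Assumption: the value is converted to text and compared as a string.
const equals = (v: string | number | boolean): Predicate => a => a === String(v);
const greaterThan = (v: string | number): Predicate => a => a > String(v);
const startsWith = (v: string | number): Predicate => a => a.lastIndexOf(String(v), 0) === 0;

// "Less than or equal" as described above: the negation of greaterThan.
const lessThanOrEqual = (v: string | number): Predicate => not(greaterThan(v));

console.log(greaterThan(1300)("1400"));     // true
console.log(lessThanOrEqual(1300)("1300")); // true
console.log(startsWith(12)("1200"));        // true
console.log(equals(true)("true"));          // true

// Under plain string ordering, "900" sorts after "1300".
console.log(greaterThan(1300)("900"));      // true
```

The last line shows a consequence of string ordering for numeric text of different lengths; the proposal does not say whether numeric aliases are meant to be handled differently.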

Relocation

tt.container() - Move all cursors to their immediate container.

tt.root() - Move all cursors to their root container.

tt.contents() - Move all cursors to their immediate contents (which likely spawns many new cursors).
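The three relocation operations can be pictured as moves on a tree with container links. The sketch below is a toy under stated assumptions: `Cursor`, `makeNode`, `toContainer`, `toRoot`, and `toContents` are hypothetical names, the Address and Street nodes are invented sample data, cursors that land on the same node are merged, and a cursor with no container is dropped by `toContainer`. None of these choices is specified in the proposal.

```typescript
interface Cursor { name: string; container?: Cursor; contents: Cursor[]; }

// Builds a node and links each content item back to it.
function makeNode(name: string, ...contents: Cursor[]): Cursor {
    const n: Cursor = { name, contents };
    for (const c of contents) c.container = n;
    return n;
}

// Assumption: cursors that land on the same node collapse into one.
const unique = (cs: Cursor[]): Cursor[] => cs.filter((c, i) => cs.indexOf(c) === i);

// Assumption: a top-level cursor has no container and is dropped.
const toContainer = (cs: Cursor[]): Cursor[] =>
    unique(cs.filter(c => c.container !== undefined).map(c => c.container as Cursor));

const toRoot = (cs: Cursor[]): Cursor[] =>
    unique(cs.map(c => { let r = c; while (r.container) r = r.container; return r; }));

const toContents = (cs: Cursor[]): Cursor[] =>
    cs.reduce<Cursor[]>((all, c) => all.concat(c.contents), []);

const street = makeNode("Street");
const address = makeNode("Address", street);
const e1 = makeNode("E1", makeNode("Name"), address);

console.log(toContents([e1]).map(c => c.name).join(", "));             // Name, Address
console.log(toContainer([street]).map(c => c.name).join(", "));        // Address
console.log(toRoot([street]).map(c => c.name).join(", "));             // E1
console.log(toContainer(toContents([e1])).map(c => c.name).join(", ")); // E1
```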

Eliminations

tt.slice(start: number, end?: number) - JavaScript-style slicing of the cursor set. Decimal values between 0 and 1 can be used to create a percentage-based slice. For example, tt.slice(0, 0.25) would select the first 25% of the result set.
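One possible reading of the percentage-based slice is sketched below. `resolveBound` and `sliceCursors` are hypothetical names, and two details are assumptions rather than part of the proposal: only values strictly between 0 and 1 are treated as fractions, and fractional positions are rounded to the nearest index.

```typescript
// Assumption: a bound strictly between 0 and 1 is a fraction of the set's length;
// any other value is a plain index, as in Array.prototype.slice.
function resolveBound(bound: number, length: number): number {
    return bound > 0 && bound < 1 ? Math.round(bound * length) : bound;
}

function sliceCursors<T>(cursors: T[], start: number, end?: number): T[] {
    const from = resolveBound(start, cursors.length);
    const to = end === undefined ? undefined : resolveBound(end, cursors.length);
    return cursors.slice(from, to);
}

const items = [1, 2, 3, 4, 5, 6, 7, 8];

console.log(sliceCursors(items, 0, 0.25).join(", ")); // 1, 2
console.log(sliceCursors(items, 0.5).join(", "));     // 5, 6, 7, 8
console.log(sliceCursors(items, 1, 3).join(", "));    // 2, 3
```

Under this reading the values 0 and 1 themselves stay ordinary indexes, so tt.slice(0, 1) would mean "the first item" rather than "everything".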

tt.occurences(min: number, max?: number = min) - Filters the cursors to only include the ones that point to types where there are a fixed number of occurrences of the type's values. To achieve an SQL-style .distinct() function, the function call would look like:

tt.aliased() - Includes only the cursors that point to a type that has a value that is a type alias.

tt.leaves() - Includes only the cursors that have no content types.

tt.fresh() - Includes only fresh types.

Ordering

tt.sort(ContentType1, ContentType2, ...ContentTypeN) - Sorts the result set lexicographically by the value of the content type specified in the first parameter. The following parameters are used in the case when two compared values are found to be equal.

tt.reverse() - Reverses the order of the sort. (This is instead of adding an ascending and descending option to the .sort() function.)
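The tie-breaking behavior of a multi-key sort, followed by a reverse, can be sketched as follows. `Row` and `sortBy` are hypothetical, values are compared as text, and the sample salaries are changed from the earlier examples so that two rows tie on Salary.

```typescript
interface Row { Name: string; Salary: string; }

// Compares by the first key; later keys are consulted only to break ties.
function sortBy(rows: Row[], ...keys: (keyof Row)[]): Row[] {
    return rows.slice().sort((a, b) => {
        for (const key of keys) {
            if (a[key] < b[key]) return -1;
            if (a[key] > b[key]) return 1;
        }
        return 0;
    });
}

const rows: Row[] = [
    { Name: "Bob", Salary: "1200" },
    { Name: "Alice", Salary: "1200" },
    { Name: "Joe", Salary: "1000" },
];

const sorted = sortBy(rows, "Salary", "Name");
console.log(sorted.map(r => r.Name).join(", "));                   // Joe, Alice, Bob

// Descending order is obtained by reversing the sorted result.
console.log(sorted.slice().reverse().map(r => r.Name).join(", ")); // Bob, Alice, Joe
```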

Future

crosjef commented 5 years ago

The example code here has a really clean split between Truth code and Truth data. If we add an intermediate class that kind of straddles the line, does that have any impact? For example, let's add a Developer class and a Title attribute:

String
/".+" : String

Number
/\d+ : Number

Boolean
true, false : Boolean

Employee
    Name : String
    Salary : Number
    Title : String

Developer : Employee
    Title : "Developer"

E1 : Developer
    Name : "Bob"
    Salary : 1000

E2 : Developer
    Name : "Alice"
    Salary : 1200

E3 : Developer
    Name : "Joe"
    Salary : 1400

Now if I do:

const query = tt(
    tt.has(
        tt.equals("Developer")
    )
)

What would be returned? E1, E2, E3? Developer? All 4?