vaticle / typedb

TypeDB: the polymorphic database powered by types
https://typedb.com
Mozilla Public License 2.0

TypeDB 3.0: functions #7038

Open cxdorn opened 2 months ago

cxdorn commented 2 months ago

Problem to Solve

We want to address several shortcomings in TypeQL at the same time:

Current Workaround

Most of the above points are currently impossible to implement in TypeDB.

Proposed Solution

Introduce functions, which may be called from queries and from other functions. In the type-theoretic setting of TypeDB, functions are a natural counterpart to queries that can fulfill the above purposes.

How to define functions

Function keyword

Functions are introduced in define queries using the special syntax define fun, which is followed by the function declaration.

Function declaration

A function declaration comprises:

Here is a full example of a function declaration.

define fun average_salary($company) -> double? :
  match
    $e (employee: $_, company: $company) isa employment;
    $e has base_salary $s;
  return reduce mean $s;
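
To make the intended semantics concrete, here is a rough Python model of what average_salary computes. It is only a sketch: the in-memory list of records, the Employment class and its field names are assumptions made for illustration and are not part of the proposal. The return type double? is presumably an optional value, modelled here as Optional[float], with None standing for a company that has no matching employments.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable, Optional


@dataclass(frozen=True)
class Employment:
    # Hypothetical stand-in for an `employment` relation owning a base_salary.
    employee: str
    company: str
    base_salary: float


def average_salary(employments: Iterable[Employment], company: str) -> Optional[float]:
    # "match": keep the employments whose company role is played by `company`.
    salaries = [e.base_salary for e in employments if e.company == company]
    # "return reduce mean $s": a single value, or nothing if no rows matched.
    return mean(salaries) if salaries else None


# Illustrative data only.
EMPLOYMENTS = [
    Employment("alice", "Microsoft", 100.0),
    Employment("bob", "Microsoft", 200.0),
    Employment("carol", "Vaticle", 150.0),
]
```

Calling average_salary(EMPLOYMENTS, "Microsoft") averages the two Microsoft salaries, while a company with no employments yields None instead of raising an error.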

How to use functions

Calling functions in match

Functions can be called from within the pattern of a match clause (note: since functions perform data retrievals, there is currently no plan to support functions in other clauses like insert or delete). Function calls pass their arguments in brackets (...) after the function name. By default, arguments are passed by position, but the labels of the function's argument variables can also be used to assign a parameter directly using = (e.g. if the signature is my_fun($param1, $param2) -> output, we may call my_fun(param2 = $x, param1 = $y)).
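
The difference between positional and named argument passing is the same as in most programming languages. As an analogy only, the Python sketch below binds arguments to a function my_fun both ways using the standard inspect module. The function body and the string values are placeholders, and nothing here reflects how TypeDB will resolve arguments internally.

```python
import inspect


def my_fun(param1, param2):
    # Placeholder body: return the arguments exactly as they were bound.
    return (param1, param2)


signature = inspect.signature(my_fun)

# Positional call: arguments are matched to parameters by position.
by_position = signature.bind("y", "x").arguments

# Named call: the order at the call site no longer matters.
by_name = signature.bind(param2="x", param1="y").arguments
```

Both calls produce the same parameter binding, which is the behaviour described for my_fun(param2 = $x, param1 = $y).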

The precise way functions are used depends on their return type.

For example, the average_salary function above could be used as follows:

match
  $msft isa company, has name "Microsoft";
  (employee: $e, company: $msft) isa employment,
     has base_salary > average_salary(company = $msft);
filter $e;

Calling functions in fetch

Functions can similarly be called in a fetch, as long as all arguments are bound:

match
  $c isa company;
fetch
  $c as company: name;
  average_salary : average_salary($c);

The only functions that may be called here are those with a printable return type:

Note: the grammar is TBD. Potentially, the following could be allowed:

fetch
  $c as company: name, average_salary($c);

or

fetch
  $c, $d as company_diff: diff($c, $d);

Calling functions in insert, delete, put

This will not be allowed. Instead, required variables may be assigned in preceding match clauses.

Calling functions recursively

Since a match query may call functions, and since a function body is itself a match query, functions can call other functions too. Importantly, functions can call themselves recursively, or can call and be called by other functions in a mutually recursive fashion. For functions containing negations, there will be guards on recursion, following the idea of stratification.
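
The stratification idea can be pictured as a check on the function call graph: recursion is fine, unless a cycle passes through a negated call. The Python sketch below shows one way such a check could work. The graph encoding, the name is_stratifiable and the example function names are assumptions for illustration. The issue does not specify how TypeDB will implement the guard.

```python
from typing import Dict, List, Set, Tuple

# Call graph: caller -> list of (callee, called_under_negation).
CallGraph = Dict[str, List[Tuple[str, bool]]]


def reachable(graph: CallGraph, start: str) -> Set[str]:
    # All functions reachable from `start` through one or more calls.
    seen: Set[str] = set()
    stack = [callee for callee, _ in graph.get(start, [])]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(callee for callee, _ in graph.get(node, []))
    return seen


def is_stratifiable(graph: CallGraph) -> bool:
    # Reject any negated call whose callee can call back into the caller,
    # because that would put a negation on a recursive cycle.
    for caller, calls in graph.items():
        for callee, negated in calls:
            if negated and (callee == caller or caller in reachable(graph, callee)):
                return False
    return True


# Plain recursion plus a negated call from outside the cycle: accepted.
OK_GRAPH: CallGraph = {
    "ancestor": [("ancestor", False)],
    "orphan": [("ancestor", True)],
}

# Mutual recursion that passes through a negation: rejected.
BAD_GRAPH: CallGraph = {
    "even": [("odd", True)],
    "odd": [("even", False)],
}
```

Under this model, OK_GRAPH passes because the negated call does not sit on any cycle, whereas BAD_GRAPH fails because even and odd call each other through a negation.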

Here is a simple example of a recursive function:

define fun ancestor($child, $depth : integer = 5) -> {person} :
  match
    $depth >= 0;
    ($child, $parent) isa parentship;
    # Base case: the parent itself; recursive case: the parent's ancestors.
    { $a is $parent; } or
    { $a in ancestor(child = $parent, depth = $depth - 1); };
  return { $a };
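
As a rough illustration of how the depth guard bounds the recursion, here is a Python model of the ancestor function. It is a sketch under stated assumptions: the PARENTS dictionary is made-up data, and the model assumes that each parent found is itself returned as an ancestor, in addition to whatever the recursive call yields.

```python
from typing import Dict, List, Set

# Hypothetical parentship data: child -> parents.
PARENTS: Dict[str, List[str]] = {
    "dana": ["carl"],
    "carl": ["beth"],
    "beth": ["adam"],
}


def ancestor(child: str, depth: int = 5) -> Set[str]:
    # Guard: mirrors `$depth >= 0`; once depth is negative nothing matches.
    if depth < 0:
        return set()
    found: Set[str] = set()
    for parent in PARENTS.get(child, []):
        # Assumed base case: the parent itself counts as an ancestor.
        found.add(parent)
        # Recursive case: ancestors of the parent, with one less level to go.
        found |= ancestor(parent, depth - 1)
    return found
```

With the default depth the whole chain above dana is returned, while ancestor("dana", 0) stops after the direct parent because the recursive call fails the guard.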

Additional comments

Not yet addressed in the above:

brettforbes commented 2 months ago

This is brilliant, but can it be extended so Fetch statements can also be bound to a single name and parametrised?

The aim is to collapse a Fetch statement into a single name, with parameters, so it can be used interactively!!!

REASON: Fields like cybersecurity have composite Entities with many optional sub-parts. For example, consider the basic File object, with its 10 optional sub-objects.

The strength of TypeQL is that it can handle this intricacy with ease.

The weakness of TypeQL is that you can't easily handle all of these optional variations interactively. Class hierarchies can reduce query complexity in the match statement, but you need to know which optional variations your desired object contains upfront. Fetch can handle all of this, since it is quite clever, but the downside is the statement will be very long, as the variations are described sequentially.

Once query intricacy passes a certain point (e.g. 5-10 lines), it becomes impractical interactively for average users.

What we need is a means of treating the Fetch statement in the same way as a function, so it can be bound to a single name, parametrised, and used within other statements, particularly if it could be case-specific. The ideal would be to use lower case for the top-level entity (e.g. file), and title case for the composite object's pre-registered Fetch statement (e.g. File).

Then TypeQL could really become very powerful for interactive use, since currently it is too verbose to handle more intricate domains. Currently we have to use a higher level query language for our User Interfaces, and then down-transpile to TypeQL. It would be brilliant if we could surface TypeQL to users with composite objects, bound to single names.

lveillard commented 2 months ago

In case it helps, I packed some ideas around functions in the Functions section in my wishlist for 3.0 https://github.com/vaticle/typedb/issues/6945