Open segeljakt opened 1 year ago
Effects are not something I had considered. I am aware of the Flix programming language (https://flix.dev/) which has a "polymorphic effect system". Flix has a similar goal to Morel - integrating relational algebra and datalog into a functional programming language - but I think that effects are somewhat orthogonal to relational algebra.
It's possible that there are major problems in Morel that can be solved by effects. I have not thought very much about eager vs lazy, for instance. I have also not thought much about provable termination.
I don't see any problems with 'where queries can happen in the code and what code can happen inside a query'. A from
expression can happen anywhere that an expression can happen. Inside a from
expression are clauses such as where
and group
and these contain expressions. These are ordinary expressions, with no particular restrictions, except the usual restriction that they can only reference variables that are in scope. The variables that are in scope in those expressions are determined by the preceding clauses in the pipeline, and the environment surrounding the from
. The scoping rules are very straightforward (especially compared to SQL, where in a GROUP BY
query there are variables that are in scope that you are not allowed to use).
Effects are not something I had considered. I am aware of the Flix programming language (https://flix.dev/) which has a "polymorphic effect system". Flix has a similar goal to Morel - integrating relational algebra and datalog into a functional programming language - but I think that effects are somewhat orthogonal to relational algebra.
Interesting, I did not know about Flix. I will read up on it. Thanks 👍
I don't see any problems with 'where queries can happen in the code and what code can happen inside a query'. A
from
expression can happen anywhere that an expression can happen. Inside afrom
expression are clauses such aswhere
andgroup
and these contain expressions. These are ordinary expressions, with no particular restrictions, except the usual restriction that they can only reference variables that are in scope. The variables that are in scope in those expressions are determined by the preceding clauses in the pipeline, and the environment surrounding thefrom
. The scoping rules are very straightforward (especially compared to SQL, where in aGROUP BY
query there are variables that are in scope that you are not allowed to use).
I meant, for example, if one of the expressions inside a query clause contains imperative code, such as mutating a reference, it would probably make it impossible to optimize the query without changing its behavior.
Ah, I see. Yes, there several differences in assumptions between functional programs and SQL:
AND
, OR
and functions, which makes a difference if there are exceptions; I'm believe it respects the order of CASE
, COALSESCE
;WITH
clause) twice in the same query, the results are the same. This ensures a degree of determinism even if the query calls, say, the RANDOM
function. (If the CTE references correlating variables the determinism guarantee disappears, or is modified, I'm not sure.)I can see how effects systems could bridge between those differences. But before introducing effects, I'd like to see how far we can get staying within a pure, eager, strongly typed language with polymorphism and an assumption that the compiler aggressively inlines expressions.
Interesting list. Will Morel provide syntax for WITH
, RANDOM
, AND
, etc., or is that handled by the functional paradigm? 🤔 I can see WITH
being like a recursive function.
With effects, I meant only that they can be a solution to to verify such rules. I am not so interested in the "effect handlers" that some languages are hyping up right now 😅
SQL AND
is Morel andalso
.
There could be a random
function and either it would be non-deterministic (i.e. not based on a seed) or it would have internal state.
Also, at some point we will allow UDFs written in Java, so of course you could write an impure random function.
SQL's WITH
is analogous to Morel's let
. If the (relation) value defined in the let
is referenced twice, you should not be able to tell whether it is computed once or twice. (If you use a function with side effects, then you will be able to tell, but Morel doesn't guarantee what will happen.)
Have you thought about adding an effect system to Morel?
It requires some extensions to SML, but I think it could solve problems related to restricting where queries can happen in the code and what code can happen inside a query.