Open AndreVanDelft opened 10 years ago
The work needs to be done here:
Can you please elaborate more precisely on point (1), that is, the exact requirements for the return values? Then I'll try to come up with implementation ideas.
The result type is a parameter to the Script type. E.g.,
type MyScriptType = Int => Script[Int]
script..
a:Int = {0}
A script result value is assigned using the ^ postfix operator attached to either a code fragment or script/method call; alternatively one may use the variable named "$".
If there is only one operand in the script body (presumably a code fragment or a script/method call), then the ^ postfix operator may be left out. Maybe this feature is not wise: in that case there is no way to say that the defined script yields no result at all.
E.g.,
script..
a:Int = {0} // becomes 0
b:String = {0}; {""}^ // becomes ""
c:String = {0}; {$=""} // becomes ""; no result type inference
The first example means to express that
It is possible to capture the result of a call to a script or method in a variable of the form $number or $name, using the "^" operator. If no such operator is specified, then the result of the call to the script or method with a given name (as appearing in the text) becomes available as $name. E.g.,
script..
a = aCall; print($aCall)
b = aCall^A; print(s"\$A=${$A}")
c = aCall^7; print(s"\$7=${$7}")
d = {1}^7; print($7)
Note: there should be no space between "^" and its right-hand side. A special case is when there is no right-hand side; the result of the call then goes into "$", i.e. the result variable for the calling script. The "$" symbol was taken because it also appears in YACC. This new special meaning of "$" clashes with the meaning of "$" in interpolated strings. Maybe it is better to drop "$" in favour of "^".
E.g.,
script..
a = {0} // inferred: Int
b = {0}; {""}^ // inferred: String
c = {0}; {""} // nothing (Unit)
d = {0}^; {""}^ // inferred: Any
(Work in progress)
Currently there is a type Script, which is returned by the method DSL._script. This should get a type parameter, that specifies the script result type. For scripts without a result, the type parameter should be Unit. It would be nice if Script = Script[Unit] but FTTB there are no default type parameters in Scala. We could bypass the problem by letting the compiler append "[Unit]" to "Script" in case it appears without a type parameter.
Variable names starting with a $ are now possible; however these must be defined by an enclosing script. Maybe the Scanner and the Parser need to be changed for this.
Scripts store their $ variables locally, as if they are local variables. The result value is transferred to the caller for capturing whenever they have success, i.e. at the same moments that ?parameters would be transferred.
Wouldn't it be more natural if we just use local variables to store result values? For example:
def script..
foo: Unit = var x: Int = bar ; {println(x)}
bar: Int = {0}^ & {println("returned 0")}
In case of ordinary functions, for example, it would be rather inconvenient to work with them if they automatically assigned their return types to specifically named variables:
def currentTime: Long = System.currentTimeMillis()
def foo = {
currentTime()^1
currentTime()^2
println($2 - $1)
}
This is unnatural and inconvenient; a user would prefer something like:
def foo = {
val t1 = currentTime()
val t2 = currentTime()
println(t2 - t1)
}
That is, all the assignments should be treated uniformly. If a result of a function is assigned in a "a = f(x)" manner, then script results also should be assigned in a similar way, because otherwise it'll make the code less readable, will make learning SubScript more difficult and generally add some complexity to it.
Very good to think about such aspects and to discuss these before implementing. However, I slightly disagree, for several reasons:
x^1 & y^2; print($1+$2)
var v1=0; var v2=0; (v1=x)&(v2=y); print(v1+v2)
OTOH, the "var" variant allows for specifying a type explicitly. Maybe we should support something like that: "bar^x:Int". So I really want the "^" result capturing. However, if the "var" syntax would be easy to implement (I think so) then that would be nice to have as well; may the fittest survive.
Please note I have updated my previous comment, in the section "Result capturing".
At runtime there is no such notion as a script; everything is represented by the call graph and the template tree. Even script calls are nodes in the graph. While convenient in high-level discussions, it is inconvenient to talk in terms of script result values during actual implementation. Instead, we should use the term "node result values" in such discussions. What precisely should we understand by it? Well, in your examples even an atomic normal code node could have a result value:
{0}^foo
therefore, we should assume that every node should be capable of having some result value.
Bottom line: every node can have result value.
There are nodes, for which it is relatively easy to determine result value. For example, most of atomic nodes, like normal code node:
{
println("Hello world")
3
}
Here, we can just apply standard Scala rules to determine the value of the code block and assume this to be the result value of the normal code node. However, there are more tricky nodes, like n-ary operators:
a & b & c
a = {3}
b = {"foo", 10}
c = {12, "bar"}
Here, we can't determine result of "&" without some thinking. In general, different flavours of n-ary nodes may exist, and they may require completely different strategies to determine the result value. Therefore, the most efficient way of capturing this undefined nature of strategy may be simply to define this strategy as an abstract method in a CallGraphNodeTrait:
def result: T
and let children nodes care about the implementation.
Bottom line: result computation strategy should be left as an abstract method and needs to be implemented for different nodes individually.
The majority of n-ary nodes, however, won't require very tricky logic to evaluate their result. Most of them would in fact reuse the results of their children to determine their own result. Examples include if-else, ";", "+" and others. Therefore, an efficient though unobtrusive way of making this process easier should be defined. A nice way of doing this may involve the capability to mark children nodes with flags (escalation flags) that would be used by their parents during their result computation to "navigate" through their children in a most primitive way. For example, here's how ";" may determine its result value, on the presumption that only one of its children is marked with the "escalate" flag:
trait CallGraphNodeTrait[T] {
  ...
  def escalate: Boolean
  def result: T
}
class N_ary_op[T] {
  def result =
    children.filter(_.hasSuccess). // Only successful nodes
      filter(_.escalate).          // Only marked for escalation
      head.result                  // Take head, or fail with an exception if there is none
}
Bottom line: nodes can be marked with escalation flag for ancestor n-ary nodes to be able to navigate through them.
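To make the escalation idea concrete, here is a minimal self-contained sketch. The names Node, Leaf and SeqNode are hypothetical stand-ins, not the actual SubScript VM classes, and the sketch returns an Option instead of throwing when no escalated child exists, which is an assumption of mine rather than part of the proposal.

```scala
// Hypothetical, simplified node model; not the real SubScript call graph classes.
trait Node[+T] {
  def hasSuccess: Boolean
  def escalate: Boolean
  def result: T
}

// A leaf whose result is simply the value its code block produced.
final case class Leaf[T](value: T, hasSuccess: Boolean = true, escalate: Boolean = false)
    extends Node[T] {
  def result: T = value
}

// A sequence-like operator: its result is that of the first child
// that both succeeded and is marked for escalation.
final case class SeqNode[T](children: List[Node[T]]) extends Node[Option[T]] {
  def hasSuccess: Boolean = children.forall(_.hasSuccess)
  def escalate: Boolean = false
  def result: Option[T] =
    children.collectFirst { case c if c.hasSuccess && c.escalate => c.result }
}
```

With this model, SeqNode(List(Leaf(0), Leaf(42, escalate = true), Leaf(7))).result picks the value of the marked child.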
First, the fact that a node can have a result should be expressed, together with the open-ended nature of the result computation strategy. Next, the fact that this result can be bound to some local variable (or not, which is why we should use Option) should be reflected. Finally, the escalation flag should be defined in the most trivial and straightforward fashion.
trait CallGraphNodeTrait[T] {
def result: T
var resultVariable: Option[LocalVariable[T]] = None
var escalate: Boolean = false
}
For other nodes, we'll have to implement the result computation strategy. For atoms this should be trivial:
var result: T
and then in CodeExecutor set this var to a concrete value once the atom's code has been computed (that is, to the result of executing this atom's code). For other nodes this may not be so trivial. For example, in the case of ";" escalation flags should be used:
def result: T = children.filter {x => x.hasSuccess && x.escalate}.head.result
For "&", "||" and other operators, different strategies might be desired.
On node activation, we should check whether its resultVariable option is defined - in this case, we assume this node's result should be bound to a variable. If it's not defined, we assume the opposite. We take this variable, get its name and define corresponding variable in the nearest n-ary ancestor:
resultVariable match {
case Some(LocalVariable(name)) => node.n_ary_op_ancestor.initLocalVariable(name, node.pass, ???) // notice: null can't be used instead of '???', since upper bound is Any. Further thinking is needed to come up with appropriate initialization strategy.
case None =>
}
On success, we assume that the node that succeeded is ready to present its result. So we check whether we need it (to bind it to the local variable) and, if we do, we compute it and use it.
resultVariable match {
case Some(v) => v.at(node).value = node.result
case None =>
}
Due to escalation flag introduction, I propose to change the syntax the following way:
Very interesting; this way we would get more and more towards a language that manipulates data on a high level...but I am not entirely convinced yet.
A nice way of doing this may involve capability to mark children nodes with flags - escalation flags - that would be used by their parents during their result computation to "navigate" through their children in a most primitive way.
What should be done with the results of child nodes that have already been deactivated? How should that be implemented? As a rule of thumb: if you can define some simple rules that are easy to implement, then the feature is often explainable and useful.
Parser node^name^ - escalate, bind result to local variable named name Implement that like everything in the parser so far.
Do you have a use case for this? If not, I think we should not yet support this, FTTB.
I miss support for setting the result value of the current script. So why not add the following: node^^ - bind result value to the local value for the result of the enclosing script
Bottom line: every node can have result value.
I am not sure how to define this for parallel operators, but even if it would not be nicely possible, then we may define some rules that still would be worthwhile for most other node types.
Before this would be implemented though, I would like to see quite a few use cases:
footnote(??n: Int): String = if (fnFormat== NUMBER_DOT ) (??n ".")
else if (fnFormat==PARENTHESIZED_NUMBER_DASH) (footnoteRef,??n "-")
; line^^
; .. (line ==> {: $ += _.trim :})
You're right, deactivated nodes will be out of reach. I think, we can just make a callback, onChildSuccess in the n-ary node and call it each time some child has success. Child nodes' results will be accumulated using this method. For instance, for ";" we can use something as follows:
var childrenResults: List[(Boolean, T)] = List() // (escalate, result)
def onChildSuccess(c: CallGraphNodeTrait) = childrenResults ::= (c.escalate, c.result)
def result = childrenResults.filter {case (escalate, _) => escalate}.map {case (_, result) => result}.head
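The callback approach can be sketched in isolation as follows. AccumulatingNode is a hypothetical name, the callback takes the flag and value directly instead of a node, and results are appended in order of success (the snippet above prepends), so this is an illustration of the idea under those assumptions rather than the proposed VM code.

```scala
// Hypothetical sketch: an n-ary node records child results as each child succeeds,
// so the values stay available after the children have been deactivated.
final class AccumulatingNode[T] {
  // (escalate, result) pairs, in order of success
  private var childrenResults: Vector[(Boolean, T)] = Vector.empty

  def onChildSuccess(escalate: Boolean, result: T): Unit =
    childrenResults = childrenResults :+ ((escalate, result))

  // Result of the first child that was marked for escalation, if any.
  def result: Option[T] = childrenResults.collectFirst { case (true, r) => r }
}
```

Whether the first or the last escalated child should win is a design choice that the thread leaves open; the sketch takes the first.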
Use case for node^name^:
(if (expression) {0} else {1})^result^ ; {println("LOG: returned " + $result + " from ; node")}
node^^ - bind result value to the local value for the result of the enclosing script
I don't quite understand. Do you mean to bind result to $node?
On the question of examples from the Scala Workshop: can you please be more specific? As far as I have seen, nothing there involved result values. Have I missed something?
I don't quite understand. Do you mean to bind result to $node?
No; to $; i.e. the result of the script that this code appears in.
The Scala Workshop paper is "Dataflow Constructs for a Language Extension Based on the Algebra of Communicating Processes". The result values are mentioned in the abstract and elaborated on page 6.
This calculator built with parser combinators would be a good use case.
To add result and failure values to scripts we could easily generate some additional code for scripts. E.g., for a script
test(i:Int) = print,"Hello" println,i
we currently generate code like this:
def _test(i:Int) =
_script(this, 'test) {
_seq({print("Hello ")}, {println(i) })
}
And this could become:
def _test(i:Int) =
var result: Int
var failure: Throwable
_script(this, 'test) {
_seq({print("Hello ")}, {println(i) })
}
There are two problems with this:
These may be solved by creating a class for scripts. I am thinking of
import scala.language.reflectiveCalls
abstract class Script[R](_owner: AnyRef, _name: Symbol) extends N_script {
var result : R = _
var failure : Throwable = null
}
Usage:
def test(i:Int)(_c: N_call) = new Script[Int](_owner=this, _name = 'test) {
val template = _seq({print("Hello ")}, {result=i; println(i) })
}
_name, _template and _owner start with underscores so that these clearly belong to the enclosing script.
In the latter example we can access result between the braces because we are subclassing.
In the currently generated code we cannot do that, because the braces form a parameter to the function _script, so result is not brought into scope.
But I would like not to touch the current code generator much for two reasons: it would take time, and the current solution is clear and simple.
I was wondering: can we turn _script into a macro, so that it effectively would transform itself into the new Script[] { ... } code that brings result into scope?
In that case we would only need to add the type parameters to the generated code.
I experimented a bit with the macros but I did not manage to bring result and failure into scope this way. The macro call requires that the actual parameters are well-typed expressions before the macro is called.
There would also be another problem with the Script class approach (without macros): this and its features will point to the current script rather than the current object, in contrast to the programmer's expectation.
It may be possible to rewrite "this" using a macro: http://meta.plasm.us/posts/2013/08/31/feeding-our-vampires/ but I think this will be quite complicated.
I will therefore add the result and failure fields in the compiled code.
Why do we need to put result and failure there? Wouldn't it be more intuitive to put these variables into the Call Graph node classes? What is the advantage?
Also, wouldn't it be more rational to use Try[T] instead of result and failure, so that we can represent the result of a computation with either Success or Failure? Or null, if it didn't terminate at all.
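A small sketch of that three-state representation, using the standard scala.util.Try. The class name Outcome and its describe method are hypothetical, introduced only for illustration.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical holder: null stands for "not terminated yet", as suggested above.
final class Outcome[R] {
  var state: Try[R] = null

  def describe: String = state match {
    case null       => "still running"
    case Success(v) => s"succeeded with $v"
    case Failure(e) => s"failed: ${e.getMessage}"
  }
}
```

An Option[Try[R]] would avoid the null, at the cost of one more wrapper; the thread itself settles on a nullable Try.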
Inside scripts several new features become available, accessible from Scala code, e.g. in code fragments and if-conditions:
script - the „here” of the current script
$ - a Try for result&failure values of script
$result - the Success part of $
$failure - the Failure part of $
The here of a code fragment will also have a private result/failure value.
It may be accessed as here.$.
Note that there is no big need to get easier access, since the result of a code fragment becomes just the value of its executed Scala block.
The failure may be set using a call to a method here.fail(failureDescription:String). Reading it would be no use.
The new fields do not require compiler changes; they will be supported by
ScriptResultHolder
Script class
CallGraphNode.Child and Script
subscript.Predef
script has type Script[R] for some R. This means R must be known in the context. Probably all concrete node classes for the template tree and for the call graph will therefore need such a parameter, and DSL methods need to have those too. This may be the biggest challenge of the result/failure support operation.
Note: Since results are now in a Try, numeric results are not initialized to 0. This has to be done manually.
ScriptResultHolder
The $ feature for results and failures is available from this new trait:
trait ScriptResultHolder[R] {var $:Try[R] = null}
Script
The script field is, like here, a node in the call graph. It has a special class: Script.
It has a 1 to 1 relationship to its template.
case class Script[R](template: TemplateNode.Child, p: FormalParameter[_]*) extends N_script[R] with ScriptResultHolder[R] {
def script = this
}
N_code_fragment
N_atomic_action is renamed to N_code_fragment, since {!…!} does not mark an atomic action.
N_code_fragment gets a type parameter:
trait N_code_fragment[Node,R] extends CallGraphLeafNode with ScriptResultHolder[R] {
The _script method in DSL returns a new instance of a Script node, with an equally newly generated template:
def _script[R](owner:AnyRef, name:String, childTemplate: TemplateNode.Child, p: FormalParameter[_]*) = {
val template = T_script(owner, "script", name, childTemplate)
new Script[R](template, p:_*)
}
subscript.Predef allows for convenient access to the script result variable:
def $ [R]: Try[R] (implicit s: Script[R]) = s.$
def $result [R] (implicit s: Script[R]) = s.$.asInstanceOf[Success[R]]
def $failure [R] (implicit s: Script[R]) = s.$.asInstanceOf[Failure]
def $_= [R](v:Int, implicit s: Script[R]) = s.$=v
def $result_= [R](v:Int, implicit s: Script[R]) = s.$=Success(v)
def $failure_=[R](v:Int, implicit s: Script[R]) = s.$=Failure(v)
T_call gets type parameter R.
I like this idea of the script variable: it makes result determination more flexible and developer-friendly, without hard-coding any logic into the VM core classes. One can just write script.$ = Success(n) from his/her code to set the result, very nice.
How does the SubScript compiler know the difference between "." as a break operand and "." as an object-oriented path separator when script is used intensively?
(Boldface marks answers by AvD): If there is white space before the ".", or if it cannot be a path separator at all, then it is a break operand. A similar rule holds for parentheses.
Also, maybe, a better way of doing things is just to make here accessible at places where script is supposed to be accessible? script points to the Script node, here points to the current node, but from the Scala code context rather than from a script, so there will be no naming collision. An advantage of doing things this way is that it is more intuitive for the user. It is not very convenient to have to remember a whole bunch of new keywords to use SubScript.
"here" is only (or very mainly) available in code fragments, script calls, if conditions, while conditions, annotations. The latter has also a "there" value. These are the only fixed "keywords"; the rest is rather flexibly defined in Predef.
Also, a "script" (not a variable, but a "script") is a rather artificial notion on the graph level: we declare a certain region of a graph (I believe, all the nodes located under a Script node) to be a "script", and give it some special properties. But a reasonable question arises: if this region has "scriptic" properties (I believe, the only such property is access to the nearest Script ancestor on demand and setting its result), why can't some other bunch of nodes (or an individual node) have these properties (access its direct parent (not Script) and set its result)? If Script can have result values, why can no other node have one? If done this way, in some time we can start thinking about "anonymous scripts" to make ";" or "&" have a result value (for some reason), because normally it can't have one, since it doesn't have its own Script as a direct ancestor. Adding new concepts without a reasonable need is not a good thing.
Script lambdas become scripts, so they have their own result values. If a use appears for giving a region its own result value, then we might create a lambda for it by enclosing it in brackets "[" ... "]".
In my opinion, a good way of doing things would be to mix the ResultHolder trait into the CallGraphNode trait, so that every node can have a result, not only scripts - this way, on the graph level there are no "privileged" bunches of nodes - "scripts" - and every node has the same result-capable properties.
FTTB we must get something simple working soon; when a first implementation is ready we can experiment with use cases and see if we want something more general like what you describe here.
And, I don't think it is a good thing to expose the graph and all other internal machinery to the end user. Graph is just an implementation of the idea, it is not a good thing to expose it to the end-user API.
This is not an urgent issue. We can make features package-private later.
Can you please clarify the need for the Script class? In your architecture, its only difference from the N_call class is that the N_call node knows only the name of a function that will yield its template, as opposed to the Script class, which accepts its template at construction time. But I can't see how it will help in the case of script result values.
Though, the Script class has an advantage compared to the N_call class. A class that has a ready template is always more intuitive than a class that only has a symbol of a template. Maybe it is a good idea to replace the N_call class with the Script class?
N_call is a caller, not a callee, so the result value does not belong there. Besides the solution should also work for communicating scripts: a,b={}. There are multiple callers and a single callee. The latter should carry the unique result. Maybe the current design is far from perfect, but it is important to get something working soon with the intention to enhance later. Otherwise we will suffer analysis paralysis.
Also, I suppose, $ should be Try[R], not R, because you say at the very beginning that it is a Try.
Yes, thanks, I saw that too, when I started to code.
I'm finally getting convinced. So we make a special node, Script, that will be responsible for result values of a certain graph region. We manipulate the result of this Script node from its script body. Yes, sounds nice. I think, we can do that and see how it behaves in various use cases.
However, an issue of type inference arises. If the programmer decides by himself what the result would be, then we have to teach the compiler to find the script.$ = Success(foo) constructs and infer the script type from foo's type.
Reply: That will not be an issue. We do not do that kind of inference; only for implicit and explicit ^ occurrences.
My previous big posting is already outdated; during implementation some things came up that required changes. The main change is that the script value in scripts is now implemented by supplying to the DSL._script method a parameter childTemplateAt: Script[R] => TemplateNode.Child. The actual parameter there (generated by the SubScript compiler) has this script name; it will get the value of a new Script instance, and then produces a child template for that Script. This mechanism is partly comparable with the way here and there are brought into scope in code fragments, annotations etc, but maybe even more complicated.
The 'old' idea of my previous posting, that gave class Script a script value member, does not work, since accessing it using Predef features from the various nodes would imply that all these nodes would need an extra type parameter, for the Script's result type.
Inside scripts several new features become available, accessible from Scala code, e.g. in code fragments and if-conditions:
script - the „here” of the current script
$ - a Try for result&failure values of script
$result - the Success part of $
$failure - the Failure part of $
The here of a code fragment will also have a private result/failure value.
It may be accessed as here.$. Note that there is no big need to get easier access to that value, since the result of a code fragment becomes just the value of its executed Scala block.
The failure of here may be set using a call to a method here.fail(failureDescription:String). Reading it would be quite useless.
The new fields require compiler changes; apart from the already discussed script parameter, all node types that can produce values of various types need type parameters for those: code fragments, script calls, maybe later also if-else-expressions and do-expressions. Making the compiler provide appropriate type parameters is tedious, so FTTB we may implement this just rudimentarily: code fragments get type Any; for script calls we can hopefully do better.
Also there will be support by
ScriptResultHolder
Script class
CallGraphNode.Child and Script
subscript.Predef
script has type Script[R] for some R. For the time being, it is the type of the script if that has been explicitly provided in the declaration; else it is just Any. Later we can hopefully infer the result type from ^ result specifiers.
Note: Since results are now encapsulated in a Try, numeric script results are not initialized to 0. This has to be done manually.
ScriptResultHolder
The $ feature for results and failures is available from this new trait:
trait ScriptResultHolder[R] {var $:Try[R] = null}
Script
Instances of class Script have a 1-to-1 relation with their respective templates.
case class Script[R](var template: T_script, p: FormalParameter[_]*)
extends CallGraphTreeNode with ScriptResultHolder[R]
{type T = T_script}
T_code_fragment
N_code_fragment
T_atomic_action and N_atomic_action are renamed to T_code_fragment and N_code_fragment, since {!…!} does not mark an atomic action.
N_code_fragment gets a type parameter:
trait TemplateCodeHolder[R,N] extends TemplateNode {val code: N => R}
trait T_code_fragment[R,N<:N_code_fragment[R]] extends T_0_ary with TemplateCodeHolder[R,N]
trait N_code_fragment[R] extends CallGraphLeafNode with ScriptResultHolder[R] {
type T <: T_code_fragment[R,_]
...
}
The _script method in DSL returns a new instance of a Script node. This node is brought under the name of script into the scope of its template code. For this purpose the DSL._script method accepts a parameter childTemplateAt: Script[S]=>TemplateNode.Child.
First a preliminary template is created for the Script without the child template yet. Then the Script is created using that template. Then the child template is created using the passed childTemplateAt method and the created Script. Then this child template is connected to the script template.
def _script[S](owner:AnyRef, name:Symbol, p: FormalParameter[_]*)(childTemplateAt: Script[S]=>TemplateNode.Child): Script[S] = {
val template = T_script(owner, "script", name, child0=null)
val result = new Script[S](template, p:_*)
val childTemplate = childTemplateAt(result)
template.setChild(childTemplate)
result
}
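The same two-phase construction can be tried out in isolation with stripped-down stand-ins. Template, ScriptNode and TwoPhase are hypothetical names, and the child template is reduced to a plain String purely for illustration; only the order of the steps mirrors the method above.

```scala
// Hypothetical, stripped-down stand-ins for T_script, Script and the child template.
final class Template(val name: String) { var child: String = null }
final class ScriptNode[S](val template: Template)

object TwoPhase {
  def script[S](name: String)(childTemplateAt: ScriptNode[S] => String): ScriptNode[S] = {
    val template = new Template(name)       // 1. preliminary template, no child yet
    val node = new ScriptNode[S](template)  // 2. script node built on that template
    val child = childTemplateAt(node)       // 3. child built with the node in scope
    template.child = child                  // 4. child attached to the script template
    node
  }
}
```

A call such as TwoPhase.script[Int]("times") { script => s"body of ${script.template.name}" } shows the point of the detour: the function literal can refer to the script node before the template is complete.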
subscript.Predef allows for convenient access to the script result variable:
def $ [R] (implicit s: Script[R]): Try[R] = s.$
def $result [R] (implicit s: Script[R]): R = s.$.asInstanceOf[Success[R]].value
def $failure [R] (implicit s: Script[R]): Throwable = {val f=s.$.asInstanceOf[Failure[R]]
if(f==null)null else f.exception}
def $_= [R] (v: Try[R] )(implicit s: Script[R]) = s.$=v
def $result_= [R] (v: R )(implicit s: Script[R]) = s.$=Success(v)
def $failure_=[R] (v: Throwable)(implicit s: Script[R]) = s.$=Failure(v)
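The implicit-parameter pattern behind these accessors can be exercised with a reduced model. MiniScript and ResultAccess are hypothetical names, and the accessors return Option rather than casting with asInstanceOf, so that reading a result that is absent or failed does not throw; that is a variation of mine, not what the Predef code above does.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical reduced model of a script node holding a nullable Try.
final class MiniScript[R] { var outcome: Try[R] = null }

object ResultAccess {
  // None while the script has not terminated, or when it failed.
  def result[R](implicit s: MiniScript[R]): Option[R] =
    Option(s.outcome).flatMap(_.toOption)

  // None while the script has not terminated, or when it succeeded.
  def failure[R](implicit s: MiniScript[R]): Option[Throwable] =
    Option(s.outcome).flatMap(_.failed.toOption)

  def succeed[R](v: R)(implicit s: MiniScript[R]): Unit = { s.outcome = Success(v) }
  def fail[R](e: Throwable)(implicit s: MiniScript[R]): Unit = { s.outcome = Failure(e) }
}
```

With an implicit MiniScript[Int] in scope, ResultAccess.succeed(5) sets the result without naming the script, which is the convenience the Predef definitions aim for.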
Many classes for template nodes, call graph nodes, script executors and code executors get a type parameter R for script results and node results.
To complete the previous comment, a typical call to DSL._script is
def _times(n:Int) = {_script(this,'times) {(script:Script[Unit]) => _while{implicit here=>pass<n}}}
This would be equivalent to
def script times(n:Int) = while(here.pass<n)
Changed the terminology: result values instead of return values