Open ChengCat opened 5 years ago
Thanks for this. The evaluation of each of the (m)
calls is repeated because there is one cycle in order to resolve the three-parameter +
, and then another cycle for the actual two-parameter +
function calls. In steps:
(+ (m) (m) (m))
.
int
s).(+ int int int)
: returns the +
macro from the operator macros library.(+ (+ (m) (m)) (m))
.(m)
.The first three evaluation of (m)
calls are not necessary. We can determine to dispatch to the +
macro from operator macros library only by looking at the number of arguments of +
, without evaluating types of each argument.
By reducing macro evaluations, we can:
I think the current approach of evaluating types at each dispatch also has its advantage. The rule of dispatching is probably simpler, and more transparent.
The first three evaluation of (m) calls are not necessary. We can determine to dispatch to the + macro from operator macros library only by looking at the number of arguments of +, without evaluating types of each argument.
This is true, but per your last comment, the current approach is used because it's simpler and more transparent. The only exception to the rule is for macros with names that are not overloaded at all (e.g. let
), because the compilation time cost of re-evaluation in many of those cases is very high.
The only exception to the rule is for macros with names that are not overloaded at all (e.g. let), because the compilation time cost of re-evaluation in many of those cases is very high.
I am concerned that, when the macro/function overloading gets used more in a program, the compilation time cost could also be very high. Currently, such overloading is not used much in the standard library, so this issue may not be so obvious.
What about this: we try our best to determine the dispatch without multiple macro evaluation, and when that fails, we abort with a compilation error. This approach is understandable to the programmer, but it is not so transparent. Whether a program compiles or not depends on the compiler implementation. This approach is extensively used by Rust, but I am not so fond of it. However, I dislike multiple macro evaluation more.
Regarding the typed macros, I have just come up with a new idea. For typed macro arguments, pass the fully expanded form as the argument value, rather than the original form to be expanded again. To justify this, it seems to me that, to determine the type of argument, its semantics have to be determined first. And therefore, it seems to me that the argument should be passed in as a semantic value rather than a piece of syntax. And the closest thing we can do in Dale, is to pass the fully expanded form in.
Also, type-of
should return both type and fully expanded form of the input form.
With the above considerations, I think the issue of #200 is satisfactorily addressed.
One of the problems of the typed macros also is following: You don't even need to use the form, that returns one type. So passing the evaluated macro in such a case really seems reasonable.
Dispatching on argument length before type dispatch also seems like a good idea.
Only one form? -> Function?
Typed versions exist for argument length? Evaluate all forms, pass expanded forms to macro or function, where types are required
Only one untyped macro exists for argument length?
Don't try to evaluate any argument at all (probably better, when you have to specify not to typecheck at all ((attr notype)
))
This way, the only way to evaluate a macro, which is a macro argument, multiple times is by not specifying the argument type for one field, but still just evaluate it in the macro itself. So all arguments get evaluated once, but when you don't specify the arguments to be typed, you get the unexpanded version of that macro. The untyped versions are intended as arguments, which should be transformed before expanding them.
@porky11 Glad to see your supportive comment again :)
My idea has evolved a bit, and I want to make some clarifications about my current idea.
My current idea is to:
When the compiler cannot guarantee the above conditions for a set of overloaded functions/macros, the compiler aborts with a compile error. The compiler uses a potentially complicated algorithm as dispatch strategy, and the programmer would consider the algorithm as an opaque compiler implementation detail. I can try to devise a specific algorithm, if the overall direction is good.
This will impose some limitations on how function/macro overload can be used in a program, but I feel that with a smart algorithm, the limitation would be small in practice.
Lastly, we also return the fully expanded form from type-of
(and any similar functions), so that multiple macro evaluation can be avoided even when the programmer explicitly requests for type information.
I can see one potential issue with my idea. It's possible that the fully expanded form of typed macro arguments to be interpreted as a different semantics in the macro output, and thus breaking the idea of passing a semantic value in. I currently think this flaw is acceptable.
@tomhrr What's your opinion with this? You don't have to do the compiler implementation, and I can help with that.
I think this is a reasonable idea. To make sure I understand properly, this is how I think it could be implemented:
Does this look OK?
Glad to see you agree with this idea :D
Do not allow macros to be defined with an untyped parameter at a given position if there is an existing function or macro that has a typed parameter at that position (and vice-versa).
There're two major exceptions that need to be handled. Typed and untyped parameter can coexist at the same position, if:
Consider the following examples:
(def example1 (macro intern (a b)))
(def example1 (fn intern void ((a int))))
(def example2 (macro intern (a (b float))))
(def example2 (fn intern void ((a int) (b int))))
For a systematic algorithm to check whether a given set of macro/function is dispatchable, I sketch a pseudo-code as below:
procedure check-dispatchable(macro-function-set)
let max-number-of-arguments = max { number-of-argument(f) | f in macro-function-set }
for i = 0 .. (max-number-of-arguments + 1) ;; the range is inclusive
let candidate-set = { f | f in macro-function-set,
and f can used with i arguments }
check-candidate-set(i, candidate-set, {})
end
end
procedure check-candidate-set(num-arguments, candidate-set, evaluated-parameter-positions)
;; 1. try differentiate by evaluating the type of one argument
for i = 0 .. (num-arguments - 1) ;; the range is inclusive
if i not in evaluated-parameter-positions
and effective-argument-type(f, i) is Typed for all f in candidate-set then
let grouped-candidate-sets = { {candidate | candidate in candidate-set
and (effective-argument-type(candidate, i) == Typed(type)
or effective-argument-type(candidate, i) == special-wildcard-type) }
| Typed(type) == effective-argument-type(f, i) }
if sizeof(grouped-candidate-sets) > 1 then
foreach group in grouped-candidate-sets
check-candidate-set(num-arguments, group, evaluated-parameter-positions + {i})
end
return
end
end
end
;; 2. now candidate-set is not differentiable
for i = 0 .. (num-arguments - 1) ;; the range is inclusive
if there exists f1, f2 in candidate-set, such that effective-argument-type(f1, i) is Typed
and effective-argument-type(f2, i) is Untyped then
report that this candidate-set is not dispatchable
end
end
;; 3. now this candidate-set is dispatchable, with each argument position being
;; either typed in all candidates or untyped in all candidates. Use the
;; current design of dispatch order to dispatch to one of the candidates.
end
;; return `Typed(<type>)` or `Untyped()`
function effective-argument-type(macro-or-function, argument-position)
if argument-position < number-of-argument(macro-or-function) then
return argument-type(macro-or-function, argument-position)
elseif macro-or-function is a vararg macro then
return Untyped()
elseif macro-or-function is a vararg function then
return Typed(special-wildcard-type)
end
end
;; return int
function number-of-argument(macro-or-function)
return the number of arguments of f, except varargs (and
also exclude the 'rest' parameter of macro)
end
EDIT 1: improve how special-wildcard-type
is handled.
There're two major exceptions that need to be handled. Typed and untyped parameter can coexist at the same position, if:
the two macros/functions take different number of arguments; or the two macros/functions can be differentiated by another typed parameter.
Yep, agreed for both of these.
I have tried to implement this, and found a problem. Currently, there's absolutely no way to obtain the fully expanded form. The most sensible way to address this seems to include the fully expanded form in ParseResult
, which already contains a semantic value represented in llvm::Value
, and the corresponding Dale type. Since ParseResult
is used across major part of the code base, the work seems to be too overwhelming for me alone.
I have also found an unanticipated issue in the interaction between namespaces and overloaded functions, but that issue is not difficult to address with some small changes to the above pseudo-code.
OK, thanks for looking into this. I will see about including the expansion in the ParseResult
.
Macro
m
is evaluated 6 times during compilation, while I expectm
to be evaluated only 3 times. If I am not mistaken, in the context+
is overloaded as follows: