Open ozra opened 8 years ago
Meta-code causes indentation that has to be compensated for in verbatim-paste-code
Why?
The only conclusion I can arrive at is that in template macros explicit end-tokens will be required, and indentation is not significant.
It's a minor limitation. I have no problem with it.
Template Def Macro - -""- for creating defs.
What?
AST Macros are run macros where you work on the AST and return the result instead of working with a template
I'm afraid you'll have to explain things a bit better. My understanding of the AST is pretty much nonexistent. I don't have a background in compiler design.
Looking back I realize I wouldn't understand that myself, had I not written it.
1 (& 2). Let's take a practical example in crystal code, from the compiler. These are cut out parts of the whole util only to highlight the problems faced, they are part of the functionality for debugging the AST-tree (courtesy of, iirc, @bcardiff):
macro dump_prop(name)
  io << "\n" << " " * (level+1) << {{name.stringify}} << ": "
  if v = {{name}}
    if v.is_a?(Array)
      if v.empty?
        io << "[]"
      end
      v.each_with_index do |e, i|
        io << "\n" << " " * (level+2) << "[" << i << "]"
        e.dump_inspect(io, level + 2)
      end
    else
      v.dump_inspect(io, level + 2)
    end
  else
    io << "nil"
  end
end
module Crystal
  abstract class ASTNode
    macro def dump_inspect(io, level) : Int32
      io << "\n" << " " * level << {{@type.name}} #<< '\n'
      {% for ivar, i in @type.instance_vars %}
        {% unless {
          "call": true, # for recursion in Block..
          "a_lot_more_here_cut_out_for_the_example"
        }[ivar.stringify] %}
        dump_prop @{{ivar}}
        {% end %}
      {% end %}
      0
    end
  end
end
For the first macro there is no meta-code, so that would be no problem. Moving on to the dump_inspect macro, we have a bit of meta. The {% for... and {% unless... meta-code is indented, which creates an indent offset. You see dump_prop being used in verbatim code (to be pasted into the final code); it would end up at 8 spc, while in real Onyx code it should be at 6 spc (aligned with io << "\n"... above). Of course, simply subtracting the meta-code indentation (at +2) gives 6, so, as already mentioned, it's arithmetically easily solvable.
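The subtraction described above can be sketched in Python. This is only an illustration under assumptions: the function name `effective_indents`, the fixed `meta_step` of 2 spaces per open meta block, and the sample template are all made up for the example, and only single-line `{% ... %}` directives are handled.

```python
import re

# A line that consists solely of a {% ... %} meta directive.
META = re.compile(r"^\s*\{%.*%\}\s*$")
OPENERS = ("if", "unless", "for")


def effective_indents(lines, meta_step=2):
    """Return (indent, text) per verbatim line, with meta indentation removed.

    Assumption for this sketch: every open meta block indents its body by
    exactly `meta_step` spaces.
    """
    depth = 0
    result = []
    for line in lines:
        stripped = line.strip()
        if META.match(line):
            keyword = stripped[2:-2].split()[0]
            if keyword == "end":
                depth -= 1
            elif keyword in OPENERS:
                depth += 1
            continue  # meta lines never reach the generated code
        raw = len(line) - len(line.lstrip(" "))
        result.append((raw - meta_step * depth, stripped))
    return result


# Hypothetical template: the verbatim call sits at 8 spaces in the macro
# source but belongs at 6 in the generated code.
sample = [
    '      io << "start"',
    "      {% for ivar in ivars %}",
    "        dump_prop @{{ivar}}",
    "      {% end %}",
    "      0",
]
indents = effective_indents(sample)
```

For this simple one-level case every verbatim line comes out at 6 spaces, which is the "one subtraction away" situation.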
Now instead imagine you're generating two different if-clauses depending on a meta-level condition. Now you have the if-part indented +2spc in the macro-code, inside the meta-if-condition. But the body (for this example) is invariable, so that will be indented because that's the correct level for verbatim code, thus the if and its body will be at the same level of indent, which will look confusing (even though for the compiler it's just one subtraction away). Now, imagine several levels of meta-conditional generation of verbatim-if-conditions and it quickly becomes unmanageable for the brain to keep track of what indents there "really are".
So two options:
I personally think it is clear that this meta-coding-context is too distanced from normal code to be usable with indent-sensitivity.
The macro-meta-code follows indent rules as usual (it's just Onyx-code), but the contents are just treated as "arbitrary" string-data in an indent-vague context, which requires stating the expression-blocks' range.
Did this make it clearer?
To reiterate:
template in Onyx would represent a macro that looks like html-template code (like in Crystal).
macro would work with the AST for more powerful macro-work. You could live your life without ever using it, but if you're one of those who find use for it, it's very powerful and in this case also convenient.
So the big question still. Actual syntax! C++ is the worst example ever (working with templates), since you can't code macros in C++. Crystal macro syntax is very unclear in my eyes. What it comes down to is simply delimiters, that's all.
Feels like I've forgotten something. Well, it will come around.
macro dump_prop
You know, it's quite the coincidence. I wrote pretty much that exact same function in gorillascript just the other day.
Shame gorillascript is dead. I'll have to translate it back into coffeescript at some point.
Now instead imagine you're generating two different if-clauses depending on a meta-level condition.
Nope, you've lost me again.
In crystal you can define def-macros.
I've never understood those. Could you explain further?
By making a macro that runs an external program, you can pass the args of the macro to that...
Why would I run an external program in a macro?
Did this make it clearer?
Not....as such, no ;)
macro would work with the AST for more powerful macro-work
Sounds very useful. But I don't yet understand it :(
I'd be happy to offer suggestions, once I understood it.
(You could always go with LiveScript [it's alive], you may or may not like it more than coffeescript. It has dash-identifiers, but I don't know much about GS for other similarities. Onyx draws a lot of inspiration from LS ;-) )
Nope, you've lost me again.
Contrived pseudo-Onyx code (since there is no macro syntax yet):
template foo(x) =
   {! if x.of? StringLiteral !}
      if string-specific-check {= x =}
         x = string-specific-code-on x   -- note that this is indented 3spc (one indent)
                                         -- more than the rest of the body below
   {! else !}
      if generic-check {= x =}
   {! end !}
      do-things-inside-the-if-body x     -- note that the indent is the same for this as
                                         -- the if-condition-head
      right-here
      -- this is where a `end`-token would be required if so decided
   do-stuff-after-the-if-body
I've never understood those. Could you explain further?
The important difference is in what compile phase the macro is expanded. Macros generally are expanded as "dumb text", well, almost. "def-macros" are expanded at the end of type inference, where all nodes have been typed.
Why would I run an external program in a macro?
Many reasons and different uses. Perhaps you want to set version to a value gotten via git from tags, perhaps you want some data, or a json requested from some web-site freshly updated for each compile, or for this case to extend the compiler and mutate code in a program that returns the mutated code which is pasted in to "this program".
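As a rough illustration of the "version from git tags" use, here is a Python sketch of the general idea: run a program at build time and paste its output into generated source. It is not Crystal's or Onyx's macro mechanism. The helper names are invented for the example, and a tiny Python one-liner stands in for a real command such as `git describe --tags` so that the sketch runs anywhere.

```python
import subprocess
import sys


def run_at_build_time(argv):
    """Run an external program and return its stdout without the trailing newline."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def generate_version_source(argv):
    """Paste the program's output into a line of generated source code."""
    version = run_at_build_time(argv)
    return f'VERSION = "{version}"'


# Stand-in for a real version query; the printed tag is made up.
stand_in_command = [sys.executable, "-c", "print('v1.2.3')"]
generated = generate_version_source(stand_in_command)
```

Here `generated` holds the text that would be pasted into the program being compiled.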
Just keep asking for clarifications as long as I'm not clear enough, I'm not always that good at explaining things.
You could always go with LiveScript
Sadly, I don't like its limitations:
Now instead imagine you're generating two different if-clauses depending on a meta-level condition. Contrived pseudo-Onyx code
I see. You're saying that it's not possible to make the indent level correct for both the macro and the resulting code. And it can't be worked out mathematically because the nesting could be generated in a different place to the indented content - and, indeed, the required nesting could vary depending on previous code generation:
{% if something %}
  if a
    if b
      if c
{% else %}
  if d
{% end %}
        content -- cannot indent this correctly for both cases
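To make the mismatch concrete, a small Python sketch can count how many verbatim `if` heads are open when `content` is reached, once for each outcome of the meta-condition. The function name `open_blocks` and the hard-coded directive strings are assumptions of this toy, which understands only this one template shape.

```python
def open_blocks(template, something):
    """Count verbatim `if` heads left open after resolving the meta-conditional."""
    depth = 0
    emitting = True
    for line in template:
        text = line.strip()
        if text == "{% if something %}":
            emitting = something
        elif text == "{% else %}":
            emitting = not something
        elif text == "{% end %}":
            emitting = True
        elif emitting and text.startswith("if "):
            depth += 1
    return depth


template = [
    "{% if something %}",
    "if a",
    "if b",
    "if c",
    "{% else %}",
    "if d",
    "{% end %}",
    "content",
]
depth_when_true = open_blocks(template, True)
depth_when_false = open_blocks(template, False)
```

The two depths differ (three open blocks versus one), so no single indentation of `content` in the macro source can be right for both expansions.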
The important difference is in what compile phase the macro is expanded. Macros generally are expanded as "dumb text"
So the first type of macro is expanded at the start of compilation; and the second type at the end, essentially? So the first type can affect the raw text of the code; whilst the latter type can't, but does have more information to work with?
I think I follow you now. Carry on.
(quick OT on LS here:
no way of doing stuff like for k, v in object
Say what? for .. in - just use for .. of:
for k,v of object
for own k,v of object - only hasOwn... properties
no string interpolation
Say what? It's been there forever: console.log "Hey #{some-var} - I'm interpolatin!".
check out www.livescript.net to ctrl-f the facts
end of side-note.)
Good example with the multiple level-if's variability, highlights the problem even better!
Delimiter thoughts:
{% works for meta-code as a preliminary try out.
{{ ... }} is clear enough though. The throw-up example is one notch better for now ({= ... =}).
%x = 47. I'd prefer vars are always made fresh unless said to not be with a "global"-like prefix (since they're in the existing scope), something like $$existing-var = 47. However this might be impossible to solve reliably, considering the no-parens function-calls, so in the worst case the fresh-prefix must be kept.
Syntax suggestions highly welcome.
I don't know what node-pasting is, but {= ... =} looks fine to me. Looks a bit like the syntax for embedding Ruby in a HAML document.
I agree that variables should be fresh by default. It's just a question of how doable it is. But didn't you say a while back that the design of the language should not be dictated by the difficulty of implementation?
(Looking into LiveScript now. Thanks for the heads-up.)
Pasting is simply taking an argument to the macro and pasting it into the resulting code (though you can run meta code on the arg-node too). For instance, do-stuff <{= literal-list-arg.join(",") =}> would paste a literal-list arg as a tuple-literal (note: in this example with the newly proposed tuple syntax not yet decided on and implemented - which now is implemented, but will be ditched ;-) ).
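A minimal Python sketch of what pasting amounts to, assuming the `{= ... =}` delimiters from this thread. The `paste` function and its regular expression are invented for illustration; the toy supports only a bare argument name or `name.join("sep")`, not arbitrary meta-code.

```python
import re

# Matches {= name =} or {= name.join("sep") =}; names may contain dashes.
PASTE = re.compile(r'\{=\s*([\w-]+)(?:\.join\("([^"]*)"\))?\s*=\}')


def paste(template, args):
    """Replace each paste marker with the corresponding macro argument."""
    def render(match):
        name, separator = match.group(1), match.group(2)
        value = args[name]
        if separator is not None:
            return separator.join(value)  # the argument is a list of source fragments
        return str(value)
    return PASTE.sub(render, template)


tuple_call = paste(
    'do-stuff <{= literal-list-arg.join(",") =}>',
    {"literal-list-arg": ["1", "2", "3"]},
)
plain_call = paste("if generic-check {= x =}", {"x": '"hello"'})
```

In a real macro system the arguments would be AST nodes rendered back to source rather than plain strings.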
Another alternative, perhaps, is double back ticks, it clashes less with regular code:
-- - discarded idea -
-- do-stuff <``literal-list-arg.join(",")``>
meta-code blocks should then mirror and perhaps use:
-- - discarded idea -
-- `% if some-condition %`
-- if true
-- `% else %`
-- if false
-- `% end %`
-- do-stuff
-- end
(No shell-execution code begins with %)
But didn't you say a while back that the design of the language should not be dictated by the difficulty of implementation?
Indeed, but impossible is a different beast of difficulty, I don't fling that word around lightly ;-)
The backtick operator can be overridden - for example, to yield HTML instead of shell execution - so we can't rely on anything being invalid inside backticks.
Good catch, was a bit quick on that one.
I've spent the last week's Onyx dev time on getting the internals set up for macroing. And continue on with it.
Int in Onyx becomes StdInt, if scope is resolved to Program, but remains Int if defined in other scope (LibC.Int for instance). Functions/methods are currently babelfished "globally", since it is for de-facto names like "each" (translating to "each_with_index"; this is included in the trial because of the additional messiness of "each" being a std-method in itself, to push things to the limit), thereby they should be babelfished as "linguistic terms". If you define "each", it is likely to be a func behaving like stdlib-eaches, and if not, no problem, it works just as it should.
{% if... %}, {= ... =} and %fresh-var syntax to begin with, requiring explicit end-tokens, ignoring indent in macros (definitely seems to be the clearer choice in this context).
template variety of macro.
So, any bright ideas on macroing are of interest! Should something arise that needs a rethink before going through this monster completely.
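The "explicit end-tokens, indentation ignored" rule can be sketched in Python as a toy expander. Everything here is an assumption made for illustration: the `expand` function, the `flags` dictionary standing in for meta-level conditions, and the restriction to `if`/`else`/`end` directives and verbatim `if ... end` blocks. It is not the Onyx implementation.

```python
def expand(template, flags, step=3):
    """Expand a template whose verbatim blocks are closed by explicit `end` tokens.

    Leading whitespace in the template is ignored; the output indentation is
    derived purely from the `if ... end` structure of the emitted code.
    """
    output = []
    depth = 0
    active = [True]  # whether each enclosing meta-branch is being emitted
    for raw in template:
        text = raw.strip()
        if text.startswith("{%") and text.endswith("%}"):
            words = text[2:-2].split()
            if words[0] == "if":
                active.append(active[-1] and flags[words[1]])
            elif words[0] == "else":
                active[-1] = active[-2] and not active[-1]
            elif words[0] == "end":
                active.pop()
            continue
        if not active[-1]:
            continue
        if text == "end":
            depth -= 1
        output.append(" " * (step * depth) + text)
        if text.startswith("if "):
            depth += 1
    return "\n".join(output)


template = [
    "{% if is_string %}",
    "      if string-specific-check x",
    "{% else %}",
    "   if generic-check x",
    "{% end %}",
    "do-things-inside-the-if-body x",
    "end",
    "do-stuff-after-the-if-body",
]
string_version = expand(template, {"is_string": True})
generic_version = expand(template, {"is_string": False})
```

However the macro author indents the template, both expansions come out with the body nested one level under whichever `if` head was generated.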
Why does each translate to each_with_index???
onyx-extended variant where needed
???
each-with-index an unnecessary inconvenience (just add an arg and there you go!).
Some "pseudo-methods and -functions" are also in need of renaming imo. Consider typeof(x), x.class and x.is-a?(Type). typeof would be better off as decltype/typedecl or some similar wording, since it gives the declared type (even if inferred, which makes that wording a bit weird though), and class is not a concept in Onyx, currently, there's just "types". Crystal's "class" is just "reference type", and more importantly, it gives us the specific type a symbol currently holds, so would be better off like of-type, curr-type or somewhere along those lines. is-a? is misleading, since it matches super-types and mixed in traits too, it's currently called of? in Onyx. 1.of? AnyInt holds true for instance, as would 1.of? I32 | I64, or 1.of? Int which is the type the literal number is given by default (Int != Crystal-Int, the latter being called AnyInt in Onyx. Likewise Object is naturally simply called Any in Onyx.). Then again, I should have left this commentary out of this specific issue :-/
onyx-extended...
Crystal obviously doesn't know about Onyx specifics, so the Onyxisms not representable in Crystal must also be implemented on the "Crystal-side" so that those constructs can be expressed and parsed when expanded in macros written in Crystal. :-O
Simple example: p for v in list: say v. Extremely contrived, but: p is a macro deffed in crystal stdlib. for doesn't exist as a concept in Crystal. In this specific case it's easily solved since for is re-written to each-looping, and the rewritten nodes are handled fine by crystal, but it highlights the problem in a good way.
2+3: Agreed. Although if each_with_index is not defined in the object, a bare each call should presumably just call the underlying each.
of-type? sounds good in place of is-a?. of? not so much. Suggest it accepts multiple arguments:
of-type?(&obj)(...types) ->
   types.any? ~> obj.class == @1 || obj.class.ancestors.include? @1
say "".of-type? Int, String -- true
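The same multi-argument check can be written as a runnable Python sketch. The name `of_type` is just a transliteration of the suggestion above, and Python's method resolution order stands in for `obj.class.ancestors`.

```python
def of_type(obj, *types):
    """True if obj's type is, or inherits from, any of the given types."""
    ancestors = type(obj).__mro__  # the type itself followed by its supertypes
    return any(candidate in ancestors for candidate in types)


string_matches = of_type("", int, str)
int_is_not_string = of_type(1, str)
bool_matches_supertype = of_type(True, int)  # bool inherits from int
```

Like the proposal, this matches supertypes as well as the exact type, which is the behaviour that made the name is-a? feel misleading in the first place.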
I still don't understand why Crystal macros need to have access to Onyx features.
Since when is x for y in z: say x valid code? And what the hell is p? I've never heard of it. Besides, how can a macro be written that accepts for v in list: say v as argument!?
p is like say or puts pretty much, but a macro. You could use a plain func say for instance, to explain that syntax:
say for y in [1,2,3]: say y
You call say, with one argument, which is a for-expression. The for-expression is executed. It spits out 1, 2, 3 in order. Then it returns the iterated list. Which the first say spits out. Nothing strange.
Output:
1
2
3
[1, 2, 3]
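The evaluation order described above can be mimicked in Python, where `for` is a statement rather than an expression. The helper `for_expr` is a hypothetical stand-in for a for-expression that yields the list it iterated, and `say` records what it prints so the order can be checked.

```python
printed = []


def say(value):
    """Print a value and remember what was printed."""
    printed.append(str(value))
    print(value)


def for_expr(items, body):
    """Run body for each item, then evaluate to the iterated list."""
    for item in items:
        body(item)
    return items


# Equivalent of: say for y in [1,2,3]: say y
say(for_expr([1, 2, 3], say))
```

The inner calls run first while the argument is being evaluated, and the outer `say` then receives the list itself.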
my bad
twitch
Please refrain from using that expression. It's the stupidest and most grammatically-incorrect Americanism I've ever encountered, and is most irritating.
3: But I don't understand why the macro needs to understand Onyx:
values = for v in list: say v
p values -- macro `p` has no need of Onyx knowledge!
Haha, it's terse, I like it. I'll refrain from it to save you a fit then B-)
my-macro values is not the same as my-macro for.... The macro gets the parsed arguments' actual AST-nodes passed in, that's kind of the point of a macro. They are expanded as needed in the macro, forming source code from the source "parts" already in the macro, forming correct (hopefully, if the macro was written right...) code. A for-expression would be parsed to a For-node (before rewrites that is), which is non-existent in Crystal (except, ironically, in macro-meta-code, which is a MacroFor-node).
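The difference between passing a variable and passing a loop can be seen with Python's standard `ast` module, used here purely as an analogy: Python has no macros, and a list comprehension stands in for the Onyx for-expression. The helper name `argument_node` is made up.

```python
import ast


def argument_node(source):
    """Parse an expression and return its top-level AST node."""
    return ast.parse(source, mode="eval").body


name_node = argument_node("values")
loop_node = argument_node("[say(v) for v in items]")

name_kind = type(name_node).__name__
loop_kind = type(loop_node).__name__
```

A macro handed the first argument only ever sees a name node; handed the second, it sees a loop node, which it can paste back into code only if the host language can represent and print that node type.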
Hm. So what's the solution?
Simple, just a little tedious: make syntax (can be ugly as fuck, doesn't matter) that won't clash with current (and preferably future, for less refactoring) crystal syntax so that crystal can be made to support the Onyx-constructs. Since this code will never be seen (it's just used for internal rendering/parsing), they can be made very verbose. So, just a bit more of coding to do, hehe. I might take another break from the branch to have a look at implementing the anonymous types issue in a few days.
It's now implemented to the stage where macros can be coded in Onyx (and more powerfully than in Crystal already, since asymmetry in generated 'if's etc. in meta-conditional macro-bodies is possible, as shown in different examples above). It's an early push, and I'll probably find a lot of bugs tomorrow B-)
BTW, regarding the delimiters. They're "good enough", and better delimiters could instead be chosen in the Unicode-realm.
I'm considering introducing a literal, "non hygienic" form of macro into Onyx also.
It would substitute the placeholder everywhere at parse-time, and as such be pretty much as "search n replace" in a text-editor, just barely smarter (avoiding comments and strings at least).
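A "search n replace, just barely smarter" substitution might look like the following Python sketch. The function `literal_substitute` is hypothetical; it assumes double-quoted strings, `--` line comments, and that the placeholder is a plain identifier, and it treats dashes as part of identifiers so that a placeholder inside a longer dash-name is left alone.

```python
import re


def literal_substitute(source, placeholder, replacement):
    """Replace a placeholder everywhere except inside strings and comments."""
    token = re.compile(
        r'"(?:\\.|[^"\\])*"'   # a double-quoted string, kept as-is
        r"|--[^\n]*"           # a line comment, kept as-is
        r"|(?<![\w-])" + re.escape(placeholder) + r"(?![\w-])"
    )

    def swap(match):
        text = match.group(0)
        if text.startswith('"') or text.startswith("--"):
            return text
        return replacement

    return token.sub(swap, source)


source = 'x = FOO + 1 -- FOO stays here\nsay "FOO" + FOO-BAR'
substituted = literal_substitute(source, "FOO", "47")
```

Only the first `FOO` is replaced; the ones in the comment, in the string, and inside the longer identifier `FOO-BAR` are untouched.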
Will issue later on.
Basically a C++ #define? Are you certain that is wise? They're one of the Three Most Crap Things™ about C++.
(the others being header files and templates)
Haha. Yes, I agree, it is a bad idea. The idea is very loose so far, and iff needed, then there will be rules on hideously long names to avoid unnecessary mishaps. The only reason is for code-reuse in multiple init() -> and re-init() functions without causing "nilable" problems with ivar-type-inference.
Hopefully I figure out some cleaner way of avoiding above problems.
Bringing a shitty solution like this up publicly, can kick off the brain cells to come up with the right solution instead ;-)
Multiple init?
Say: "reset"-functions with common code that sets ivars to a known state, that may be used from multiple constructors and also for re-initing an instance when using instance-pooling.
I think you'll have to explain this a bit more. It's the first I'm hearing of any of this.
I've figured out what to do about it. The evil macros won't be necessary. Anyway, example of reason:
type Foo
   @a I32
   @b I32
   init() ->
      @a = 1
      reset
   reset() ->
      @b = 2
end
f = Foo()
The type inference for constructors is much dumber than the common inference - above won't compile, "@b can be nil". I'll hope for some helpful action from the crystal camp here, because it can be solved. I'll issue it there. Otherwise an alternative semantic phase will be added specifically for Onyx.
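One way to picture the difference is a toy model in Python. This is not how the Crystal compiler works internally; the `assigned_ivars` function and the tuple encoding of method bodies are invented for the example, and only straight-line bodies are modelled. A check that looks only at assignments written directly in the constructor misses `@b`, while one that follows calls on self finds it.

```python
def assigned_ivars(methods, entry, follow_calls):
    """Collect instance variables assigned from `entry`, optionally following calls."""
    visited = set()
    assigned = set()

    def visit(name):
        if name in visited:
            return  # guard against recursive methods
        visited.add(name)
        for kind, target in methods[name]:
            if kind == "assign":
                assigned.add(target)
            elif kind == "call" and follow_calls and target in methods:
                visit(target)

    visit(entry)
    return assigned


# The Foo type from the example above, as (kind, target) statements.
foo_methods = {
    "init": [("assign", "@a"), ("call", "reset")],
    "reset": [("assign", "@b")],
}
declared = {"@a", "@b"}

missing_shallow = declared - assigned_ivars(foo_methods, "init", follow_calls=False)
missing_deep = declared - assigned_ivars(foo_methods, "init", follow_calls=True)
```

The shallow check reports `@b` as possibly nil; the call-following check reports nothing missing, which is the behaviour the example needs.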
This is the only (?!) missing feature in Onyx atm, and the culprit is syntax (and a whoooole lot of coding).
Syntax
While indent-significant layout is ultimately apt for coding, it's not so for meta-coding-contexts. Meta-code causes indentation that has to be compensated for in verbatim-paste-code, and it quickly becomes unmanageable.
The only conclusion I can arrive at is that in template macros explicit end-tokens will be required, and indentation is not significant. It's not really a normal coding context.
Concept
Suggestions