Functions & sloppy "inlining"

Functions are one of the very few cases - if not the only one - where Arturo has very tight scoping.

In a few words: any value altered from within a function will not persist after the function returns. Nor will any new symbol that is defined within it.

a: 1
f: function [x][
    a: 2
    b: 3
]
f "something"
; `a` is still 1
; `b` doesn't exist here

That obviously comes at a price: every time we create a "scope" (at least in the horrible(?) way I've done it so far lol), this leads to far worse performance.

So... for performance-critical code (e.g. functions that are to be called thousands or millions of times, recursive functions, etc), it would be great if we could get rid of this limitation -- if possible.

And that's how .inline was born. Although I haven't really used it that much, it's an option we may add to a function so that it doesn't have a scope. One obvious reason to use it - and I mean: use it explicitly - would be the case of our previous option .exportable. That is: export all symbols to the outer scope. Why? Because... we may need it.

But that aside, there is another, more obscure side to this same feature: to boost performance.

And you would be thinking that unless you explicitly declare a function like this... e.g. function [x].inline [...], you should be sure that it's a fully scoped function, as usual. Right? Well,... wrong. ⚠️

The truth is the VM tries to be a bit smarter (than it should - most likely) trying to figure out which functions could be considered implicitly .inline. In a few words: is there any way to figure out if a function doesn't need to have a scope created, just "by looking at it"?

My personal bet was :labels. That is: we're looking into a function's body (recursively, if there are subblocks) and once we spot even one label, then the function is marked as non-inlineable (that is: go and create a scope as usual). And the weird logic is that if there are no labels, there should be no new symbols defined in there and thus... why create a scope?
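The label check described above can be sketched roughly as follows. This is a toy model in Python, not the actual VM code: the `Label` class, the nested-list representation of a block and the names `has_label` and `is_implicitly_inline` are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """Hypothetical stand-in for a label token such as `a:`."""
    name: str


def has_label(block):
    """Return True if the block, or any nested sub-block, contains a Label."""
    for node in block:
        if isinstance(node, Label):
            return True
        if isinstance(node, list) and has_label(node):
            return True
    return False


def is_implicitly_inline(body):
    """No label anywhere in the body means no scope gets created."""
    return not has_label(body)


# body of: function [x][ print "hello" ]
print_body = ["print", "hello"]
# body of: function [x][ a: 2  b: 3 ]
label_body = [Label("a"), 2, Label("b"), 3]
# a label hidden inside a sub-block still counts
nested_body = ["if", "cond", [Label("a"), 2]]
# `let 'a 2` is a plain function call, not a label, so it slips through
let_body = ["let", "'a", 2]

print(is_implicitly_inline(print_body))   # True
print(is_implicitly_inline(label_body))   # False
print(is_implicitly_inline(nested_body))  # False
print(is_implicitly_inline(let_body))     # True
```

The last case is the interesting one: a check that only looks for labels cannot see that a call to `let` also defines a symbol.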


f: function [x][
   print "hello"
]

f "something"
; here you may think that `f` has its own scope
; but in fact it doesn't!
; now: does it matter in that case? 
; absolutely not 

; and no, `x` doesn't go into the equation;
; parameters are always scoped ;-)

But the time has come where the "plan" backfired. What if:

we define a symbol within the body of the function with let?
we use our new export stdlib function?

Both functions define/change symbols in the function scope... and - if there was no label in the body and the VM automatically ended up considering the function "scopeless" - all symbols that were defined in there will simply leak.

a: 1
f: function [x][
    let 'a 2
    let 'b 3
]
f "something"
; `a` is now 2!
; and `b` is... set to 3
; (everything leaking!)
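To make the difference concrete, here is a minimal Python sketch of a call that runs with and without a scope. It assumes, purely for illustration, that a "scope" is a throwaway copy of the symbol table; the real VM may do this quite differently. `run_function` and the `(name, value)` encoding of `let 'name value` are hypothetical.

```python
def run_function(body, symbols, scoped):
    """Run `body` against `symbols`; changes survive only when not scoped."""
    # scoped: work on a copy that is dropped when the call ends
    # scopeless: write straight into the caller's symbol table
    env = dict(symbols) if scoped else symbols
    for name, value in body:  # each pair models `let 'name value`
        env[name] = value


# models: let 'a 2  let 'b 3
body = [("a", 2), ("b", 3)]

outer_scoped = {"a": 1}
run_function(body, outer_scoped, scoped=True)

outer_leaky = {"a": 1}
run_function(body, outer_leaky, scoped=False)

print(outer_scoped)  # {'a': 1}
print(outer_leaky)   # {'a': 2, 'b': 3}
```

In this model the copy is also where the cost comes from: every scoped call pays for duplicating the table, which is the kind of overhead the scopeless shortcut tries to avoid.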

[!TIP] An "obvious" hack here is to forcefully add just one label (e.g. zxczczx: ø) somewhere in the body of a function, to make sure that it won't be scopeless. But obviously, this is ridiculous. I'm just mentioning it here as what it is: a hack.

So... how do we really deal with this?

The obvious way would be to make scoping fast enough so that we don't have to bother whether a function has a scope or not.
The other way would be to add another explicit option to function (pretty much like .inline) which could force a scope, no matter what (e.g. scoped) -- although it's still hackish. On the other hand, the problem only shows up when somebody uses things like let or export, which is already a different level of use (not the average person): I guess we are talking about more advanced usage that could imply some knowledge of such corner cases.
Another approach (related, or not-so-related): I've been thinking of introducing a new stdlib function that converts a block to "scoped" (e.g. simply do scope [ ... ]) and that would add a tiny switch to a block - and then do, for example, would treat it properly by creating a scope and destroying it after its execution.
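The "tiny switch on a block" idea could look something like the sketch below. Again, this is a toy Python model under stated assumptions: `Block`, `scope` and `do` here are hypothetical stand-ins, and a scope is modelled as a copy of the symbol table rather than whatever the VM would really do.

```python
from dataclasses import dataclass


@dataclass
class Block:
    steps: list           # (name, value) pairs, each modelling `let 'name value`
    scoped: bool = False  # the "tiny switch" on the block


def scope(block):
    """Hypothetical `scope`: same steps, but flagged as needing a scope."""
    return Block(block.steps, scoped=True)


def do(block, symbols):
    """Toy `do`: create a throwaway scope only when the block asks for one."""
    env = dict(symbols) if block.scoped else symbols
    for name, value in block.steps:
        env[name] = value


symbols = {"a": 1}
plain = Block([("a", 2), ("b", 3)])

do(scope(plain), symbols)    # scoped: nothing escapes
after_scoped = dict(symbols)

do(plain, symbols)           # unscoped: writes land in `symbols`
after_plain = dict(symbols)

print(after_scoped)  # {'a': 1}
print(after_plain)   # {'a': 2, 'b': 3}
```

The appeal of this shape is that the decision is explicit and travels with the block, so nothing has to be guessed from the block's contents.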

Not an easy issue to tackle. But I'm still mentioning it... a) because otherwise I would totally forget about it 😛 , b) so that everyone has a clear idea of what is going on, c) to be able to link to all this in case something like this comes up and not re-explain the whole thing all over again (which I will have forgotten myself by then, anyway... lol).

Needless to say: any ideas or brainstorming in that aspect are more than welcome! 🚀
