anachronauts / jeff65

a compiler targeting the Commodore 64 with gold-syntax
GNU General Public License v3.0
6 stars 0 forks source link

Specify enhanced control structures #27

Open jdpage opened 6 years ago

jdpage commented 6 years ago

Specifically, this provides the following:

The redo statement is a little weird, and is a combination of a Scheme feature and a Perl5 feature. In Perl5, the redo statement is the missing partner to break and continue (or next in Perl), in that it allows you to rerun the current loop iteration without re-evaluating the condition. It's equivalent to something like:

while (x < 42) {
myloop:
    /* ... */
    goto myloop;
    /* ... */
}

which for us would be

while x < 42 do: myloop
  /* ... */
  redo myloop
  /* ... */
end

On the Scheme side, you can label your let forms. A Scheme let is actually equivalent to creating and immediately calling an anonymous closure, and labelling a let just gives it a name. You can then express tail-recursive looping constructs by calling the given name, and it lets you do tail-recursion without having to provide an additional helper function with an accumulator.

Our redo would combine those features, with the caveat that it must be tail-recursive (since regular recursion on the C64 would be a disaster). It's actually somewhat cheaper than a function call. In addition, the compiler can use it internally to express loops; it's essentially something that we'd be implementing anyway, and it's so useful that we might as well expose it to the user.

I think I'd like, if possible, to avoid having to introduce a goto statement into Gold-syntax, and redo is a step towards providing a facility which provides a lot of its power in a structured, safe fashion.

I'm not sure about the do: foo ... end syntax for naming blocks, since : normally means that we're specifying a type. It's a kind of binding, so it feels like = should be involved, so here's my two alternate syntaxes for labelled do, let-do, and while-do respectively. Note that bare do is dropped in favour of an empty let binding in both:

let loop = do
    /* ... */
end

let x: u8 = 7, loop = do
    /* ... */
end

while x < 7, loop = do
    /* ... */
end
let(loop) do
    /* ... */
end

let(loop) x: u8 = 7 do
    /* ... */
end

while(loop) x < 7 do
    /* ... */
end
coveralls commented 6 years ago

Pull Request Test Coverage Report for Build 154


Totals Coverage Status
Change from base Build 151: 0.0%
Covered Lines: 465
Relevant Lines: 743

💛 - Coveralls
woodrowbarlow commented 6 years ago

i am definitely on board with bare do blocks, named loops, redo statements.


for named loops, it feels a little awkward to have the name on the right side of the "do"... once i see "do", i expect the next thing to be a statement. and i think it's reasonable to expect that. i'm also worried that our use of colons is losing any sense of consistency. lemme think on that, but i'm not sure if i can do better than the options you've already suggested.


for multiple-binding let statements, i just feel that the explanation you provided in the spec document is muddled. you've tried to define a basic let statement, a multiple-binding let statement, and a let statement bound to a loop's scope all at once and i think it wasn't clear.

let acc: u16, k: u16 = 1, n

i think this means "assign the value 1 to acc and the value n to k. beyond, like, two bindings in this syntax I find it gets very difficult to read. i find myself literally putting two fingers on the screen and matching up "okay, this value gets assigned to this identifier, this value gets assigned to this one...".

also, is this allowed?

let x: u8, y: u8 = 0

does it mean both values get assigned a value of zero? or that y gets a value of zero and x is left uninitialized?

can a multiple-binding let statement contain bindings of multiple different types?


as for this concept of let statements being bound to a scope in a new way, i'm confused. it seems to be only useful for the redo statement.

my hesitation is mostly this:

let mut x: u8 = 5,
let mut y: i16 = 0,
while x > 0 do
    /* whatever */
end

these bindings are both scoped within the loop, but only because of a single comma! they certainly don't look like they belong to the inner loop. if i'm reading this, i might put a print(y) statement after the end for debug.

this is not the whitespace style you used in your document, but i think it is the natural syntax (newlines after each comma) especially because it will be common to have maybe 4 or so variables bound in this way.


final summary

if i've correctly understood your motivation for multiple-binding let statements, scoping a let statement to a loop in a new way, and allowing expressions in a redo statement, it's basically to support this tail recursion example you've provided:

fun factorial(n: u16) -> u16
    let acc: u16, k: u16 = 1, n do
        if k > 1 then
            redo (acc * k, k - 1)
        else
            return acc
        end
    end
end

however, if i'm correctly understanding what these new features are meant to do, here is the same code written without multiple-binding let statements, let statements scoped to a loop in a new way, or expressions in a redo statement.

fun factorial(n: u16) -> u16
    let mut acc: u16 = 1
    let mut k: u16 = n
    do
        if k > 1 then
            acc *= k
            k -= 1
            redo
        else
            return acc
        end
    end
end

your example saved a few keystrokes, but isn't more optimized than mine and i argue that it's less readable than mine for three reasons:

jdpage commented 6 years ago

for named loops, it feels a little awkward to have the name on the right side of the "do" ... i'm also worried that our use of colons is losing any sense of consistency.

I agree with this 100% -- colons should only be used for notating types. Of the options I've given, I like Alternative 1 best, with name = do. I don't really want to overload parentheses more than they're overloaded. (In fact, I'm tempted to suggest that we use [ ] for both function calls and indexing, but that might be a little too radical.)

for multiple-binding let statements ... the explanation ... is muddled.

It is. Maybe I should define the combined syntax, then provide examples of each sub-part?

let acc: u16, k: u16 = 1, n

i think this means "assign the value 1 to acc and the value n to k. beyond, like, two bindings in this syntax I find it gets very difficult to read.

Yeah. I think my thought process was that it allows for nice stuff later like a, b = b, a to swap variables, and allowing multiple return from functions, but to be honest if that's important I'd prefer to just allow both 'C' multiple assignment (i.e. let acc: u16 = 1, k: u16 = n) and the proposed syntax. The 'C'-style multiple assignment is way easier to read.

also, is this allowed? let x: u8, y: u8 = 0

My inclination is the say "no", but I guess allowing uninitialized declarations that can't be read from until they've been assigned could be useful. C# allows it, but Rust does it better by making control structures into expressions.

Maybe "no" for now, then relax it later if it seems useful?

can a multiple-binding let statement contain bindings of multiple different types?

Yes!

as for this concept of let statements being bound to a scope in a new way, i'm confused. it seems to be only useful for the redo statement.

That's true. If all you wanted to do was restrict the scope of some variables, you could just throw them in a bare do block.

my hesitation is mostly this:

let mut x: u8 = 5,
let mut y: i16 = 0,
while x > 0 do
    /* whatever */
end

these bindings are both scoped within the loop, but only because of a single comma!

More proof that my description is muddy, since the above code snippet is a syntax error the way I intended the spec to be read. It should be this instead (if we're allowing C-style multiple assignment as discussed above):

let mut x: u8 = 5, mut y: i16 = 0 do
    while x > 0 do
        /* whatever */
    end
end

If we're not allowing C-style multiple assignment (i.e. PR-as-reviewed), it would be one of the following:

let mut x: u8 = 5 do
    let mut y: i16 = 0 do
        while x > 0 do
            /* whatever */
        end
    end
end
let mut x: u8, mut y: i16 = 5, 0 do
    while x > 0 do
        /* whatever */
    end
end

if i've correctly understood your motivation for [above features], it's basically to support [tail recursion] ... here is the same code written without [above features] ... your example saved a few keystrokes, but isn't more optimized than mine and i argue that it's less readable than mine ...

My motivation is kind of mixed. On the one hand, the features given are really useful as internal compiler constructs anyway, and while one could do without them, they are handy if you want to implement a naturally-tail-recursive algorithm tail (which, admittedly, factorial is not). When it comes down to it, anything that you can express tail-recursively can be expressed iteratively as well -- this is the entire point of tail recursion.

I think that support for recursion is important. That's a hard sell on the C64, but I like to think that this captures a lot of the parts that one actually wants. If you have an algorithm that's naturally tail-recursive, you can express it. If you don't, then you can express it either tail recursively or iteratively.

the redo requires matching expression order with the identifiers and they aren't even on the same line

This criticism applies equally to function calls. At least the redo is probably on the same screen as the let.

jdpage commented 6 years ago

As a side note, if you REALLY wanted to implement factorial without recursion, a better way would be:

fun factorial(n: u16) -> u16
  let mut acc: u16 = 1
  for k: u16 in 2 to n + 1 do
    acc *= k
  end
  return acc
endfun
jdpage commented 6 years ago

Yet another alternative syntax for named loops, riffing on the idea that redo is similar to a tail-recursive function call.

let-bindings and for-loops fit really nicely into this syntax since they bind variables. while-loops are incredibly awkward.

/* Named let-binding with variables (name is 'foo') */
let foo(x: u8 = 7, y: i16 = 300) do
  redo foo
  redo foo(8, 9)
  break foo
  continue foo
end

/* Named do block */
let foo() do
  /* ... */
end

/* Named for loop */
for foo(x: u8) in 0 to 7 do
  /* ... */
end

/* Named while loop #1 (ugly) */
while foo() x < 7 do
  /* ... */
end

/* Named while loop #2 
 * (ugly and name position is inconsistent with above) */
while x < 7 foo() do
  /* ... */
end

/* Named while loop #3 (inconsistent syntax) */
while x < 7 let foo() do
  /* ... */
end

/* Named while loop #4 (weird but ok maybe?) */
let foo() while x < 7 do
  /* ... */
end

/* Named while loop #5 (impossible to parse context-free,
 * since you have to know whether 'foo' is bound) */
while foo(x < 7) do
  /* ... */
end
woodrowbarlow commented 6 years ago

what about this?

kinda similar to function declarations.

[scope <name>[(<binding>[, <binding>])]] [for|while <condition>] do [...] end

examples:

fun factorial(n: u16) -> u16
    scope inner(acc: u16 = 1, k: u16 = n) do
        if k > 1 then
            redo inner(acc * k, k - 1)
        else
            return acc
        end
    end
end

scope myloop(x: u8 = 15, y:i16 = -2) while x > 5 do
    if (x % 2 == 0) then
        redo myloop
    elseif (x % 5 == 0) then
        redo myloop(x, y*2)
    end
    x -= 1
end

scope iterator for x: u8 in 0 to 10 do
    /* ... */
end