can iterators be implemented as a library solution / syntax sugar over regular procs?

timotheecour commented 3 years ago

see https://github.com/nim-lang/Nim/pull/15655 for a simple macro that allows defining something that acts very similar to an iterator:

it supports break, continue, multiple iterator arguments, can yield a single or tuple of arguments

Not only that but it solves major long existing limitations with iterators, eg:

closure iterators can't be used in VM
closure iterators can't be used in js
(EDIT) closure iterators can't be recursive
(EDIT) defer/exceptions don't work in closure iterators: https://github.com/nim-lang/Nim/issues/15501 https://github.com/nim-lang/Nim/issues/11815

this is not a complete 1:1 replacement (see limitations below) but could become one. This could potentially be a basis for replacing iterators with regular procs + syntax sugar, which would simplify nim language by removing all custom compiler code handling iterators.

example 1

import std/iterates

# analog to:
when false:
  iterator myIter[T](s: seq[T]): auto =
    for ai in s: yield ai*10

proc myIter[T](s: seq[T], cont: proc(_: T)) =
  for ai in s: cont ai*10

# analog to:
when false:
  iterator myIter2(a, b: int): auto =
    for ai in a..b:
      yield (ai, $ai)

proc myIter2(a, b: int, cont: proc(a1: int, a2: string)) =
  for ai in a..b:
    cont ai, $ai

template main() =
  block:
    var ret: seq[int]
    # analog to:
    # for x in myIter(@[2,3]):
    #  ret.add x
    for x in iterate myIter(@[2,3]):
      ret.add x
    doAssert ret == @[20, 30]

  block:
    var ret: seq[(int, string)]
    for k,v in iterate myIter2(2,5):
      ret.add (k,v)
    doAssert ret == @[(2, "2"), (3, "3"), (4, "4"), (5, "5")]

static: main()
main()

(EDIT) example 2: recursion

see example showing recursion works in https://forum.nim-lang.org/t/7020#44149

note

the following maps {.closure.} iterators only, but with alias PR (https://github.com/nim-lang/Nim/pull/11992), we should also be able to map {.inline.} iterators just as well (since we'd be able to forward a template (instead of a proc) inside the iterate macro); I can provide more details if this is unclear
D ranges is still the superior abstraction, offering better composition and support for 0-cost lazy functional programming (unlike sequtils + other nimble packages which have fundamental limitations); note that the iterator syntax sugar would still be usable for simple ranges.

limitations

this macro is not a complete replacement of iterators but could be improved to narrow the gap: these are not yet supported: finished, next.
type inference could be improved so that the macro would be able to infer the iterator yield type, this is IMO possible if https://github.com/nim-lang/Nim/pull/11754 is merged; this should enable this to work: typeof(block: (for ai in iterate a: ai)) which currently doesn't work; and hence would enable toSeq
syntax sugar could be added to close the gap with iterators so that for ai in iterate a could be written as for ai in a; likewise, iterator definition could be mapped to a proc as defined above.

EDIT

I've now also implemented continue and break, see PR

nc-x commented 3 years ago

Even if this works, I would prefer the core language to be implemented in the compiler and not through macros (for better compile times and sane error messages), at least unless and until the macros are mapped to compiler plugins which should improve the performance of macros significantly (see v2 milestone),

Araq commented 3 years ago

D ranges is still the superior abstraction, offering better composition and support for 0-cost lazy functional programming

It's still worse as there is no ownership system so you're left with a tracing GC or more-expensive-than-you-think-it-is reference counted slice type which diverges quite a bit from the naive (ptr, len) pair implementation. Plus writing the code required to support these ranges is much harder to write than using a yield statement.

mratsim commented 3 years ago

I'd like a report on CPS, what was tried, what is missing so far so that we can evaluate this RFC with its most obvious alternative (besides what is implemented today).

disruptek commented 3 years ago

The initial CPS impl in master represents an untyped macro. The tests and demos work.

The typed version (I think it's in a typed branch), solves several performance and ergonomic warts but doesn't work yet due to apparent compiler bugs, at least one of which requires a breaking fix. I think it's in 1.4, but I'm honestly unsure.

What remains to be implemented (or demonstrated, at least):

thread migration using isolate semantics
syntax polish so you can continuation() or whatever
rvalue shim so you can let foo = continuation[] or whatever
rewrites to support/optimize control-flow

Maybe @zevv has some clearer ideas about what's missing; this is just the stuff off the top of my head.

I personally really want properly bulletproof concepts so I can play with some ideas for CSP; the abstraction over CPS where we'll implement go-channel/clojure-core.async-type machinery.

zevv commented 3 years ago

Right, @disruptek kind of tells it all. The Nim CPS project is basically an implementation of the mechanics described in the paper "Continuation-Passing C - Compiling threads to events through continuations". Linear Nim code is transformed into a number of separate procs that pass continuation objects among them. What used to live on the stack is now in the continuations, which allows non-linear code flow.

On top of Disruptek's CPS work I prototyped some basic constructs like:

iterators (https://github.com/disruptek/cps/blob/master/stash/iterator.nim)
async I/O (https://github.com/disruptek/cps/blob/master/stash/standalone_tcp_server.nim)
coroutines (lost the code)
setjmp/longjmp (https://github.com/disruptek/cps/blob/master/stash/goto.nim)
exceptions (https://github.com/disruptek/cps/blob/master/stash/trycatch.nim)

From a technical point of view, this all works, although the syntax is sometimes lacking or plain weird, still.

We are stuck for some time on motivation and a compiler bug, although I'm not even sure at this time what exactly the bug was (something sym vs ident, I believe). I think the goal was to prove this was all feasible and usable, but we already know that a definitive implementation should probably do the transformations in the compiler instead of using macros.

timotheecour commented 3 years ago

Can you share any relevant link? EDIT: links were now added above

zevv commented 3 years ago

TBH: I'm not sure if the above links actually compile with the current state of CPS - stuff was very much in flux, we had a few days of hacking frenzy but broke API's on the way a few times. But I'm sure the above snippets show the gist of what we were trying to do.

disruptek commented 3 years ago

There are two forms of bug that I consider blockers:

proc arguments should be symbols for typed macros (fixed in 1.4?)
ICE on substitution symbol for symbol or symbol for expression in macros

timotheecour commented 3 years ago

in https://github.com/disruptek/cps/blob/master/stash/iterator.nim instead of:

var a = counter(3, 7)

# Resume the iterator a bunch of times

var v = a.produce()

echo "produced ", a.produce()
echo "produced ", a.produce()

can you use ForLoopStmt as I did in https://github.com/nim-lang/Nim/pull/15655 to enable the more idiomatic for syntax for the common case, which would then allow writing:
```
for v in cps counter(3, 7):
echo "produced ", v.produce()
```
when user doesn't need more fine grained iteration
[EDIT] can that code snippet be modified to support working with yield types other than int (and without having to specify the type excplicitly)? see again https://github.com/nim-lang/Nim/pull/15655 for how i did that

zevv commented 3 years ago

Hmm, it's all some time ago, so I'd have to get back into all that to tell, but I think with the for loop macro that should work...

Araq commented 3 years ago

This is not an RFC, this is an interesting question.

timotheecour commented 3 years ago

It's not just an "interesting question", the suggestion is that maybe the language can be simplified by mapping (via syntax sugar enabled by the compiler) the existing iterators to a pure library implementation, also fixing https://github.com/nim-lang/Nim/pull/15818 (vm + js not supported) and support for recursion (https://forum.nim-lang.org/t/7020#44149) in the process.

Araq commented 3 years ago

Sure but an RFC is what you write after you have established that it's possible and a good idea. Then you write an RFC to convince others that we should change Nim.

nim-lang / RFCs