j-woz / exm-issues

Automatically exported from code.google.com/p/exm-issues
0 stars 0 forks source link

Adding new loop construct to Swift #80

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Currently the loop constructs in Swift don't easily allow values to be passed 
from one iteration to another, except indirectly through arrays.  

What would be nice:
* A loop structure which is intuitive to people coming from imperative 
languages (e.g. a version of for, while)
* A loop structure that lets you pass 1,2,3,4, or more variables from the 
previous iteration to the next iteration

The most realistic implementation strategy is to translate the loop into a tail 
recursive function E.g.

(a, b) loop_body (x, y, z) {
  statement1 ..
  statement2 ...
  new_x = ...;
  new_y = ...;
  new_z = ...;
  if (condition) {
    a, b = loop_body(new_x, new_y, new_z);
  } else {
    a = ...;
    b = ...;
  }
}

I would say implementing a new loop construct is feasible if there is a way to 
translate it into a recursive function like the one above.

Original issue reported on code.google.com by tim.g.ar...@gmail.com on 8 Feb 2012 at 9:44

GoogleCodeExporter commented 9 years ago
My first idea is to have a c-style for loop (but with strong limitations about 
what is and isn't valid for e.g. the loop update)

int i;
for (i = 0; i < n; i ++) {
  trace(f(i));
}

We can rewrite the above in terms of write-one variables, provided that we 
create a new variable i for every iteration of the loop.

====>

for_loop_body(i, n) {
  integer_get val:i i
  integer_get val:n n
  if (val:i < val:n) {
    function_call f _t1 i
    trace _t1
    # below is loop update code
    plus new_i i
    for_loop_body(new_i, n);
  }
}

I can also see something like the below working, with prev and curr pointing to 
a different write-once variable every iteration.  

for (prev = initialCondition(); !hasConverged(curr); prev = curr ) {
  int curr = calcNext(prev);
}

Original comment by tim.g.ar...@gmail.com on 8 Feb 2012 at 10:04

GoogleCodeExporter commented 9 years ago
I've ended up implementing something very close to my previous suggestion: 
something that looks on the surface like a c-style for loop, but still has 
write-once variables

Original comment by tim.g.ar...@gmail.com on 5 Mar 2012 at 2:42

GoogleCodeExporter commented 9 years ago
Take a look at how it works, I'd be interested in suggestions for how to 
improve it.

One thing that I don't especially like now is how it forces you to put a lot of 
things at the top of the loop for more complex loops.  It is nice to have the 
c-style syntax for familiarity, but we could explore other options that would 
make it more readable and intuitive.  It might be nice to have a block at the 
bottom of the loop to express the variable updates that occur (given that 
logically they occur at the end of the loop body.  E.g.
for (i = 0; i < n) {

} [ i = i + 1 ]

Original comment by tim.g.ar...@gmail.com on 5 Mar 2012 at 3:43

GoogleCodeExporter commented 9 years ago
Will probably need this for SciColSim

Original comment by tim.g.ar...@gmail.com on 5 Mar 2012 at 3:48

GoogleCodeExporter commented 9 years ago
This looks great. The Tcl output is very good.  I would leave it as is for now.

For the future, my concern is that the i=i+1 expression is confusing
when our novice users will already be confused by write-once
semantics.

I suggest we consider a "natural" Swift loop and make other
"compatibility" forms derive from that:

int i = 0;
int N = 10;

loop {
  trace(f(i));
  continue is_even(i);
  break i >= 10;
  next i := i+1;
}

which is equivalent to the C:

for (int i=0, N=10; ! i >= 10; i++) {
  if (is_even(i))
    continue;
  trace(f(i));
}

I think this is compatible with the tail recursion pattern and would
be a good thing from which to derive the "compatibility" forms such as
while(){}, iterate i {}, etc.

Original comment by wozniak....@gmail.com on 5 Mar 2012 at 4:33

GoogleCodeExporter commented 9 years ago
Very exciting news.  I dont yet fully understand all the notations in the 
examples here, but I get the gist of it. (next i, integer_get - is the latter a 
turbine data access primitive?)

The idea of a canonical basic loop sounds good. is the "next" statement a 
general statement that can be inserted *anywhere* - ie in any {} block - which 
basically mutates a variable by creating a new instance of it and hiding all 
the old instances?

So in the loop above you could say next i = i + 1; next j = k * 20 + i;   Any 
restrictions on where "next" is placed? Does it create a new stack frame, or 
push a new instance of its target variable onto teh current stack frame?  Can 
you say:

   next i = i  + 1;
   next i = i * 20;
?

Original comment by mjwi...@gmail.com on 5 Mar 2012 at 6:31

GoogleCodeExporter commented 9 years ago
I want to add that while I think this is an excellent and long-needed addition 
to the Swift language (maybe should be in Swift 1.0 ???) - I also still believe 
that we would benefit from some kind of block (like {{  ... }} ) in which 
normal mutable variable semantics is possible.  Possible as leaf functions, but 
also as intermediate functions.

I know that Tim has voiced the view that this is bad, but I'd like to explore 
it further.  Maybe needs a separate issue thread here,

Original comment by mjwi...@gmail.com on 5 Mar 2012 at 6:35

GoogleCodeExporter commented 9 years ago
My concern is that it would basically be another programming language - even if 
the syntax is similar, the semantics of swift are really entirely dependent on 
everything being write-once and implicitly parallel.  Most of the work 
(semantic analysis, the front-end, optimizations, etc) on the compiler would 
have to be replicated for this serial language.  

The bigger problem though I think is the interoperation of the two languages: 
if we have the two nested within each other, then we probably have a quadratic 
number of interactions between language features, which we need to design, 
implement, test, and even then I feel like the way the two worlds interact is 
going to be counter-intuitive: we're really leaning on the write-once variables 
to give us referential transparency and to make the implicit parallelism work, 
so having mutable variables goes quite strongly against the grain.  I think it 
would be more productive to add features that interact nicely with existing 
language features.

I'm not sure if you have a use-case in mind where this would be really helpful: 
maybe I'd have a better idea of what you want to do if I saw a concrete example.

If we're concerned about performance, then I can definitely get better results 
faster by optimising existing Swift: it will be fairly straightforward in many 
cases to avoid using the rule engine for basic expression evaluation. If we're 
concerned about programmability, then I think its better to focus on making it 
easier to make parallel Swift more expressive.

Original comment by tim.g.ar...@gmail.com on 5 Mar 2012 at 8:44

GoogleCodeExporter commented 9 years ago
Re: Comment 6: Keyword 'next' would only be valid inside the loop{}.  The point 
of the keyword and := are to distinguish the "assign in next loop scope" from 
normal assign-once semantics.  'break' and 'continue' expressions would be 
evaluated before the loop body, 'next' assignments would be evaluated at the 
end of the loop body.  However, the ordering of these statements and other 
normal statements within the loop body would be irrelevant. 

Original comment by wozniak....@gmail.com on 5 Mar 2012 at 10:10

GoogleCodeExporter commented 9 years ago
I think I prefer having the expressions as part of the loop construct itself 
(ie. before or after the block) rather than as pseudo-statements somewhere in 
the body of the loop: I think it makes it much clearer that they're attached to 
the loop and logically part of the construct.  It also means a programmer knows 
where to look for if they want to see all of the loop conditions and loop 
variables.

It also makes things very non-orthogonal and probably makes things for 
confusing for programmers.  I can imagine a new programmer would see e.g. the 
next construct, and think that it looked pretty much like any other statement, 
so it seems like a logical move to write something like this:

loop {
  trace(f(i));
  continue is_even(i);
  break i >= 10;
  if (...) {
      next i := i+1;
  } else {
      next i := i+2;
  }
}

One other thing: I was originally going to support variables declared outside 
the loop as loop variables, e.g:
int i = 0;
for (; i < n; i = i + 1) {

}
// What should i be here?

But I'm coming to the conclusion that its a bad idea because the consequences 
get messy and counterintuitive for an imperative programmer: we're only 
changing i inside the loop body, not outside it, so the new values of i are 
only visible inside the loop body, contrary to the behaviour of the languages 
which this is trying to emulate.

Original comment by tim.g.ar...@gmail.com on 6 Mar 2012 at 2:41

GoogleCodeExporter commented 9 years ago
The proposed loop{} was just something to get the conversation started- I'm not 
insisting on it.  

However, I think using conditional next as you do looks good and is something 
we could conceivably support.  Of course, we could just reject that usage.  
Many keywords are only valid in certain contexts. 

You seem to be saying loop variables should be declared with the loop syntax.  
I guess that's why iterate{} does what it does.  We could do that as: 

loop (int i = 0, int j = 0) { ... } 

Only i, j would be allowed as targets of a next-assignment. 

Original comment by wozniak....@gmail.com on 6 Mar 2012 at 3:29

GoogleCodeExporter commented 9 years ago
After implementing the first version, I did come to the conclusion that a clear 
distinction between loop variables which are updated every iteration, and 
non-loop variables which are constant is a good idea.  Before I started doing 
it, I had the idea that we should support lots of additional options e.g. with 
the initial value of a variable coming from outside the loop.  But trying to 
support things like that seemed to result in lots of weird corner cases because 
the behavior of loop variables and normal variables is very different.  

It did occur to me that it probably would be a good thing if we had a way for 
the final value of a loop variable to be visible in the outer scope.  E.g. 
int i;
for (i = 0; i < 10; i = i + 1) {

}
trace("final value of i is: ", i);

This would be nice if we want e.g. to be able to get the final values of 
variables in a simulated annealing loop.

This wouldn't be too bad to implement in the compiler.  The below pattern 
though is problematic, even though it is equivalent to the above in e.g. c:
e.g. 
int i = 0;
for (; i < n; i = i + 1) {

}
trace(i); // should this be 0 or n? either alternative is surprising

Maybe instead of discussing in the abstract though we should come up with a set 
of use cases, actually write the loops for any proposed design and see which 
seems to be the best proposal?

Original comment by tim.g.ar...@gmail.com on 6 Mar 2012 at 6:21

GoogleCodeExporter commented 9 years ago
This has been working for a while now.

An example is:

int j;
for (int i = 0, j = 1; i < 10; i = i + 1, j = j * 2) {
  trace(j);
}
trace(j) // Prints 2^10

I think the semantics of the for loops are pretty much right as is, and this 
syntax 
is workable, so I'm going to call this fixed and any proposed changes to it can 
go in a new ticket. 

Some observations based on my experience working with the loop construct:
The syntax works great for simple loops, but gets confusing in two cases:
* If you're referencing variables defined in the loop body in the update rule 
(this is perfectly sensible semantically, but violates the invariant that is 
maintained elsewhere in the language, that variables should be defined before 
they are referenced.)  This tends to indicate that the variable updates should 
be at the bottom of the loop block.
* Since the only way to remap a variable for the next loop iteration is to have 
it as a loop variable, there are often many loop variables, and the current 
syntax doesn't work as well - it would be good if the updates could be 
specified more naturally as a multi-line block.

One other observation is that, while the loop construct is implemented as tail 
recursion, it is slightly less expressive, as we assume the tail call happens 
at the end of the loop body:
http://clojure.org/functional_programming#Functional%20Programming--Recursive%20
Loopingvariables

One practical result of this is that it is difficult to throttle loop 
iterations.

E.g. below we might want to wait for new_state to be closed before starting the 
next loop iteration (this actually occurred in SciColSim), but we can't 
directly express that without additional compiler support.
for(int i = 0, int state = ...; i < n; i = x, state = new_state) {
  do_something();
  new_state = ...;
}

With tail recursion (using a construct like Clojure's: 
http://clojure.org/functional_programming#Functional%20Programming--Recursive%20
Looping) this is natural:
tail_loop(int i = 0, int state = ...) {
  do_something();
  new_state = ...
  if(i + 1 < n) {
    wait(new_state) {
      recur(i + 1, new_state);
    }
  }
}
The tail recursive version also gives us a bit more control over which 
expressions are evaluated and when.

The tail recursion idea might not be quite right for Swift (or it might be!), 
but its worth evaluating how expressive any construct is relative to that.

Original comment by tim.g.ar...@gmail.com on 17 May 2012 at 2:59