fantasyland / fantasy-land

Specification for interoperability of common algebraic structures in JavaScript
MIT License
10.09k stars 374 forks source link

all lazy structures are not stack safe #212

Closed safareli closed 6 years ago

safareli commented 7 years ago

All lazy statures (IO, Future, State) have stack issue. general way of implementing operations on such structures is:

function Structure(f) {
  if (!(this instanceof Structure)) {
    return new Structure(f);
  }
  this.run = f
}

Structure.prototype.method = function (a){
  return new Structure(function(){
    // here we call `run` from a or something like that 
  })
}

All of lazy structures need extra stack frame on each map/ap/chain/..., as they are creating new functions which have use old computation or passed in function, requiring larger stack size, leading to stack overflow (when you run/execute/lower them)

  1. apply curried function f which is taking 10000 arguments:
...(T.of(9999).ap(T.of(10000).ap(T.of(f)))
  1. map on some structure 10000 times
T.of(1).map(id).map(id).map(id).map(id).map(id).....

btw there is no such issue with native Promise:

Array(40000)
  .fill(idx => a=>a + idx)
  .reduce(
    (p,f,idx) => p.then(f(idx)), 
    new Promise((res,rej) => { setTimeout(res, 1000, 1)})
  ).then(console.log.bind(console))// 799980001

General way to fix this would be rewrite such structures so that they builds up computation as data on map/ap/chain/...(like Free monad/applicative...) and executing run interpret it in a stack safe way. For example this way we can make function composition safe. Also related paper "Stack safety for free" by @paf31.

what others think on this issue?

SimonRichardson commented 7 years ago

This is why tail call optimisations are important, but last time I checked they had dropped it from the latest proposal.

safareli commented 7 years ago

"effected" libs

safareli commented 7 years ago

@SimonRichardson TCO is enabled in safari10 by default and in node 6+ using harmony flag.

for example this snippet in TCO enabled environments runs without any issues:

function countTo(n, acc) {
  'use strict';
  if(n === 0) {
    return acc;
  }
  return countTo(n - 1, acc + n);
}
countTo(10000, 0) // 50005000

// but this fails 
Array(100000).fill().map((_, idx) => x => x + idx).reduce((f, g) => x => f(g(x)))(1) // stack 💥

So how could TCO help in making for example IO stack safe?

function IO(f) {
  if (!(this instanceof IO)) {
    return new IO(f);
  }
  this.run = f
}

IO.of = (x) => IO(() => x)
IO.prototype.map = function(f) {
  return IO(() => f(this.run()));
};

Array(40000).fill(idx => a=>a + idx).reduce((io,f) => io.map(f), IO.of(1))// stack 💥
SimonRichardson commented 7 years ago

So how could TCO help in making for example IO stack safe?

That's the right question :thinking:

SimonRichardson commented 7 years ago

Just off the top of my head; if you went with something like freaky then you could make a trampolined interpreter. @DrBoolean might have some thoughts on this. The issue is knowing when a good time to recurse and therefore to avoid the stack :bomb:

DrBoolean commented 7 years ago

I'm currently studying recursion schemes with TCO and generators. I'll report back... FWIW @safareli's knowledge of stack safety blows mine away :)

SimonRichardson commented 7 years ago

Also you should check out what scalaz/cats do, they've have the same issues at some point. I remember @puffnfresh doing a commit on a new freeT and something was mentioned then. It was some time ago, but that's what I recollect.

safareli commented 7 years ago

As I understand in cats:

data State a = StateT Trampoline a
data Trampoline a =  Free Function0 a
data Function0 a = () => a

So Free monads could help here, but in case of applicative structures we can't do much, for now (would investigate this as well)

robotlolita commented 7 years ago

@SimonRichardson PTC is still in the spec, and VM writers are supposed to implement it. There's a proposal to require a token for calls that should be a tail call, but that hasn't left draft stage yet, and as I understand part of TC39 objects to that (in particular because you don't really have PTC at that point) — either way, we'd get some sort of tail calls though.

Tail calls solve the same problem trampolining does, they're just cheaper (you can translate them to a simple jump with the appropriate registers / argument stack updated), you just need to write your code such that all relevant calls would be in tail position (not the case with the Io example above — f(this.run()) doesn't have this.run() in tail position —, but Data.Task's current implementation pretty much translates things into CPS, as does fantasy-promises in this organisation, since you need to for asynchronous code anyway).

So:

var Io = (fork) => ({
  run: fork,
  of: (value) => Io(f => f(value)),
  chain: (f) => Io(g => fork(x => f(x).run(g))),
  map: (f) => Io(g => fork(x => g(f(x))))
});

// This:
io(f =>  f(1)).map(x => x + 1).map(x => x * 2).run(x => x);

// Translates into:
(h =>
  (g =>
    (f => f(1))
      (x => g(x + 1)) // first map
  )(y => h(y * 2)) // second map
)(x => x)

Because all of these calls are in tail position, the VM doesn't need to return a value to once it finishes evaluating the function, so it can translate it to something like:

JUMP a

a: -- f(1)
PUSH 1
JUMP b

b: -- g(x + 1)
PUSH 1
ADD
JUMP c

c: -- h(y * 2)
PUSH 2
MULTIPLY
JUMP d

d:  -- x
RET

Doing CPS translations manually to verify these is quite a pain, but you really only need to ensure that your call happens in tail position. If it does, the VM doesn't need to translate that into a CALL instruction, it can just use a JUMP, and then you don't have a stack problem.

The only problem with this so far is that not all VMs have implemented PTC :(

SimonRichardson commented 7 years ago

So it sounds either you live with the pain until PTC lands, or you implement what you want with a new structure (could be free or other ones).

robotlolita commented 7 years ago

(As a side-note: stack is only a problem here when you have a large chain of synchronous maps/chains. sequence(Task, Array(1000).build(Task.of))) is a problem, but sequence(Task, listOfAsyncTasks) is not. All asynchronous code gets added to the scheduler to run on a clean stack)

safareli commented 7 years ago

Here what I have done is made Func (function wrapper) chainRec and used it in Free to define Trampoline:

// data Func i o = Func (i -> o)
const Func = (run) => {
  run,
  chain(f) { return Func((a) => f(this.run(a)).run(a)) },
  map(f) { return this.chain((a) => Func.of(f(a))) },
  ap(a) { return this.chain((f) => a.map(f)) },
}

Func.of = (x) => Func(() => x)
Func.chainRec = (f, i) => Func((a) => {
  const chainRecNext = (value) => ({ isNext: true, value })
  const chainRecDone = (value) => ({ isNext: false, value })
  let step = chainRecNext(i)
  while (step.isNext) {
    step = f(chainRecNext, chainRecDone, step.value).run(a)
  }
  return step.value
})

// stack safe
Func.chainRec((next, done, v) => 
  v == 30000 ? Func.of(v).map(done) : Func.of(v+1).map(next),
  1
).run() // 3000

// Trampoline a = Free (() ->) a
Trampoline = {
  // as Free and Func are chainRec, this will work without stack issues
  run: (t) => t.foldFree(Func, Func).run(),
  // :: a ->  Trampoline a
  done: a => Free.of(a),
  // :: (() -> a) ->  Trampoline a
  delay: f => Free.liftF(f),
}

function loop(n) {
  function inner(i) {
    return i == n ?
      Trampoline.done(n) :
      // instead of suspend we can just `chain`
      Trampoline.delay(() => i + 1).chain(inner)
   }
   return Trampoline.run(inner(0));
}

console.log(loop(100000)); // 100000 🎉

I would investigate State = StateT Trampoline part.

I think @jdegoes can also give some suggestions on the topic.

jdegoes commented 7 years ago

StateT Trampoline is not stack-safe, though Trampoline (StateT f) is. Also you can modify StateT to be stack safe if you want, it becomes f (s -> f (s, a)).

safareli commented 7 years ago

Recently I have Implemented FreeApplicative (Par in https://github.com/safareli/free/pull/31) which has O(1) complexity on map/ap and O(n) on fold. But the most important part is that fold is stacksafe, which was not possible with other optimised implementations out there, as they use functions to build up a structure which requires extra stack on fold, leading to stack overflow. here the Free Applicative structure itself is not using any function, and the fold is written so that function's stack is not used. I think same technique could be applied to other "lazyish" structures to fix stack issue without need for TCO support in environment.

Avaq commented 6 years ago

It may be worth noting that Fluture hasn't been affected since its 6.x release. Since that means that providing a stack safe implementation can be up to the library author, I suppose this issue could be closed.

safareli commented 6 years ago

Yes we can close it, also recently implemented stack safe IO structure for purescript which could be easily converted to JS if someone wants to.