littledan / iterator-generator-return

Confusion on forced check for generators #2

Open bmeck opened 8 years ago

bmeck commented 8 years ago

Most generator implementations keep a stack pointer separate from the instruction pointer. When you resume a generator, it jumps to one of two instruction pointers depending on the CompletionRecord type (normal or throw). The same can be implemented for .return, since the address is known and usually does not change (unless you have nested finally blocks, in which case the move has to be tracked just as it is for normal completion). Can you go into more detail on the implementation requirements for the forced check?

Since RET instructions pop the new instruction/stack pointer off the stack, what is keeping the VM from doing that instead of a branch?

littledan commented 8 years ago

Oh, I see. This optimization sounds possible to implement to me. Still, it adds a lot of complexity. I don't think a separate IP is needed for throw, because VMs already have to assume that anything can throw. You wouldn't want to write the return IP into the stack, as that would probably have as much overhead as doing the check, so you'd want to maintain an out-of-band table mapping normal/throw IPs to return IPs.
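
One way to picture the out-of-band table idea is a hand-written state machine. The sketch below is a toy model in JavaScript, not how any particular VM implements generators; the state constants, `returnHandlers`, and `makeGen` are made up for illustration. States stand in for instruction pointers, and the table is consulted only when `.return()` is called, so the normal resume path contains no completion-type check.

```javascript
// Toy model of: function* g() { try { yield 1; yield 2; } finally { cleanup(); } }
// The numeric states are hypothetical stand-ins for instruction pointers.
const START = 0, AFTER_YIELD_1 = 1, AFTER_YIELD_2 = 2, DONE = 3;

// Out-of-band table: suspension point -> handler to run on a forced return.
// START and DONE have no entry because no try block is active there.
const returnHandlers = new Map([
  [AFTER_YIELD_1, 'finally'],
  [AFTER_YIELD_2, 'finally'],
]);

function makeGen(log) {
  let state = START;
  const runFinally = () => log.push('cleanup');
  return {
    // Normal resume: dispatches on the state only, never on a completion type.
    next() {
      switch (state) {
        case START:
          state = AFTER_YIELD_1;
          return { value: 1, done: false };
        case AFTER_YIELD_1:
          state = AFTER_YIELD_2;
          return { value: 2, done: false };
        case AFTER_YIELD_2:
          state = DONE;
          runFinally(); // leaving the try block normally
          return { value: undefined, done: true };
        default:
          return { value: undefined, done: true };
      }
    },
    // Forced return: one table lookup picks the "return IP" for this state.
    return(value) {
      if (returnHandlers.get(state) === 'finally') runFinally();
      state = DONE;
      return { value, done: true };
    },
  };
}
```

In this model a `.return()` before the first `next()` or after completion skips the cleanup, which matches how native generators behave when no try block is active.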

Could you tell me more about the use cases you have in mind for this resource management--do they yield frequently enough that the check would be an efficiency issue? The lack of use cases is the biggest reason for me to argue for removing the feature.

bmeck commented 8 years ago

The return IP is always on the stack; it doesn't need to be added. As for throwing, it can be implemented in multiple ways, so it may or may not require a check when a CompletionRecord is returned. Having two records lets you jump out of multiple functions with one JMP at the cost of extra memory (there are multiple ways to implement this, so the exact amount varies).

Unlike the target of a throw, the IP that RET returns to must be on the stack (unless you do full inlining). I think it might be a little complex, but definitely not a lot.

I am giving a talk at Node.js Interactive in December, but the premise of using generators is pretty simple:

// The generator needs a reference to itself to call .return() (a quirk),
// so the abort callback closes over `job`.
let job = taskUsingFooDB(args, _ => job.return());
runner(job);
function* taskUsingFooDB(args, abort) {
  try {
    // Acquire tmp files, resource locks, resources not managed by JS.
    // If one of them becomes invalid (not an error), call abort().
    getLock('foo.db').onInvalid(_ => abort());
    return yield* sideEffects();
  }
  finally {
    rollbackSideEffects();
    releaseResources();
  }
}

This gives us some interesting effects. If we finish sideEffects, later calls to abort have no effect. If we abort before finishing, we can roll back side effects (where possible) and then release any resources still being held.
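
A minimal runnable sketch of those two effects. The `log` array, the two-step `sideEffects`, and the manual stepping with `next()` are hypothetical stand-ins for the real task and runner; only the generator semantics are the point.

```javascript
// Hypothetical stand-ins: `log` records what ran, `task` mimics taskUsingFooDB.
const log = [];

function* sideEffects() {
  log.push('step 1');
  yield;               // suspension point where an abort can arrive
  log.push('step 2');
  return 'done';
}

function* task() {
  try {
    return yield* sideEffects();
  } finally {
    // Runs on normal completion, on throw, and on .return().
    log.push('cleanup');
  }
}

// Case 1: abort while the task is suspended.
const aborted = task();
aborted.next();                        // runs 'step 1', suspends at the yield
const r1 = aborted.return('aborted');  // forces the finally block to run

// Case 2: abort after the task has already finished.
const finished = task();
finished.next();                       // 'step 1'
const r2 = finished.next();            // 'step 2', then 'cleanup'
const r3 = finished.return('late');    // already completed: finally does not run again

console.log(log, r1, r2, r3);
```

The early `.return()` reaches the finally block through `yield*`, while the late one only echoes its argument back as `{ value: 'late', done: true }`.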

Notes:

  1. Some quirks exist regarding yielding during finally, but the same quirks exist with throw during finally.
  2. You must call .return manually if you wish to abort, but you need a way to know when it has finished. This gets confusing, since you don't have access to a return value and cannot differentiate normal completion from .return, but it is possible if you add a variable to hold the return value.
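
Both notes can be seen in a small sketch. The `outcome` object and its field names are hypothetical; it is just one way to "add a variable" so the caller can tell an abort from a normal finish, and the `yield` inside finally is there only to show the quirk from note 1.

```javascript
// Hypothetical job: `outcome` is a caller-owned record of how the job ended.
function* job(outcome) {
  try {
    yield 'working';
    outcome.finishedNormally = true; // reached only if the body runs to the end
    outcome.value = 42;
    return outcome.value;
  } finally {
    yield 'cleaning up';             // note 1: a yield in finally suspends again
    outcome.cleanedUp = true;
  }
}

const outcome = { finishedNormally: false, cleanedUp: false, value: undefined };
const gen = job(outcome);

gen.next();                         // suspends at 'working'
const first = gen.return('abort');  // finally starts and yields: done is still false
const second = gen.next();          // finally finishes, then the forced return completes

console.log(first, second, outcome);
```

Because of the yield in finally, the caller has to keep driving the generator until `done` is true before assuming cleanup has finished; `outcome.finishedNormally` stays `false`, which is what marks this run as an abort.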

This lets me keep my resource locking away from my logic, which is very useful to me. I do not think browsers use much locking right now; however, https://github.com/lars-t-hansen/ecmascript_sharedmem is coming...

bmeck commented 8 years ago

Also, my usage is heavily biased towards Node.js, since that is where resource locking is more important.