folktale / data.task

Migrating to https://github.com/origamitower/folktale
MIT License

Any example usage of `cleanup`? #35

Closed safareli closed 8 years ago

safareli commented 8 years ago

It would be nice to provide an example of using cleanup. I can't understand why we need it.

rjmk commented 8 years ago

cleanup was introduced here, and I think its use is to cancel async actions

safareli commented 8 years ago

Thanks for providing that commit, but it is kind of too abstract. It would be nice to have a real example of its usage.

rjmk commented 8 years ago

@safareli Is something like this useful?

var Task = require('data.task')
var timeout

var t = new Task(computation, cleanup)

function computation (rej, res) {
  timeout = setTimeout(function() { res('never') }, 1000) 
}

function cleanup () {
  clearTimeout(timeout)
}

var log = console.log.bind(console)

t.fork(log, log)
t.cleanup()

// still works with maps
var w = t.map(a => "hi " + a)

setTimeout(() => { w.fork(log, log); w.cleanup() }, 1100)

safareli commented 8 years ago

In concat and ap, cleanup receives a value as its argument, so it seems we could use it like this:

var Task = require('data.task')

var t = new Task(function computation (rej, res) {
  return setTimeout(function() { res('never') }, 1000) 
}, function cleanup (timeoutId) {
  console.log('cleanup', timeoutId)
  clearTimeout(timeoutId)
})

var log = console.log.bind(console)

t.cleanup(t.fork(log, log))

// still works with maps
var w = t.map(a => "hi " + a)

setTimeout(() => { w.cleanup(w.fork(log, log)) }, 1100)

I'll try to add a test for that so that others can see it in action.

rjmk commented 8 years ago

Ah, very nice! I would suggest saving the result of t.fork to a variable and then calling t.cleanup on it -- it's just more suggestive of how it might actually be used
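
A minimal sketch of that suggestion, for illustration only. The small Task function here is an assumed stand-in that mirrors the `new Task(computation, cleanup)` shape used in this thread, so the snippet runs without the library; with data.task installed you would use `require('data.task')` instead. The `cleaned` array is a hypothetical addition that just records what was released.

```javascript
// Assumed stand-in for data.task's constructor: fork is the computation itself,
// and cleanup defaults to a no-op when none is given.
function Task(computation, cleanup) {
  this.fork = computation;
  this.cleanup = cleanup || function () {};
}

// Hypothetical bookkeeping, only here to make the cleanup call observable.
const cleaned = [];

const t = new Task(
  // The computation returns the timer id: that is the "resource".
  (reject, resolve) => setTimeout(() => resolve('never'), 1000),
  // cleanup receives whatever the computation returned.
  (timeoutId) => {
    clearTimeout(timeoutId);
    cleaned.push(timeoutId);
  }
);

const log = console.log.bind(console);

// Keep the value returned by fork...
const resources = t.fork(log, log);

// ...and hand it back to cleanup once the result is no longer wanted.
t.cleanup(resources);
```

Because the timer is cleared before it fires, neither callback is called and nothing is logged.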

robotlolita commented 8 years ago

Quoting from a previous discussion on Gitter:

[cleanup] allows the task to declare how to collect its resources if you want to cancel its effects.

Imagine you write something like: race(delay(10), timeout(20000)). The task on the left will always complete first, which means that the task on the right should be cancelled after 10 milliseconds. Since both delay and timeout create new timers in the process, it's important to remove those timers from the process so they can be GC'd; otherwise a program like race(delay(10), timeout(20000)).fork(::console.log, ::console.log) would always run for 20 seconds.

In the current version of Task, this works in an awkward way (it's going to be fixed in the redesign, to work like Siren's Task https://github.com/siren-lang/siren/blob/master/runtime/src/Concurrency.siren):

  • A Task is an object of type: Task(α, β): new (computation: (reject: (α → Unit), resolve: (β → Unit)) → γ)[, cleanup: (γ → Unit)]

    That is, it takes two functions. "computation" is a function that takes two functions as arguments (reject and resolve), which are invoked to provide a value for the completion of the task. "cleanup" is optional, and is described below.

  • computation is expected to return a value, of type γ. This value is referred to as "resources", it contains a reference to the resources that were created by the computation (if any).
  • If the computation creates resources and returns those resources, you need to collect them. This is the role of cleanup. It takes whatever was returned by computation and destroys those resources (assuming this can be done). The resource could be a file handle or a timer (as with delay and timeout above).

Given these, we can compose tasks with operations like race and know that our resources will be handled automatically and correctly for us (which is not something that can be done with Promises, for example). The awkward part is that, right now, the function running the Task (calling .fork) is the one that needs to deal with this. In the next version, Task will handle this automatically as well.
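
The resource doesn't have to be a timer. As a rough sketch of the same contract with a different kind of resource, here the computation registers an event listener and returns it, and cleanup removes it. The Task function is an assumed stand-in for `require('data.task')` so the snippet runs on its own, and nextEvent is a hypothetical helper, not something the library provides.

```javascript
const { EventEmitter } = require('events');

// Assumed stand-in for data.task's constructor, so the sketch runs without the library.
function Task(computation, cleanup) {
  this.fork = computation;
  this.cleanup = cleanup || function () {};
}

// Hypothetical helper: a task that resolves with the next `name` event on `emitter`.
function nextEvent(emitter, name) {
  return new Task(
    (reject, resolve) => {
      const handler = (value) => resolve(value);
      emitter.once(name, handler);
      // The listener is the resource (γ) created by this execution.
      return handler;
    },
    // cleanup receives that listener back and unregisters it.
    (handler) => emitter.removeListener(name, handler)
  );
}

const emitter = new EventEmitter();
const task = nextEvent(emitter, 'data');

const resources = task.fork(console.error, console.log);
console.log(emitter.listenerCount('data')); // 1

task.cleanup(resources);
console.log(emitter.listenerCount('data')); // 0
```

The emitter itself is shared and passed in from outside, so closing over it is fine; only the per-execution listener is returned as the resource.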

The control.async module defines a few functions that make use of this, and the Task#concat method is basically race, so reading their source code might give you a more practical view of how this happens:

So you get:

const Task = require('data.task');

function delay(time) {
  return new Task((reject, resolve) => {
    return setTimeout(_ => resolve(), time)
  }, clearTimeout);
}

function timeout(time) {
  return new Task((reject, resolve) => {
    return setTimeout(_ => reject(), time)
  }, clearTimeout)
}

function noop() { }

function race(a, b) {
  const resourcesA = a.fork(cleanup, cleanup);
  const resourcesB = b.fork(cleanup, cleanup);

  function cleanup() {
    // Since this might run before we create all resources
    setTimeout(_ => {
      a.cleanup(resourcesA);
      b.cleanup(resourcesB);
    });
  }
}
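
The race above only forks the two tasks. A sketch of a version that returns a new Task, so it can itself be forked and cleaned up, might look like the following. It is loosely modelled on the description of Task#concat in this thread rather than copied from the library, and the Task function is an assumed stand-in for `require('data.task')`.

```javascript
// Assumed stand-in for data.task's constructor, so the sketch runs without the library.
function Task(computation, cleanup) {
  this.fork = computation;
  this.cleanup = cleanup || function () {};
}

function delay(time) {
  return new Task((reject, resolve) => setTimeout(() => resolve(), time), clearTimeout);
}

function timeout(time) {
  return new Task((reject, resolve) => setTimeout(() => reject(), time), clearTimeout);
}

function race(a, b) {
  // The combined resource is a pair: one entry per underlying task.
  const cleanupBoth = ([resourcesA, resourcesB]) => {
    a.cleanup(resourcesA);
    b.cleanup(resourcesB);
  };

  return new Task((reject, resolve) => {
    let done = false;
    let resources;

    // Let only the first settlement through, then release both tasks' resources.
    const guard = (f) => (value) => {
      if (done) return;
      done = true;
      // Deferred, since a task may settle before `resources` has been assigned.
      setTimeout(() => cleanupBoth(resources));
      f(value);
    };

    resources = [
      a.fork(guard(reject), guard(resolve)),
      b.fork(guard(reject), guard(resolve)),
    ];
    return resources;
  }, cleanupBoth);
}

// delay(10) settles first, and the 20 second timer is then cleared.
race(delay(10), timeout(20000)).fork(
  () => console.log('timed out'),
  () => console.log('delay finished first')
);
```

Since the losing timer is cleared, the process can exit shortly after the first task settles instead of waiting for the longer timer.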

Note that you shouldn't store the resources in a lexical variable outside of the Task you're creating (this is currently a bug in control.async), since running the task twice, concurrently, would clobber the state and prevent some of the resources from being collected. Always return the resources from the Task's computation function instead. The following example shows what goes wrong otherwise:

function tempFile(data) {
  let filename;
  return new Task((reject, resolve) => {
    filename = generateTempFilename();
    writeFile(filename, data, (error) => {
      if (error)  reject(error);
      else        resolve(filename);
    });
  }, () => removeFile(filename));
}

function cleanup() {
  setTimeout(_ => {
    task.cleanup(resourceA);
    task.cleanup(resourceB);
  });
}

const task = tempFile(someData);
const resourceA = task.fork(cleanup, cleanup);
const resourceB = task.fork(cleanup, cleanup);
// Now the first temporary file is still in the file system
// because the `filename` for that particular execution of the
// task was lost.
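
For comparison, a sketch of the same task with the filename returned from the computation, so each execution keeps its own resource. The Task function is an assumed stand-in for `require('data.task')`, and generateTempFilename, writeFile and removeFile are hypothetical in-memory stand-ins for the helpers used above, so the snippet runs by itself.

```javascript
// Assumed stand-in for data.task's constructor, so the sketch runs without the library.
function Task(computation, cleanup) {
  this.fork = computation;
  this.cleanup = cleanup || function () {};
}

// Hypothetical in-memory stand-ins for the file helpers used above.
const files = new Map();
let counter = 0;
const generateTempFilename = () => `tmp-${++counter}`;
const writeFile = (name, data, callback) => { files.set(name, data); callback(null); };
const removeFile = (name) => files.delete(name);

function tempFile(data) {
  return new Task((reject, resolve) => {
    // Local to this execution, so concurrent runs cannot clobber each other.
    const filename = generateTempFilename();
    writeFile(filename, data, (error) => {
      if (error) reject(error);
      else resolve(filename);
    });
    // Returned as the resource, and later handed to cleanup.
    return filename;
  }, removeFile);
}

const noop = () => {};
const task = tempFile('some data');

// Two executions of the same task, each with its own file.
const resourceA = task.fork(noop, noop);
const resourceB = task.fork(noop, noop);

task.cleanup(resourceA);
task.cleanup(resourceB);
console.log(files.size); // 0
```

Both temporary files are removed, because each call to fork returned the filename for that particular execution.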

safareli commented 8 years ago

Thanks, that makes sense. I tried to read the Siren code but without any success :D What does the plan for the next version look like?

robotlolita commented 8 years ago

Creating and transforming Tasks stays the same:

const Task = require('folktale').data.task;

function delay(time) {
  return new Task((reject, resolve) => {
    return setTimeout(_ => resolve(), time)
  }, clearTimeout);
}

const helloIn10 = delay(10).map(v => "Hello!");

But instead of .fork() for running the tasks, you have a .run() method. The .run() method returns a TaskExecution object, which lets you cancel the task, or get the value of the task:

const execution = helloIn10.run();

// Cancelling a particular execution of a Task:
execution.cancel();

// Getting the value of executing a Task:
// (Future is a Monad, like Task, but it doesn't run any computation, 
// it just eventually provides a value)
execution.future().cata({
  Cancelled: _ => ...,
  Resolved: _ => ...,
  Rejected: _ => ...
});

// You can also use a promise, which lets you use async/await:
const value = await execution.promise();

TaskExecution objects take care of memoisation, providing futures, cancelling tasks, and collecting resources when the task finishes running or is cancelled, so this should make it easier for people to use Task.

safareli commented 8 years ago

Thanks for the explanation.