Closed: domenic closed this issue 8 years ago
Go's channels were mentioned as a modern incarnation of CSP: http://golang.org/doc/effective_go.html#channels
This is an example I have implemented in JS: https://gist.github.com/Gozala/7242467
Have you tried to use the same interface for writing and reading? I.e. the constructors take only buffering parameters or strategy objects, and pushing / getting-pulled / erroring by the source or sink are done through the same interface as what the BWS and BRS interfaces have now.
Yes. That's what the vision in my head is. And after talking to @Gozala I think that matches his thinking too.
Basically when you create a "Stream" you get a readable side and a writable side. Writing on the writable side would be equivalent to calling the [[push]] method.
We'd still need to have a way to implement different buffering strategies. But that should be doable by creating an interface specifically for buffering handling similar to what WritableStream currently has. @Gozala has some good ideas here.
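A minimal synchronous sketch of that vision (all names here, such as createStream, are hypothetical and not from any spec; a real design would be asynchronous and handle back pressure):

```javascript
// Hypothetical sketch: one internal queue, exposed as a writable side and a
// readable side. Writing on the writable side plays the role of [[push]].
function createStream() {
  const queue = [];
  let closed = false;
  const writable = {
    write(chunk) { if (!closed) queue.push(chunk); }, // ~ [[push]]
    close() { closed = true; }
  };
  const readable = {
    read() { return queue.length ? queue.shift() : undefined; },
    get done() { return closed && queue.length === 0; }
  };
  return { readable, writable };
}
```

The point is only that the producer holds the writable side, the consumer holds the readable side, and neither needs access to the other's internals.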
I did a brief survey of this. It seems to not be a great line of inquiry, although the impetus is in the right place. Let me explain.
Matches

Setting aside ReadableStream's start and pull constructor parameters, its push, close, and error parameters match fairly closely with WritableStream's write, close, and abort respectively.

Mismatches

- Readable streams use the start constructor parameter to ensure that they stay in a waiting state until any promise returned from it is complete. This asynchronous setup phase isn't something you could extract by passing a writable stream to the readable stream constructor, or vice-versa.
- ReadableStream's pull constructor parameter is called in reaction to specific events regarding the state of the internal stream, namely when the buffer is drained or the consumer calls wait(). A writable stream passed in to the readable stream's constructor has no way of receiving these notifications.
- WritableStream's write constructor parameter gets data "pushed" to it, along with the capabilities to indicate what happened with that data, via (data, done, error). I don't see a way to model that by passing in a readable stream without much more awkwardness, essentially forcing every writable stream creator to write a whole drain-then-wait loop inside a function that should (in my mind) just be concentrating on writing data to the underlying sink.
- ReadableStream's cancel constructor parameter, plus WritableStream's close and abort parameters, are defining reactions, and need to be implemented in a source- or sink-specific way; their semantics cannot be subsumed by passing streams to each other.

Resulting thoughts

The closest we could get is passing a WritableStream-like thing to ReadableStream's start and pull parameters. Let's see where that leads. You would write pull({ write, close, abort }) instead of pull(push, close, error); or, given that parameters are freely renamable, you could always call the latter pull(write, close, abort). Receiving a { write, close, abort } object would make me think it operates on some kind of writable stream. But there is no writable stream to be found---we're dealing with the readable stream itself. So the difference between ({ write, close, abort }) and (push, close, error) is mostly superficial.

But what about revealing the write side while keeping the read side? That means that in order to transfer data to the underlying sink, you need to do a read-drain-wait loop. Since everyone now needs to do this, we might as well take care of it for them, and abstract it into an easy utility function built into the constructor. Oh, but now we have the equivalent of WritableStream's write(data, done, error) constructor parameter. Hmm.

This immediately raises a number of issues, e.g. how do you represent a read-only file? The usual way is to vend only the read capabilities from the object. But the best pattern we have for doing that in JS is the revealing constructor pattern. This seems likely to lead us right back to where we are now, in circles.
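A toy sketch can make the superficial equivalence of the two parameter shapes concrete (makeReadable and makeReadableFromSinkShape are invented names, and real streams are asynchronous; this is an illustration, not the spec API):

```javascript
// Toy readable-stream constructor whose pull function has the current
// shape: pull(push, close, error).
function makeReadable(pull) {
  const buffer = [];
  let done = false;
  pull(
    chunk => buffer.push(chunk), // push
    () => { done = true; },      // close
    err => { throw err; }        // error
  );
  return { buffer, done };
}

// The { write, close, abort } shape is just a renaming / repackaging of
// the same three capabilities:
function makeReadableFromSinkShape(pull) {
  return makeReadable((push, close, error) =>
    pull({ write: push, close: close, abort: error }));
}
```

Either shape hands the source author the same three capabilities; only the spelling differs.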
I agree that it seems essential to have read only and write only interfaces for the system edges use case.
As for the channel approach, you could have the reader and writer channels be separate.
function makeSocketStream(host, port) {
  var rawSocket = createRawSocketObject(host, port);
  var [readable, writable] = Channel();
  rawSocket.ondata = chunk => {
    writable.push(chunk);
  };
  rawSocket.onend = writable.close;
  rawSocket.onerror = writable.error;
  return readable;
}
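The Channel() above is left undefined; here is a minimal synchronous sketch of what it might look like (hypothetical; a real channel would return promises and exert back pressure):

```javascript
// Minimal sketch of a one-way channel: the writable side gets
// push/close/error, the readable side gets take and the channel state.
function Channel() {
  const buffer = [];
  let state = "open"; // "open" | "closed" | "errored"
  const writable = {
    push(chunk) { if (state === "open") buffer.push(chunk); },
    close() { if (state === "open") state = "closed"; },
    error(e) { state = "errored"; writable.reason = e; }
  };
  const readable = {
    take() { return buffer.shift(); },
    get state() { return state; }
  };
  return [readable, writable];
}
```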
Here's an example from Ruby's IO pipe.
>> rd, wr = IO.pipe
=> [#<IO:fd 10>, #<IO:fd 11>]
>> wr.write "foo"
=> 3
>> rd.read_nonblock(10)
=> "foo"
Right, which gets us right back to the equivalent of the old promise "deferred" pattern, with no constructors in sight. Not so great, especially combined with the other drawbacks (e.g. the awkwardness of how you have to read from the read-side and then manually buffer until your underlying sink is able to accept data.)
What's wrong with the deferred pattern?
var { input, output } = Channel()
seems reasonable.
I think there are tons of options. My favorite one is suggested by @Raynos above. I don't think the analogy with the deferred pattern is quite relevant. This API is significantly different in both what it does and what it represents.
Alternatively channel could play role of pipe and also expose read / write ports as separate objects if desired:
var channels = new WeakMap()

function Port(channel) {
  channels.set(this, channel)
}
Port.prototype.close = function() {
  return channels.get(this).close()
}

function InputPort(channel) {
  Port.call(this, channel)
}
InputPort.prototype = Object.create(Port.prototype)
InputPort.prototype.constructor = InputPort
InputPort.prototype.take = function() {
  return channels.get(this).take()
}

function OutputPort(channel) {
  Port.call(this, channel)
}
OutputPort.prototype = Object.create(Port.prototype)
OutputPort.prototype.constructor = OutputPort
OutputPort.prototype.put = function(value) {
  return channels.get(this).put(value)
}

var inputs = new WeakMap()
var outputs = new WeakMap()

function Channel() {
  // ....
}
Channel.prototype = {
  constructor: Channel,
  put: put,   // channel-internal put, implementation elided
  take: take, // channel-internal take, implementation elided
  get input() {
    if (!inputs.has(this))
      inputs.set(this, new InputPort(this))
    return inputs.get(this)
  },
  get output() {
    if (!outputs.has(this))
      outputs.set(this, new OutputPort(this))
    return outputs.get(this)
  }
}
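To make the port pattern concrete, here is a self-contained toy version that fills in the elided channel body with a plain synchronous buffer, using plain properties instead of WeakMap bookkeeping for brevity (an illustration only, not the reference implementation):

```javascript
// Toy channel: a synchronous buffer plus lazily created, memoized
// input/output ports that each expose only half of the channel's API.
function Channel() {
  this.buffer = [];
}
Channel.prototype.put = function(value) { this.buffer.push(value); };
Channel.prototype.take = function() { return this.buffer.shift(); };
Object.defineProperties(Channel.prototype, {
  input: {
    get() {
      if (!this._input) this._input = { take: () => this.take() };
      return this._input;
    }
  },
  output: {
    get() {
      if (!this._output) this._output = { put: v => this.put(v) };
      return this._output;
    }
  }
});
```

Note that `ch.input` returns the same memoized port on every access, so port identity is stable, as in the WeakMap version.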
I think @josh shows a good pattern. I don't care much if we use [readable, writable] = Channel() or { readable, writable } = Channel().
I definitely think that we need a one-way Channel primitive. We might also want something which allows two-way communication, but let's do that on top of the one-way Channel.
Essentially we want the Channel to work as a queue. By default it's likely a queue that can only hold 1 value before it signals back pressure. I.e. as soon as it gets its first value it'll ask the writer to hold off on providing more data (though it'll still accept the data if written to of course).
But then we should allow passing in other buffering strategies to the Channel constructor. These strategies should have the ability to simply count the number of values held by the buffer, or the total number of bytes, or the total .length, or some such.
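One possible shape for such a strategy object, as a sketch (countStrategy, byteStrategy, shouldApplyBackPressure, and makeQueue are invented names, not from any spec):

```javascript
// Count-based strategy: signal back pressure after `max` queued values.
function countStrategy(max) {
  return {
    size(chunk) { return 1; },
    shouldApplyBackPressure(total) { return total >= max; }
  };
}

// Byte-based strategy: signal back pressure after `max` total bytes
// (here approximated by string/array length).
function byteStrategy(max) {
  return {
    size(chunk) { return chunk.length; },
    shouldApplyBackPressure(total) { return total >= max; }
  };
}

// A queue that consults its strategy after each put; a false return value
// asks the writer to hold off, though the data is still accepted.
function makeQueue(strategy) {
  let total = 0;
  const items = [];
  return {
    put(chunk) {
      items.push(chunk);
      total += strategy.size(chunk);
      return !strategy.shouldApplyBackPressure(total);
    },
    take() {
      const chunk = items.shift();
      if (chunk !== undefined) total -= strategy.size(chunk);
      return chunk;
    }
  };
}
```

With `countStrategy(1)` this reproduces the default described above: the queue signals back pressure as soon as it holds one value.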
But then we should allow passing in other buffering strategies to the Channel constructor.
To be clear, approaches such as these would not use constructors.
I still haven't seen anyone address how awkward it would be to write code that puts data in the underlying sink. It largely defeats the purpose of a streaming abstraction if you have to do that yourself. It would be helpful for someone to illustrate how they imagine this example working.
To be clear, approaches such as these would not use constructors.
Let me expand on this. It reveals the fundamental problem with the deferred-esque pattern.
In the code
var { input, output } = Channel(); // probably more properly `channel()`
Channel is not a constructor, but instead a factory function.
What are input and output? Well, they have methods, and we probably don't want copies of those methods on every instance of them, so they must be instances of some prototype, e.g. WritableStream.prototype and ReadableStream.prototype.
But how did they get created in the first place? The natural answer, given the prototypes in play, is via the constructors, var input = new WritableStream() and var output = new ReadableStream(). Furthermore, whoever constructed them must have access to their internals, since the person constructing them hooks up their relationship together. How did they get access to those internals? The two possible answers are: (a) "C++ browser magic," which is an answer we try to avoid these days (e.g. it makes our JS-hosted reference implementation impossible); and (b) via the revealing constructor pattern, or some variant of it.
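The revealing constructor pattern in (b) can be shown in miniature (ToyReadable is an invented stand-in; the real ReadableStream constructor is far more involved):

```javascript
// Revealing constructor: the internals (push/close capabilities) exist
// only as arguments handed to the function the creator passes in; they
// are never exposed as public methods on the resulting object.
function ToyReadable(start) {
  const queue = [];
  let closed = false;
  start(
    chunk => queue.push(chunk), // push capability, revealed only here
    () => { closed = true; }    // close capability, revealed only here
  );
  this.read = () => queue.shift();
  this.isClosed = () => closed && queue.length === 0;
}
```

Consumers of the resulting object get only the read capabilities; only the code inside start can ever push, which is exactly how a read-only vend works.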
So again, we come right back to our current design. After this circumlocution, we see that Channel is actually a higher-level object than the ReadableStream + WritableStream combination: it abstracts away the manner in which you connect those two constructors to each other in a particular case. In fact, the particular case Channel embodies is a no-op transform stream---making Channel just a subset of #20, which we've had planned for a while!
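A no-op transform in this sense can be sketched as follows (transform and ChannelAsTransform are invented names; real transform streams are asynchronous and exert back pressure):

```javascript
// A transform pairs a writable side with a readable side through a
// transform function applied to every chunk.
function transform(transformFn) {
  const queue = [];
  const writable = { write(chunk) { queue.push(transformFn(chunk)); } };
  const readable = { read() { return queue.shift(); } };
  return { writable, readable };
}

// A channel is then just the identity-transform special case.
function ChannelAsTransform() {
  return transform(chunk => chunk);
}
```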
I still haven't seen anyone address how awkward it would be to write code that puts data in the underlying sink. It largely defeats the purpose of a streaming abstraction if you have to do that yourself. It would be helpful for someone to illustrate how they imagine this example working.
Have you looked at my fork of example.md? I believe it illustrates the same example. I do plan on changing a few things though, to better support the sync read use case.
To be clear, approaches such as these would not use constructors.
Let me expand on this. It reveals the fundamental problem with the deferred-esque pattern.
In the code
var { input, output } = Channel(); // probably more properly `channel()`
Channel is not a constructor, but instead a factory function.
I don't agree with this statement, if you take a look either at my reference implementation or my example in previous comment it's clearly not a factory.
What are input and output? Well, they have methods, and we probably don't want copies of those methods on every instance of them, so they must be instances of some prototype, e.g. WritableStream.prototype and ReadableStream.prototype.
My comment above used InputPort and OutputPort as prototypes for the relevant ports; the same is true for the reference implementation.
But how did they get created in the first place? The natural answer, given the prototypes in play, is via the constructors, var input = new WritableStream() and var output = new ReadableStream(). Furthermore, whoever constructed them must have access to their internals, since the person constructing them hooks up their relationship together. How did they get access to those internals? The two possible answers are: (a) "C++ browser magic," which is an answer we try to avoid these days (e.g. it makes our JS-hosted reference implementation impossible); and (b) via the revealing constructor pattern, or some variant of it.
I think you make it sound very complicated while it's not: all the input / output ports need is access to the take / put queues and the buffer. So anyone could create input / output ports.
The Channel constructor just creates input / output ports that share the same buffer and that enqueue / dequeue operations into the same queue.
There are many ways this can be expressed in JS; you can take a look at the reference implementation for one example of this.
So again, we come right back to our current design. After this circumlocution, we see that Channel is actually a higher-level object than the ReadableStream + WritableStream combination: it abstracts away the manner in which you connect those two constructors to each other in a particular case. In fact, the particular case Channel embodies is a no-op transform stream---making Channel just a subset of #20, which we've had planned for a while!
The difference is that Channel takes care of the state machine that Readable / Writable streams currently force users to deal with. I do believe that put / take on the pipe is a lot simpler and easier to understand than the multitude of private / public APIs that streams currently impose.
I would also argue that research from back in the 70s being adopted by new languages like Go, Rust, and Clojure is good proof that this idea has something to it.
I do believe that put / take on the pipe is a lot simpler and easier to understand than the multitude of private / public APIs that streams currently impose.
This seems to me to indicate that it would be a useful API to wrap true ReadableStream and WritableStream instances, to provide something simpler for those that don't need the fine-grained control we have shown to be necessary for I/O in Node, and would prefer a strategy based on research papers.
and would prefer a strategy based on research papers.
This is based on the CSP research paper, which proves that this minimal API is enough to express all of that. And it's not based only on a paper: many modern languages have adopted this channel interface.
Again, @Raynos and I are working on providing examples of every single concern that may arise with such an API, but in order to keep this constructive it would be useful to illustrate actual issues; saying that this is a factory pattern and therefore bad does not really help.
As far as I understand, Channel replaces ReadableStream and WritableStream completely.
I will work on porting the Stream examples to channels, especially the Writable ones.
The current design has Readable and Writable streams, but their constructors take objects which provide two more interfaces---respectively, for putting data into the readable, and for using the data in the writable. Is it possible this could be reduced to only two?
I am told CSP channels (see #88) are basically this idea.
/cc @gozala @sicking