Two "streams" confusion

domenic commented 8 years ago

With this change the Encoding Standard will have two concepts of "stream": https://encoding.spec.whatwg.org/#concept-stream and a TransformStream.

I see three ways out of this:

1. Rename the Encoding Standard's concept to something like "token sequence" and update all existing uses.
2. Try to consolidate everything into the TransformStream model internally.
3. Create some underlying concept that both can share, such that TransformStream is one implementation of the concept and TextEncoder/TextDecoder are another.

I think I lean toward the "token sequence" approach. The Encoding Standard's concept is treated quite differently from streams in general: it allows both prepending and appending, uses EOS instead of { value, done }, and works on abstract single bytes and single code points instead of JS Uint8Arrays and strings (each of which can represent many characters). This fits with my preference, expressed in https://github.com/whatwg/encoding/issues/72#issuecomment-251408394, for a clean break between the "token sequence APIs" (TextEncoder/TextDecoder) and the TransformStream APIs (TextEncoder.stream/TextDecoder.stream).
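
To make the difference concrete, here is a rough TypeScript sketch of the two shapes being contrasted. Every name in it (ByteToken, ChunkResult, toByteTokens) is hypothetical and comes from neither standard; it only assumes that a streams-style read yields whole chunks with a done flag, while the Encoding Standard's algorithms see one byte at a time followed by an explicit end marker.

```typescript
// Illustrative only: these names are hypothetical, not taken from either standard.

// A token in the Encoding Standard sense: one abstract byte, or an end marker.
type ByteToken = number | "end-of-stream";

// Shape of a streams-style read result: a whole chunk plus a done flag.
interface ChunkResult {
  value?: Uint8Array;
  done: boolean;
}

// Flatten chunked read results into one token per byte,
// terminated by an explicit end-of-stream token.
function toByteTokens(results: ChunkResult[]): ByteToken[] {
  const tokens: ByteToken[] = [];
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    if (result.done || result.value === undefined) break;
    for (let j = 0; j < result.value.length; j++) {
      tokens.push(result.value[j]);
    }
  }
  tokens.push("end-of-stream");
  return tokens;
}

// The UTF-8 bytes of U+20AC, split across two chunks.
const chunkResults: ChunkResult[] = [
  { value: new Uint8Array([0xe2, 0x82]), done: false },
  { value: new Uint8Array([0xac]), done: false },
  { done: true },
];
const byteTokens = toByteTokens(chunkResults);
```

Chunk boundaries disappear at the token level, which is part of why the two concepts do not map onto each other one-to-one.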

@annevk should probably weigh in on this.

ricea commented 8 years ago

I've been thinking about this with no clear solution.

My feeling is that the encoder and decoder algorithms in the standard operate in a virtual machine that is independent of JavaScript and is concisely described in section 4. Introducing JavaScript concepts at that level would increase the amount of knowledge needed to understand the algorithms.

"stream" is a nice short name for the thing which the algorithms operate on, but having two different concepts called a "stream" in the same standard is clearly not a good idea. "token sequence" works for me.

What I've tried to do so far is always refer to the JavaScript things by their full names (ReadableStream, TransformStream), and reserve the lower case "stream" for the concept in the Encoding Standard. I'll continue to do this until we have some resolution on this issue.

annevk commented 8 years ago

I don't think I care strongly how we solve this. It's probably a good idea if those implementing encoder/decoder algorithms don't need to know about JavaScript (streams). Although on the other hand, in Fetch we now have the situation where that is the case. Maybe we should keep token sequences and JavaScript streams separate there too.

ricea commented 6 years ago

I don't know if anyone already suggested this, but "token queue" seems to fit the semantics pretty well. "read" is a pop operation, "push" is, well, a "push" operation, and "prepend" is an "unpop" operation.
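
A minimal TypeScript sketch of those semantics, to show how the three operations line up with a queue. The TokenQueue class and its method names are hypothetical illustrations, not text from the standard; the sketch assumes that reading from an empty queue returns an end marker.

```typescript
// Illustrative only: TokenQueue is a hypothetical name, not from the standard.
type Token = number | "end-of-stream";

class TokenQueue {
  private items: number[];

  constructor(initial: number[] = []) {
    this.items = initial.slice();
  }

  // "read" is a pop from the front; an empty queue yields the end marker.
  read(): Token {
    const item = this.items.shift();
    return item === undefined ? "end-of-stream" : item;
  }

  // "push" appends to the back.
  push(token: number): void {
    this.items.push(token);
  }

  // "prepend" is an "unpop": it puts a token back at the front.
  prepend(token: number): void {
    this.items.unshift(token);
  }
}

const tokenQueue = new TokenQueue([0x41, 0x42]);
const firstToken = tokenQueue.read(); // pops 0x41
tokenQueue.prepend(0x41);             // unpop: 0x41 is at the front again
tokenQueue.push(0x43);                // append to the back

// Drain the queue until the end marker appears.
const drained: number[] = [];
let current = tokenQueue.read();
while (current !== "end-of-stream") {
  drained.push(current);
  current = tokenQueue.read();
}
```

The prepend operation is what lets a decoder look at a token, decide it belongs to the next character, and put it back, which a plain one-way stream does not offer.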

annevk commented 6 years ago

Yeah. Streams is also more abstracted now, right, so you can use the underlying concepts without having to allocate JavaScript objects? But if that gets too complex I'm also happy with a queue or some such. I need to look into https://github.com/whatwg/html/issues/1077#issuecomment-214780498 again to figure out what kind of changes are required to adequately address that.

annevk commented 6 years ago

Filed https://github.com/whatwg/encoding/issues/128.

ricea / encoding-streams

Two "streams" confusion #1