nodejs / node-eps

Node.js Enhancement Proposals for discussion on future API additions/changes to Node core

EP-003: stream.Readable#readInto #27

Closed indutny closed 6 years ago

indutny commented 8 years ago

See: https://github.com/nodejs/node/pull/6923

indutny commented 8 years ago

cc @nodejs/collaborators @nodejs/streams

trevnorris commented 8 years ago

Totally dig the idea.

indutny commented 8 years ago

So, does it mean that I need to change "DRAFT" to "ACCEPTED"? What is required for this?

trevnorris commented 8 years ago

@indutny Well, if it were possibly considered controversial then it would need to be voted on by the CTC. I would at least like to see more of the CTC give their thoughts before it's ACCEPTED. Meaning, if an EP reaches ACCEPTED then the PR accompanying it will definitely be accepted, and there shouldn't be much if any discussion between CTC members about the PR landing (this doesn't include other devs who feel the need to chime in).

indutny commented 8 years ago

@trevnorris makes sense. Thank you!

indutny commented 8 years ago

cc @nodejs/ctc then

jasnell commented 8 years ago

Fwiw, I'm +1 on this.

chrisdickinson commented 8 years ago

Who is expected to call ._readInto? .read(), or a new public .readInto()? Does this interact with .pipe?

chrisdickinson commented 8 years ago

(Also: is it possible that we define this for a subset of streams — maybe just net.Socket?)

indutny commented 8 years ago

@chrisdickinson

Who is expected to call ._readInto? .read(), or a new public .readInto()? Does this interact with .pipe?

.readInto() calls ._readInto() if present; .read() also calls it if ._read() is not present. It interacts with .pipe() in much the same way readable streams work right now.
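The dispatch described above can be sketched roughly as follows. `ReadableSketch` and its method bodies are illustrative assumptions, not the actual `stream.Readable` implementation from the proposal:

```javascript
// A minimal sketch of the dispatch @indutny describes: the public
// readInto() delegates to a subclass's _readInto(), and read() falls
// back to _readInto() through a scratch buffer when _read() is absent.
class ReadableSketch {
  constructor(impl) {
    Object.assign(this, impl);
  }

  // Public readInto() delegates to the subclass's _readInto() when present.
  readInto(buf, offset, length) {
    if (typeof this._readInto === 'function')
      return this._readInto(buf, offset, length);
    throw new Error('_readInto() not implemented');
  }

  // read() uses _read() when defined; otherwise it allocates a scratch
  // buffer and reads into that instead.
  read(size) {
    if (typeof this._read === 'function')
      return this._read(size);
    const buf = Buffer.alloc(size);
    const n = this.readInto(buf, 0, size);
    return buf.subarray(0, n);
  }
}
```

Under this sketch, a source that defines only `_readInto()` still works through the familiar `.read()` path, while `.readInto()` gives callers the zero-allocation variant.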

indutny commented 8 years ago

@chrisdickinson

(Also: is it possible that we define this for a subset of streams — maybe just net.Socket?)

Personally, I don't have much preference, but it seems better to have wider support for it.

benjamingr commented 8 years ago

This is something I would have used plenty of times (readInto). I'm +1 on this and it would make my life easier.

The proposal itself sounds reasonable to me. I'd make some stylistic changes to the code samples in the proposal but that's really not important.

mcollina commented 8 years ago

@indutny just have a section clarifying the relationship with pipe.

bnoordhuis commented 8 years ago

This proposal still has the issue that you need at least as many buffers as you have streams, doesn't it?

Ideally, it would be possible to have a single buffer and read into that on demand. E.g.:

const streams = [ /* ... */ ];
const buffer = Buffer.alloc(256);

for (const stream of streams) {
  stream.on('readable', () => {
    const n = stream.readInto(buffer, 0, buffer.length);
    if (n < 0)
      handleError(stream, n);
    else
      handleData(stream, buffer, n);
  });
}

rvagg commented 8 years ago

It's such a leaky abstraction tho, I'm not really keen on increasing the complexity of streams tbh and this certainly does that. I'd prefer if this were something we could keep private in core for its own use until we're comfortable that the wins are big enough to share with users.

mcollina commented 8 years ago

@rvagg that will be hard, mainly because it will probably leak into readable-stream at the next release. My idea is to flag it as experimental and, if it proves too complex to maintain, kill it in the next major.

My concern is mainly that this does not currently apply to pipe(), which is where the "global" benefits would come from.

I'll be more than happy to "port" some of my MQTT stuff to readInto to check if I get benefits.

trevnorris commented 8 years ago

@rvagg What's unfortunate is that the idea is conceptually simple. Streams are what make everything else more complicated. I'd like to see this in core. Creating many buffers for incoming packets is the limiting throughput factor for TCP I/O.

@bnoordhuis That's sort of the API I was hoping for. Problem is the 'readable' event only fires once data has already been read in. So the Buffer that will hold the data needs to be supplied before the 'readable' event fires.

I see the flow going something like so:

Well, that's a rough outline of what I was thinking.

jorangreef commented 6 years ago

Any progress on this? This would be really awesome.

mscdex commented 6 years ago

I'm still very interested in this as well.

FWIW I just wrote my own implementation that is strictly limited to sockets and bypasses streams altogether to make things simple and straightforward (although the 'end' event should still be emitted during socket close). This means that the socket is either always using a single Buffer during its lifetime or it's always in "malloc mode." A Buffer and a callback are both needed to activate the new behavior. I see almost 2x throughput in benchmark/net-s2c.js with this new mode.

It works like this:

const socket = net.connect({
  port: 12345,
  buffer: Buffer.alloc(65535),
  onread: (nread, buffer) => {
    // buffer argument === buffer property above

    // return `false` explicitly to stop reading, `socket.read(0)`
    // or similar should restart it
  }
});
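One caveat worth noting with this single-buffer mode (an editorial addition, not part of the prototype above): because the same Buffer is reused for every read, any bytes that must outlive the callback have to be copied out. A self-contained illustration:

```javascript
// `shared` stands in for the `buffer` option above; `onread` mimics the
// callback's (nread, buffer) signature from the prototype.
const shared = Buffer.alloc(8);

function onread(nread, buffer) {
  // Keeping `buffer.subarray(0, nread)` would retain a view into `shared`,
  // which the next read overwrites. Buffer.from() takes a private copy.
  return Buffer.from(buffer.subarray(0, nread));
}

shared.write('first');
const kept = onread(5, shared);

shared.write('next!'); // a second read clobbers the shared buffer
console.log(kept.toString()); // prints 'first' — the copy survived
```

This copy-on-retain discipline is the price of the allocation savings: the fast path stays allocation-free only for consumers that process each read before the next one lands.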

Example "malloc mode" result:

net/net-s2c.js dur=5 recvbuf=0 type="buf" len=102400: 15.772152211480213

Example new mode result:

net/net-s2c.js dur=5 recvbuf=65535 type="buf" len=102400: 28.297843930773467

mcollina commented 6 years ago

I would love to have something like this implemented. I just do not see how we can achieve that with the current stream API. I'm open to suggestions.

This might be relevant to the discussion: https://github.com/mcollina/syncthrough. It's like Transform but sync (it respects the Streams 3 contract, apart from the buffering). Because it's sync, we can guarantee that the chunk has been processed afterwards.

mscdex commented 6 years ago

Perhaps we will have to just skip streams to make it simple and/or doable. I think starting with net.Socket will go a long way, as there are lots of existing modules that could benefit just from that (e.g. database drivers).

jorangreef commented 6 years ago

Thanks @mscdex, would you be able to provide me with a copy of that code (even very raw)? I don't mind doing more work to open source it.

mscdex commented 6 years ago

@jorangreef I just pushed what I currently have to https://github.com/mscdex/io.js/tree/net-socket-static-buffer. I'm sure some of the C++ changes could be designed better but meh...

Trott commented 6 years ago

Closing as this seems inactive. By all means, comment or re-open if this is still an ongoing concern, although I suspect it should be moved to a more active repository (probably the main node repo) if that's the case.