scylladb / seastar

High performance server-side application framework
http://seastar.io
Apache License 2.0
8.28k stars 1.54k forks source link

seastar pipe missing data #170

Open veteranlu opened 8 years ago

veteranlu commented 8 years ago

pipe implements by queue, it return a promise when the queue is full, if I push n(n>1) data to the fully queue, it just save the last one's promise, and the last one's promise would set value, the data we push to the queue except the last one would never notify any more, in other word, they are missing.

nyh commented 8 years ago

This is a correct observation. The pipe class internal documentation (in pipe.hh or doxygen) explains that:

The pipe is single-reader single-writer, meaning/ that until the future returned by read() is fulfilled, read() must not be called again (and same for write).

Basically, the purpose of the pipe is to introduce some sort of flow-control between two fibers (chains of continuation), one producing data and one consuming it. When the consumer can't keep up, and the pipe becomes full, we need the producer to stop producing, so we give the producer a non-ready future as the result of write(), and he is supposed to wait for it before initiating another write.

By the way, a related documented restriction is that:

The pipe reader and writer are movable, but not copyable.

So normally, you won't even have access to one pipe writer from multiple fibers.

It is interesting to consider whether we can relax the "single producer" assumption, and concede that even though we expect each producer to wait after blocking on a write, there may be several independent producers and we wish to allow all of them to block on a write() and wake up separately as space becomes available on the queue. Definitely doable. The biggest question is whether we can do this without losing performance or gaining too much complexity.

Another option is to keep a pipe as a single-producer feature (and uncopyable), and introduce a new feature (pipe writer multiplexor?) which multiplexes several pipe writers into one copyable pipe writer. This is similar to what we have done in shared_future, which multiple fibers can make a copy of and use.

There haven't been too many uses of seastar::pipe until now, so definitely we can still rethink its design and its purpose and what additional or changed abstractions can be useful for Seastar applications.

P.S. even in the current single-producer implementation, we can make it more helpful in helping the seastar::pipe user catch mistakes in the way it is used. For example, queue<T>::not_empty() and not_full() could assert (or throw an exception) if the promise is already set - instead of happily overwriting it with a new promise.