rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.93k stars 12.53k forks source link

Tracking issue for channel selection #27800

Closed alexcrichton closed 5 years ago

alexcrichton commented 9 years ago

This is a tracking issue for the unstable mpsc_select feature in the standard library. Known points:

BurntSushi commented 6 years ago

@gterzian My suspicion is that folks will probably want to coalesce around crossbeam-channel. I'd say your first task would be convincing folks that crossbeam-channel is somehow insufficient or worse than a different implementation. I'd probably start by prototyping your desired API outside of std.

What is the story behind there not being a path towards stabilization?

Because folks have been concentrating their efforts on experimenting and standardizing on APIs outside of std since there is no compelling reason (as of yet) for these APIs to actually be in the standard library. (You might disagree with what qualifies as a "compelling reason," but I'm not trying to be prescriptive or opinionated here. I'm more or less trying to be descriptive based on what I perceive the consensus to be. Others may feel free to chime in on that point though!)

SimonSapin commented 6 years ago

The Stable Rust ecosystem now has a channel library that supports channel selection, with in my opinion a superior API. Why is std::sync::mpsc worth spending effort on? Why is it important to be in the standard library? There are many other crates that are important to the ecosystem (for some idea of "important") but yet are not part of the standard library.

ssokolow commented 6 years ago

My main concern is that having std::sync::mpsc in std to satisfy API stability requirements but without a corresponding stable select API in std will continue to cause misconceptions like RMS's.

(ie. People thinking "They clearly intend to have some kind of threaded equivalent to what POSIX offers with fork() and pipes, but the language is still immature enough that a corresponding select isn't offered.")

ghost commented 6 years ago

@gterzian

It isn't a mpmc or some other more complicated flavor of channel. To be convinced, just google "Go concurrency patterns".

Perhaps this is not needed in Servo, but a lot of people have been requesting support for multiple consumers. They typically just do Mutex<Receiver<T>>, which is unergonomic and slow.

Using a set of spsc channels, you can build your own spmc one(Incidentally, it seems that such spsc use of std::sync::mpsc is also the most efficient way to send messages these days).

Simulating a SPMC channel using SPSC channels is not easy (unless if you meant "broadcast" channels, akin to the bus crate). Which consumer should a message be sent to? Maybe a random one, but what if some consumers consume messages faster than others?

Also, take a look at the updated benchmark results here.

Is Select currently really that bad, or missing features?

crossbeam-channel's select macro has a safe interface, the macro produces nice error massages, can receive from a dynamic set of receivers, and is faster than mpsc::Select.

Who needs to select over a send, when it can never block?

Send can block in bounded channels (mpsc::sync_channel), so this is a nice to have feature.

It is playing a big part in making logic like this easier to understand.

Consider this part in Servo where a message is selected from five channels. The equivalent code using crossbeam-channel would be:

let mut event = select! {
    recv(self.port, msg) => FromScript(msg.unwrap()),
    recv(self.control_port, msg) => FromConstellation(msg.unwrap()),
    recv(self.timer_event_port, msg) => FromScheduler(msg.unwrap()),
    recv(self.image_cache_port, msg) => FromImageCache(msg.unwrap()),
    recv(self.devtools_chan.map(|_| &self.devtools_port), msg) => FromDevTools(msg.unwrap()),
};

This is arguably much nicer and contains no unsafe code. :)

And regarding Sender or even Receiver needing to be Sync, it appears to me that in fact almost any communication pattern with a std::sync::mpsc can be implemented without the need for those two to be Sync

Unfortunately, sometimes it's not so easy.

For example using a Receiver as an iterator, allowing a component to 'quit' simply when all it's senders have been dropped elsewhere. To see how hard that is to do with a mpmc channel, just read this blog.golang.org/pipelines (here is my version ).

Yes, that's very nice and crossbeam-channel can do this too. Probably the only reason why Go doesn't close channels automatically is because it lacks RAII. It doesn't have anything to do with channels being MPMC.

Couldn't we move toward a stable Select, as opposed to the direction of deprecation? I guess there are perhaps some problems with the implementation, and I'm basically arguing in long form that the API is just fine.

That's been the plan for the past several years, but basically no progress has been made so far. And the reason is that it'd be very difficult to tweak the current implementation to expose a satisfying Select interface (ideally without unsafe).

I my experience, Select has been reliable and useful. Were there specific problems that haven't been documented so far?

Tricky bugs are occasionally reported in std::sync::mpsc - here's one currently open issue. These kinds of bugs are typically very difficult to track down. One commenter said:

(And I can't help but note that it is not very confidence inspiring safety-wise for would-be newcomers to find crashes in the stdlib unfixed for a long time...)

Now, crossbeam-channel hasn't been battle-tested as much, but it's a clean slate implementation, has fewer unsafe blocks, and goes to great length to ensure correctness. We currently have these tests:

This currently amounts to 1200 tests total, and we're writing more.

So essentially, if Select were to ditch the unsafe add, and improve the wait api, it could be moved into stable?

Yes, probably. Care will have be taken not to hurt performance when removing unsafe functions.

What is the story behind there not being a path towards stabilization? It certainly doesn't seem to be due to a lack of brains in the community writing channel implementations...

It's just really difficult to design a nice, safe interface and make channels fast at the same time. It might be even more difficult now than is necessary due to certain (historical) design decisions in the internal implementation of std::sync::mpsc.

In order to stabilize Select, I think we'd have to put in so much work that it simply wouldn't be justified given the fact we have crates with alternatives.

ExpHP commented 6 years ago

Beyond any issues with Select (which I wouldn't know about), there are also issues with the macro.

It looks like match syntax, but:

But moreover there is the fundamental issue that this is unusually elaborate syntax for a std macro. It feels like something that should be in another crate.

RalfJung commented 6 years ago

My main concern is that having std::sync::mpsc in std to satisfy API stability requirements but without a corresponding stable select API in std will continue to cause misconceptions like RMS's.

The best solution to this problem is deprecation, isn't it?

BatmanAoD commented 6 years ago

@SimonSapin That makes more sense -- I was confused by the use of the word "never", and this issue was linked in a post I was reading (by an author who apparently had the same misunderstanding) as support for a claim that there probably wouldn't ever be a stable version of Select:

Note: the ‘Select’ feature is not stable, perhaps never will be, see https://github.com/rust-lang/rust/issues/27800.

https://medium.com/@polyglot_factotum/rust-concurrency-patterns-communicate-by-sharing-your-sender-11a496ce7791

ssokolow commented 6 years ago

@RalfJung I'm uncertain.

If I were to use something with multiple parties on either end, I'd probably use crossbeam-channel so I didn't have to hack something up for the SPMC direction but it'd still be nice to have a high-performance SPSC channel in std, especially when it's currently the fastest option in that use case.

Perhaps just deprecating the multi-consumer aspect of it and offering an un-deprecated version under a new name?

ghost commented 6 years ago

@ssokolow

so I didn't have to hack something up for the SPMC direction but it'd still be nice to have a high-performance SPSC channel in std, especially when it's currently the fastest option in that use case.

I've managed to close this performance gap in crossbeam-channel - see the newest benchmarks.

SimonSapin commented 6 years ago

@BatmanAoD Well, it is also a possible future that Select will be deprecated, eventually removed, and never replaced in the standard library.

@ssokolow Re fastest, I think that’s with an older version of crossbeam-channel. @stjepang has linked above to updated benchmarks here: https://github.com/crossbeam-rs/crossbeam-channel/tree/master/benchmarks#results

gterzian commented 6 years ago

@stjepang Thank you, finally someone who like to write as much as I do.

Simulating a SPMC channel using SPSC channels is not easy (unless if you meant "broadcast" channels, akin to the bus crate). Which consumer should a message be sent to? Maybe a random one, but what if some consumers consume messages faster than others?

I'm less concerned about underlying implementation details, vs conceptual behavior. So when using std::sync::mpsc, having a set of SPSC channels simulate an SPMC doesn't need to involve any sort of underlying implementation guarantees. I don't need a 'fair' SPMC or something like that, just iteratign over the set of Sender in some way and sending them work will do.

Perhaps this is not needed in Servo, but a lot of people have been requesting support for multiple consumers. They typically just do Mutex<Receiver>, which is unergonomic and slow.

Couldn't such a handrolled SPMC channel described above, be already a giant leap forward compared to sharing a Mutex<Receiver<T>> to each worker? Would it really take you that much further to use a dedicated flavor of channel?

And could it be better to use std::sync::mpsc in various ways, as opposed to always have to wonder 'do I need an <insert flavor of channel> here, which crate gives me the right implementation?'.

In some ways, that is the power of a standard library, being able to use some conceptual building blocks that fit most use cases, and who can be 'pretty well' optimized all-round.

This article articulates this idea better than me: https://medium.com/@chenglou/use-simple-data-structures-b081e5689f9


With regards to the issue with hyper, and as you've highlighted elsewhere, also in Servo with the WorkletExecutor.

The problem here is more subtle, and it goes to the heart of how channels are used and "abused"(or let's say just used outside of their quintessential use case).

The idea of channels is that you're going to move the Sender, or a clone thereof, to a dedicated thread using it to communicate with the receiving thread. That matches the idea of 'one component, one thread', and it's very much a Send(thread::spawn only requires Send) way of doing things, not Sync.

Now, if you're going to to have some kind of 'handler' to be run as a task on a threadpool, and shared as a reference among threads, you're going to end-up having to wrap a few things inside of it, including perhaps a Sender, wrapped inside locks. You're doing Sync type of stuff, which isn't the quintessential use case for channels.

It might be a good thing to explicitly have to wrap the Sender in a lock in such cases, and I'm surprised by the expectation that Sender needs to be Sync in order to avoid doing so.

If you're building a kind of runtime for user-code to run on a threadpool, you can also avoid complaints from users about such locks by providing your own channel.

For example in the hyper case, the simplest form could look like a struct HyperChan(Mutex<Sender<T>>) that would internally do a self.0.lock().unwrap().send(msg).

Et voila, no more complaints about the need for mutexes around senders in hyper handlers...


(On iterating on Reiceiver, and 'quitting' when senders are dropped) Probably the only reason why Go doesn't close channels automatically is because it lacks RAII. It doesn't have anything to do with channels being MPMC.

Glad to hear that.

This currently amounts to 1200 tests total, and we're writing more.

Much appreciated.

That's been the plan for the past several years, but basically no progress has been made so far. And the reason is that it'd be very difficult to tweak the current implementation to expose a satisfying Select interface (ideally without unsafe).

Ok, thanks for the info.

Tricky bugs are occasionally reported in std::sync::mpsc - here's one currently open issue.

I will check it out, thanks.

(On the select macro of crossbeam) This is arguably much nicer and contains no unsafe code. :)

Yes I agree, it certainly looks nicer.

Send can block in bounded channels (mpsc::sync_channel), so this is a nice to have feature.

Yes I understand, and what I meant is rather that just having a select over a set of receivers is "good enough", especially assuming the unbounded variant sees more usage.

Also, take a look at the updated benchmark results here.

Yes I've seen those, and was surprised that the winner seems to be std::sync::mpsc using Sender without clones(in spsc flavor), right?

I've managed to close this performance gap in crossbeam-channel - see the newest benchmarks.

Great.


All the efforts being made on crossbeam surely make it a very high-quality crate and something that could be used instead of std::sync::mpsc, but what is the end-goal here? Adding crossbeam to the standard library, or not have any implementation in the standard library? I come from the Python world so I tend to be biased towards putting a lot of value on stuff being in the standard library...

Besides this bias, a standard library results in 'shared vocabulary' across the language, which for the most basic building blocks is a good thing, even for a systems language like Rust.

std::sync::mpsc exists today mostly because it was inherited from once upon a time before Rust 1.0, when Cargo or the Send trait did not exist yet. When green threads and channels were considered a core part of Rust’s concurrency story and were both built into the language.

std::sync::mpsc doesn't imply any sort of runtime in the context of their use, they're a good companion to std::thread, and an alternative to the locking primitives found in std::sync for certain use cases. The current mpsc implementation might also be the most versatile flavor of channel, a good building block for various use cases, and therefore great as 'shared vocabulary' to be found in the standard lib.

However, losing Select would remove a key feature from it, which would indeed result in users migrating to a crate. We might as well remove the whole std::sync::mpsc at that point, and that would feel like making the messaging approach to concurrency to be a second-class citizen in Rust(why have mutex in the standard lib, and be adding Future to it, while removing a good and versatile channel implementation? I'm sure people could replace their use of mutex with more efficient and more ergonomic alternatives found in crates, yet removing it from the standard lib would also feel like a real loss to the entire language).

@BurntSushi I think I understand that you mean the library team prefers to not have implementations inside the stdlib anymore?

@SimonSapin has it already been considered to standardize a set of Traits via the standard library? Similar to what is being done with Future? I can imagine a Selectable trait(maybe also a Sender and Receiver)?

The benefit would obviously be that implementation details could change and evolve, yet users would be guaranteed a kind of stability of API across time and crate? Perhaps this could open up some opportunities for interop between various implementations?

ssokolow commented 6 years ago

@stjepang Nice! Thanks for that. :)

ghost commented 6 years ago

@gterzian

You're implying that std::sync::mpsc was intentionally designed the way it was because we really wanted that particular design. That's just not true.

std::sync::mpsc is a MPSC channel primarily because MPMC would require locks (that'd make it slow) or deferred memory reclamation (difficult problem - a viable solution appeared only a few years ago).

The reason why Sender doesn't implement Sync yet is because it's difficult (mainly because unbounded channels switch between three internal flavors: oneshot, spsc, and shared). We really want to make it implement Sync, just don't know how.

Interestingly, SyncSender does implement Sync, and that's only because it uses locks behind the scenes so that made it easy. Locks also make the channel very slow, and the reason why we use locks is because nobody has put in the effort of improving the implementation yet.

A lot of design choices behind std::sync::mpsc were the result of unfortunate technical difficulties so we had to make tradeoffs. The channels are not optimized as well as they could be and they don't fit all the use cases we want them to. See the discussion here for more information.

Besides this bias, a standard library results in 'shared vocabulary' across the language, which for the most basic building blocks is a good thing, even for a systems language like Rust.

I agree with this, but...

We already have an established 'shared vocabulary' across the ecosystem and that seems to be working very well: serde is for serialization, rayon is for data parallelism, tokio is for async IO, http is for HTTP, crossbeam is for concurrency, rand is for random number generation, etc.

Adding dependencies to Rust programs is easy. As soon as we get Rust 2018, adding a dependency will become as simple as cargo add crossbeam. That means there's not as strong motivation for moving stuff into the standard library as in other languages. The aforementioned crates are not exactly second-class citizens to Rust - they've been widely accepted as the standard set of basic tools.

gterzian commented 6 years ago

@stjepang Thanks, it's very interesting to read more about the background of std::sync::mpsc.

You're right that I don't actually know that much about the original thinking behind the design and the implementation travails of std::sync::mpsc, so I should rather explicitly say that I'm trying to document what I see as a form of customary usage, mainly from observations in Servo, which might have grown organically over time and not reflect the initial design considerations for channels.

This usage, as I see it, requires in the default case only Send channels, for example in

  1. the set of 'webpage event-loops' managed by Servo is represented by a HashMap<TopLevelBrowsingContextId, HashMap<Host, Weak<EventLoop>>>, where EventLoop is actually a wrapper around a IpcSender<ConstellationControlMsg>.
  2. The EmbedderProxy, again a wrapper around a Sender.
  3. In general a lot of the attributes on the constellation are a lot of plain Sender and Receiver(with ipc variants), highlighting that the constellation is indeed the "grand central dispatch" of Servo...

On the other hand, the ScriptThread has a few ScriptChan traits wrapped inside a Box, I think to support the various task_source.

And incidentally we can even find some Arc<Mutex<IpcSender>> like in the CSSErrorReporter

The whole gamut is really there, and I'd say it doesn't feel weird to see a Sender wrapped explicitly inside a Box or Arc<Mutex<>> for specific use cases.

In the place where all the various components are created on start-up, you can see a lot of channels being created and moved around, without any of them wrapped in an Arc.

From this usage I observe that the default use case for channels seems to be allowing fairly 'heavy-weight' components, living in their own thread or process, to communicate via message sending, and receiving those via some sort of run loop. This usage fits the Send-only aspect of Sender.

Occasionally, you will also see Sender passed around as a share reference across threads, usually inside a broader structure, and used as some sort of 'one-shot' signalling device as part what we could call a "cross-thread callback" of sorts. To support these use-cases, the Sender, or the structure holding one, will often have to be wrapped in an Arc<Mutex<>>. However this doesn't create expectations, to me at least, that Sender would need to be Sync by default. I personally like the explicit mutex for such specific use cases, making it clearer to a reader of the code what is going on.


We already have an established 'shared vocabulary' across the ecosystem and that seems to be working very well

Thanks for sharing that, I think I can now appreciate better that the standard library in Rust isn't as important as in some other languages, like Python, and that even if std::sync::mpsc were to be entirely deprecated, Rust would still have a vibrant 'shared vocabulary' covering many different approaches to concurrency.

I look forward to using crossbeam in Servo when that PR merges, and getting to know some of those different flavors of channels. Maybe there is something I can do to help fix those last remaining failing wpt tests...

SimonSapin commented 6 years ago

With https://github.com/servo/servo/pull/21325 merged (thanks @stjepang and @gterzian!), Servo does not use this feature anymore.

Do we know of other projects that do? Would it be useful to do a crater run (with something like renaming the feature flag) to find out?

@rust-lang/libs How do you feel we should proceed with this feature?

alexcrichton commented 6 years ago

FWIW https://github.com/rust-lang/rust/pull/54228 seems like a good way to proceed. I think if that's small enough we can probably just outright delete the feature, and if it's large we can do a round of messaging with a schedule to delete.

SimonSapin commented 6 years ago

I’m in favor of deleting eventually, but since that part of the code isn’t being actively worked on there isn’t much to gain by deleting sooner rather than later. We can keep it with a deprecation warning for a few release cycles, just in case there are users not covered by Crater to leave them some more time to migrate. (This migration took a while to land in Servo.)

I my opinion #54228 can help figure out if we want to keep delaying even the deprecation warning like we did so far.

SimonSapin commented 6 years ago

Crater has found three regressions. One spurious, one that I submitted a PR to fix, and one in a benchmark measuring specifically the performance of the select! macro: https://github.com/rust-lang/rust/pull/54228#issuecomment-421759312

So let’s formally propose:

@rfcbot fcp close

rfcbot commented 6 years ago

Team member @SimonSapin has proposed to close this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot commented 6 years ago

:bell: This is now entering its final comment period, as per the review above. :bell:

rfcbot commented 5 years ago

The final comment period, with a disposition to close, as per the review above, is now complete.

SimonSapin commented 5 years ago

Next is landing a deprecation warning.

SimonSapin commented 5 years ago

Triage: this unstable feature was deprecated in 1.32, so it is fine to remove any time.

ghost commented 5 years ago

I published a blog post titled Proposal: New channels for Rust's standard library discussing what to do with channels in the standard library.

How should we proceed about resolving this issue?

Bug #39364 is still present, which @stepancheg attempted to fix in PR #42883 but got stuck waiting on removal of mpsc_select. Perhaps now is a good time to remove mpsc_select and revive that PR.

Another option is to switch the whole implementation to https://github.com/stjepang/new-channel. As a bonus, we get performance improvements, impl Sync for Sender, SyncSender::send_timeout, more tests, lower compilation time, and fewer unsafe blocks.

Some people have suggested we just deprecate the whole mpsc module and leave channels out of the standard library.

Finally, the last option is to write a RFC for std::sync::channel that would eventually replace std::sync::mpsc. This would also allow us to fix #42883 and bring all the other improvements to std::sync::mpsc in the meantime.

SimonSapin commented 5 years ago

Oh I didn’t realize there was already work blocked on the removal of selection. If you want to send a removal PR feel free to ping me for review. We’re already good to go process-wise.

Replacing the internals of std::sync::mpsc and/or deprecating it is also interesting, but not relevant to this issue. It should probably be discussed on internals.rlo and then turned into an RFC.

ghost commented 5 years ago

@stepancheg Would you be interested in giving PR #42883 another shot and removing selection along the way?

For followers of this thread, here's the the last status report on the bug fix.

theronic commented 5 years ago

What's the succession story for waiting on two channels and taking from the first one that yields a message?

SimonSapin commented 5 years ago

Consider using another library such as https://crates.io/crates/crossbeam-channel

havelund commented 5 years ago

How can one have channels and no selection between channels in the core language.

steveklabnik commented 5 years ago

Channels are not in the core language; they are in the standard library.

havelund commented 5 years ago

Sure, but still: being able to select which channel to read from seems to be a fundamental feature that should follow a message passing paradigm based on channels. I tried using mutexes and that was a disaster since they are not reentrant.

BurntSushi commented 5 years ago

@havelund As linked above, use a library: https://crates.io/crates/crossbeam-channel

havelund commented 5 years ago

Yes thanks. I found it. It seems to work fine.

thesoftwarephilosopher commented 4 years ago

For what it's worth, I ended up at this issue because I was trying to find out what happened to the select! macro in std::sync::mpsc which would be useful to me right about now. And only through this issue did I find out about crossbeam-channel which I'm switching to for its select!. Would be nice to have something like that in the standard library.

dralley commented 4 years ago

Is there still any discussion about deprecating std::sync::mpsc in favor of something based on crossbeam-channel? I haven't seen any discussion since just after the blog post came out.

SimonSapin commented 4 years ago

std::sync::mpsc (or at least its public API) is here to stay since it’s #[stable]. At most we could make its usage emit deprecation warnings, but what positive outcome would that achieve? That would be to balance against ecosystem churn.

jonhoo commented 4 years ago

@SimonSapin I read @dralley's proposal as replacing the internals, not the API, but could be wrong?

dralley commented 4 years ago

Not my proposal just to be clear, I'm just curious as to whether any action items had ever come out of that discussion.

acestronautical commented 4 years ago

As someone who came to love Go, coded in it professionally for almost a year, and subsequently fell out of love with it (generics, performance, package management, code gen hell), the lack of Select in the standard library in Rust is what is preventing me from adopting it.

If I need high performance and I'm going to have to rely on external code for concurrency support anyways, I might as well go with the path of least resistance and write in C.

BurntSushi commented 4 years ago

@Ace-Cassidy You just need to add crossbeam-channel = "0.4" to your Cargo.toml and you should be good to go.

ssokolow commented 4 years ago

@Ace-Cassidy Rust is designed to have a minimal standard library so that different components can evolve their APIs and bump their versions independently from each other.

You'll also need external crates (maintained by Rust team members) for things like random number generation (rand), regular expression support (regex), or pre-built bindings for the parts of the POSIX API that don't have equivalents on Win32 (libc).