Open ghost opened 6 years ago
Before I can come up with any useful opinion, I think I need to learn more about Rust's module system. I'd like to ask a Rust question. Say a user imported `crossbeam_queue` and `crossbeam_stack`, both of which should have `Atomic`, `Ptr`, ... in their own namespace. Can the Rust compiler deduce that `crossbeam_queue::Atomic` and `crossbeam_stack::Atomic` are equal?
Both `crossbeam-queue` and `crossbeam-stack` would pull in `crossbeam-epoch` as a dependency. If they pull in the same version of `crossbeam-epoch`, then the `Atomic` types are equal. Different versions of the same crate are like totally different crates.
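To make the type-identity point concrete, here is a minimal sketch that uses plain modules as stand-ins for crates. The module names (`epoch`, `epoch_other_version`, `queue`, `stack`, `deque`) and the toy `Atomic` definition are assumptions for illustration only, not the real crossbeam API. Re-exports of one definition name the same type, while a second definition with an identical shape is a distinct type, much like a second version of a crate would be.

```rust
use std::any::TypeId;

// Stand-in for one resolved version of `crossbeam-epoch` (hypothetical).
mod epoch {
    pub struct Atomic<T>(pub Option<Box<T>>);
}

// Stand-in for a second, incompatible version of the same crate.
mod epoch_other_version {
    #[allow(dead_code)]
    pub struct Atomic<T>(pub Option<Box<T>>);
}

// Both "crates" re-export `Atomic` from the same `epoch`.
mod queue {
    pub use super::epoch::Atomic;
}
mod stack {
    pub use super::epoch::Atomic;
}

// A "crate" built against the other version.
mod deque {
    pub use super::epoch_other_version::Atomic;
}

/// Returns true when the two type parameters name the same type.
fn same_type<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}

/// Accepts the queue's `Atomic`; the stack's `Atomic` is the same type.
fn takes_queue_atomic(a: queue::Atomic<u32>) -> Option<u32> {
    a.0.map(|boxed| *boxed)
}

fn main() {
    // A value built through `stack` is accepted where `queue::Atomic` is expected.
    let from_stack = stack::Atomic(Some(Box::new(7u32)));
    assert_eq!(takes_queue_atomic(from_stack), Some(7));

    // Same definition behind both paths: equal types.
    assert!(same_type::<queue::Atomic<u32>, stack::Atomic<u32>>());
    // Different definitions, even with the same name and shape: distinct types.
    assert!(!same_type::<queue::Atomic<u32>, deque::Atomic<u32>>());
}
```

Passing a `deque::Atomic<u32>` to `takes_queue_atomic` would be rejected at compile time with a mismatched-types error, which is roughly what happens when two dependencies resolve to incompatible versions of a shared crate.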
If this sounds like the chances of having different versions of the same crate are too high, note that Cargo helps a bit here. If you specify a dependency as `crossbeam-epoch = "1.2.3"`, Cargo treats it as a caret requirement, so the version pulled in will be the highest available `1.X.Y` that is at least `1.2.3`.
For example, if `crossbeam-queue` depends on `crossbeam-epoch = "1.2.3"`, `crossbeam-stack` depends on `crossbeam-epoch = "1.2.6"`, and the newest `1.X.Y` version is `1.2.7`, then they will both pull in version `1.2.7` and use the same `Atomic` type.
But if one crate depends on `1.2.3` and the other on a semver-incompatible version such as `2.0.1`, then they will use different `Atomic` types.
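As a rough sketch of when two requirements can share one version, the code below models Cargo's default caret rule for full `major.minor.patch` requirements. The function names are made up for illustration, and the real resolver also handles pre-releases, other operators, lockfiles, and partial versions, none of which are modelled here.

```rust
/// A semantic version as (major, minor, patch).
type Version = (u64, u64, u64);

/// Exclusive upper bound of the caret range that starts at `v`.
/// The leftmost non-zero component is the one that may not change.
fn caret_upper_bound(v: Version) -> Version {
    match v {
        (0, 0, patch) => (0, 0, patch + 1),
        (0, minor, _) => (0, minor + 1, 0),
        (major, _, _) => (major + 1, 0, 0),
    }
}

/// True if some version could satisfy both caret requirements.
fn can_share_version(a: Version, b: Version) -> bool {
    let lower = a.max(b);
    let upper = caret_upper_bound(a).min(caret_upper_bound(b));
    lower < upper
}

/// Picks the highest published version satisfying both requirements, if any.
fn resolve(a: Version, b: Version, published: &[Version]) -> Option<Version> {
    let lower = a.max(b);
    let upper = caret_upper_bound(a).min(caret_upper_bound(b));
    published
        .iter()
        .copied()
        .filter(|v| *v >= lower && *v < upper)
        .max()
}

fn main() {
    // Hypothetical set of published versions.
    let published = [(1, 2, 3), (1, 2, 6), (1, 2, 7), (2, 0, 1)];

    // "1.2.3" and "1.2.6" unify to the newest compatible release.
    assert_eq!(resolve((1, 2, 3), (1, 2, 6), &published), Some((1, 2, 7)));

    // "1.2.3" and "2.0.1" cannot share a version, so two copies are built.
    assert!(!can_share_version((1, 2, 3), (2, 0, 1)));
    assert_eq!(resolve((1, 2, 3), (2, 0, 1), &published), None);

    // Before 1.0, a minor bump is already incompatible.
    assert!(!can_share_version((0, 2, 0), (0, 3, 1)));
}
```

When the requirements cannot be unified, Cargo builds both versions side by side, and types from one copy do not match types from the other.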
`sync::atomic` vs. `atomic`: I prefer the former, following the design of `std`. I don't see a strong reason not to follow it for the time being.
For `crossbeam-utils`: what if we put everything in `crossbeam::utils`? For example, `crossbeam::utils::scoped` and `crossbeam::utils::cache_padded`.
I'd like to put solid, one-size-fits-all implementations in `crossbeam`. For example, I believe we can come up with implementations of the list, stack, queue, and deque that fit 95% of use cases, and I'd like to put them in `crossbeam`. If we can come up with a solid hash table, which I doubt we can do in the near future, we can put it in `crossbeam`, too.
I think the majority of users can use just `crossbeam`. On the other hand, embedded systems or very low-level libraries, such as memory allocators, can use `crossbeam-epoch` and `crossbeam-X`.
For what it's worth, I think I prefer e.g. `crossbeam::Queue` over `crossbeam::sync::Queue` (eventually `crossbeam::collections::Queue`, but this might be too many modules?).
Crossbeam is all about sync, so I feel having a `sync` module is kind of redundant, especially if collections, channels, etc. are placed in the `sync` module.
It would be my guess that most users will use the data structures, channels, etc. and I think we should optimize the module layout for that, which, in my head, means having them close to the root module.
That said, I'm a beginner at both concurrent programming and project planning, so what do I know :smile:
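For reference, the two layouts are not mutually exclusive in Rust: a type can live in a nested module and also be re-exported at the root. The sketch below is purely illustrative. `crossbeam_sketch` is a hypothetical stand-in for the crate, and `Queue` is a mutex-protected placeholder rather than a lock-free queue.

```rust
// Hypothetical layout; the module and type names are illustrative only.
mod crossbeam_sketch {
    pub mod sync {
        use std::collections::VecDeque;
        use std::sync::Mutex;

        /// Placeholder queue: a mutex-protected `VecDeque`, not a lock-free one.
        pub struct Queue<T> {
            inner: Mutex<VecDeque<T>>,
        }

        impl<T> Queue<T> {
            pub fn new() -> Self {
                Queue { inner: Mutex::new(VecDeque::new()) }
            }

            pub fn push(&self, value: T) {
                self.inner.lock().unwrap().push_back(value);
            }

            pub fn pop(&self) -> Option<T> {
                self.inner.lock().unwrap().pop_front()
            }
        }
    }

    // Re-export at the root so `crossbeam_sketch::Queue` also works.
    pub use self::sync::Queue;
}

fn main() {
    // The short path and the long path name the same type.
    let short: crossbeam_sketch::Queue<i32> = crossbeam_sketch::Queue::new();
    short.push(1);
    let long: &crossbeam_sketch::sync::Queue<i32> = &short;
    assert_eq!(long.pop(), Some(1));
    assert_eq!(long.pop(), None);
}
```

Whether to expose both paths or only the short one is exactly the design question being discussed here; the sketch only shows that the module tree does not force the choice.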
Hey there,
I think splitting this up is not a good idea.
We require crossbeam to eventually become system-level packaged in an RPM else we are unable to use it in our application. I think that splitting this up into many smaller pieces creates a complexity and a confusion about what pieces are needed, and a barrier for system-level packaging.
I think it's better to have a single, cohesive package of structures and components that are really well tested together, rather than many moving parts. Moving parts make it harder to contribute, understand and follow, whereas a single repository is a nice one-stop place for a contributor or user to go to, and then easy to distribute further.
I hope this helps,
@Firstyear
Crossbeam aims to be the equivalent of `java.util.concurrent` written in Rust, more or less. As you can see, this Java package assembles a long list of semi-related data structures, and there is a lot of code in it.
We see Crossbeam not as one humongous crate, but instead as a project/organization that focuses on building a variety of tools (data structures, synchronization primitives, etc.) for concurrent/parallel programming. The `contain-rs` project is structured the same way.
The idea is to have a separate self-contained crate for each tool, or for each group of closely related tools. The most commonly used tools (scoped threads, epoch-based reclamation, channels, and probably a few others) will be collected together into the `crossbeam` crate.
The whole `crossbeam` crate will then consist of just:

```rust
extern crate crossbeam_epoch;
extern crate crossbeam_channel;
// ...

// Re-export the epoch-based reclamation API under `crossbeam::epoch`.
pub mod epoch {
    pub use crossbeam_epoch::{pin, unprotected, Guard};
    pub use crossbeam_epoch::{Collector, Handle};
    pub use crossbeam_epoch::{Atomic, Owned, Ptr};
    pub use crossbeam_epoch::CompareAndSetOrdering;
}

// Re-export the channel API under `crossbeam::channel`.
pub mod channel {
    pub use crossbeam_channel::{bounded, unbounded};
    pub use crossbeam_channel::{Sender, Receiver};
    // ...
}

// ...
```
If you need a common data structure, you can just use `crossbeam`. But if you want something more exotic (a Bw-Tree or something like that), you'll have to reach for `crossbeam-something-exotic`. Or, if you're building a memory allocator (see elfmalloc) and don't want to pull in the whole crate as a dependency, you can choose to depend on `crossbeam-epoch` only.
> We require crossbeam to eventually become system-level packaged in an RPM else we are unable to use it in our application. I think that splitting this up into many smaller pieces creates a complexity and a confusion about what pieces are needed, and a barrier for system-level packaging.
I don't know what your RPM packaging process is, but what exactly is the barrier to packaging the `crossbeam` crate? Do you have to manually download all its dependencies and include them in the package? If so, is there not an automatic way to do that?
> I think it's better to have a single, cohesive package of structures and components that are really well tested together, rather than many moving parts.
This is what `crossbeam` will be. Currently, we still don't have that many moving parts, but next year we'll start going in different directions and have a ton of unrelated (mostly advanced or experimental) `crossbeam-*` crates. Whenever such a small crate becomes stable enough, we'll bless it and include it in the main `crossbeam` crate.
> Moving parts make it harder to contribute, understand and follow, whereas a single repository is a nice one-stop place for a contributor or user to go to, and then easy to distribute further.
This is a valid concern, but perhaps we can alleviate the problem by clearly explaining the overall structure of the project in the readme?
> This is a valid concern, but perhaps we can alleviate the problem by clearly explaining the overall structure of the project in the readme?
In my opinion, it's quite hard to maintain multiple inter-related repos on GitHub, and in consequence, almost all "big" projects hosted on GitHub somehow invented a methodology to manage multiple repos [citation needed..?]. Writing guides in `README.md` is obviously a good start. We already have quite sophisticated project management systems, including the RFC process. Using these tools, I believe we will adequately manage the Crossbeam sub-projects.
On the other hand, I think Crossbeam will not be a "big" project, e.g. one consisting of a million LOC, and one repo is enough to host all the Crossbeam subprojects. Each of the monorepo's top-level directories may represent a crate, as done in https://github.com/redox-os/tfs . For this reason, I'm sympathetic to the concerns @Firstyear raised. But we already created several repos :) And I don't see a big benefit of removing all these repos and using a monorepo.
tl;dr: I agree with @stjepang. Let's use multiple repos.
@cuviper I was told you might be interested in this discussion.
Do you have an opinion on whether we should split Crossbeam into multiple smaller crates or have one large one?
In Fedora, we're packaging at the crate level, as published on crates.io, so having a shared repo or separate repos doesn't change anything. And we have about 200 crates packaged already, so I don't see that it makes much difference whether crossbeam is one crate or a handful. The thing that does cause headaches is if there are circular dependencies, which sometimes arise through dev/build deps -- please avoid this!
More generally, I have experience with `num` split into multiple crates in a single repo. (`rayon` too, to a lesser extent.) I know a lot of users jumped on that when it became available, especially to grab just `num-traits` without worrying about the rest. I don't really know why that's more appealing vs. managing features though... 🤷‍♂️
I find it a little annoying to manage, but that may also be in part because I'm pretty much the only person maintaining it. If your project structure can better separate concerns, and especially if you have different people owning the different parts, then separate crates and repos make a lot of sense to me.
So for clarity:
Thanks,
@Firstyear
[…] `crossbeam` crate.
That second point is the important one I think. It really needs to expose all the required parts. Like I think it would be complex to have a crate for 'crossbeam' and 'crossbeam-extras' or something.
So long as it stays as "one crate" in the end, then I'm happy with this :)
However, if it's "one crate" then why do we need to split it up at all if it's "one project". Is there really a measurable benefit at that point?
Thank you!
> However, if it's "one crate" then why do we need to split it up at all if it's "one project". Is there really a measurable benefit at that point?
- […] `crossbeam` as a huge dependency, but only want `crossbeam-epoch`.
- […] `crossbeam-epoch` be compatible with `#![no_std]`, and separating it out makes that easier.
- […] `crossbeam`. This is the same idea as behind Rayon's split into `rayon` and `rayon-core`, so that there is one default thread-pool across all versions of Rayon.
- […] `crossbeam`.
- […] `AtomicOption` and `ArcCell`), scoped threads, epoch-based GC, HP-based GC in the future, channels, Chase-Lev deque, multiple kinds of queues in the future, etc.

But this comes back to: Do you then have an rpm for crossbeam, and an rpm for crossbeam-epoch? Do they become separate crates? If this happens it creates barriers to adoption and packaging.
Second, is it really worth micro-optimising? We are not talking about a library with hundreds of thousands of lines, but merely a few kB. In fact, it's about 44 kB of source files, which means that splitting it up would save only a few kB in the compiled library.
Rather than becoming a series of "micro dependencies" like npm, (which is a fragile nightmare IMO), we should have a series of "robust modules", which do a collection of things well. If you want crossbeam epoch, you get crossbeam, and you deal with that.
Consider Python: when you type "import os" so you can get "os.path", you are pulling in a reasonably sized dependency, but it's part of a coherent, well-tested unit that's easy to import and potentially redistribute.
I'm okay with the "many git repos under a single crate" idea, but I just don't want to see this become a mess of crates that people can't distribute in other formats (ie rpm).
I hope that helps,
@Firstyear Sorry to be contrary, but I really don't see why you think having many crates is an issue for rpm. Most crates already have many dependencies, so whether some of these happen to all come from the same `crossbeam-rs` org is not really relevant.
If some other rpm package wants to use the main `crossbeam` meta-crate, that's fine -- they'll `Require: crate(crossbeam)` and all the sub-crates will get pulled in as transitive dependencies.
> Rather than becoming a series of "micro dependencies" like npm, (which is a fragile nightmare IMO), we should have a series of "robust modules", which do a collection of things well.
This feels like you're railing against the crates.io ecosystem as a whole! For better or worse, such single-purpose crates are common. Hopefully it won't get `leftpad`-bad, although some jokers do exist...
The main `crossbeam` crate is going to be an umbrella crate that brings the most important pieces of the Crossbeam project together and reexports them. I've been thinking about what it should look like. Here are some quick ideas...
First, `crossbeam` depends on `crossbeam-epoch` and reexports the crate as:
Then we have several atomic types, but I'm unsure if they should live in `sync::atomic` or just `atomic`. The former is more consistent with the standard library, though.
There's also a bunch of data structures:
Finally, some utilities:
But, instead of just shoving utilities into the crate root, we could organize them into submodules:
So the questions we need to answer are:
- […] `crossbeam` and what needs to be left outside? When should a Rust programmer reach for `crossbeam-X` instead of `crossbeam`?
- […] `std` or come up with our own?