Consider reverting the merge of collections into alloc

brson commented 7 years ago

This PR merges the collections crate into the alloc crate, with the intent of enabling this PR.

Here are some reasons against the merge:

It is a violation of layering / seperation of responsibilities. There is no conceptual reason for collections and allocation to be in the same crate. It seems to have been done to solve a language limitation, for the enablement of a fairly marginal feature. The tradeoff does not seem worth it to me.
It forces any no_std projects that want allocation to also take collections with it. There are presumably use cases for wanting allocator support without the Rust collections design (we know the collections are insufficient for some use cases).
It gives the std collections crate special capabilities that other collections may not implement themselves - no other collections will be able to achieve feature parity with the conversion this merger is meant to enable.

Personally I value the decomposition of the standard library into individual reusable components and think the merge is moving in the wrong direction.

I am curious to know what embedded and os developers think of this merge cc @japaric @jackpot51 .

alexcrichton commented 7 years ago

@brson I'm not sure I agree with the reasons you laid out. In my mind liballoc provides the ability to depend on dynamic memory allocation, but nothing else. On top of that one assumption you can build up the entire crate (pointers + collections) with no more assumptions than libcore already has. In that sense I don't see this as a violation of layering but rather just giving you one layer and then everything else that it implies.

It forces any no_std projects that want allocation to also take collections with it.

Is this a problem? Linkers gc'ing sections should take care of this? Isn't this equivalent to libcore bringing utf-8 formatting when you otherwise don't need it?

It gives the std collections crate special capabilities that other collections may not implement themselves

Do you have an example of this?

japaric commented 7 years ago

I'm in favor of undoing the merge.

In general, I'd prefer not to reduce the number of crates in the std facade. I'm afraid that these seemingly "small" changes may eventually become hard, if not impossible, to undo once stable std API depends on these crates being merged (cf. HashMap confined to std even though it doesn't depend on OS mechanisms). And that may lock us from creating a set of stable fine grained standard / blessed crates for no-std development.

In the particular case of alloc / collections it feels odd to have a full suite of collections available to me when I'm implementing the Allocator trait. Things are likely to go wrong if I use those collections in the implementation of my allocator, right? I think allocators are not supposed to do any dynamic allocation for their internal data structures.

Personally, I'd prefer if alloc only contained the Allocator interface and Arc was in another crate since it depends on dynamic allocation.

cc @Ericson2314 @whitequark

alexcrichton commented 7 years ago

If we're going to separate this crate then I think we need to be super clear about why. For example the previous incarnation of alloc I don't think made any sense to keep collections separate because it already had smart pointers and such. If we wanted to remove everything then we should just move the Alloc trait to core in my opinion, not make a whole new crate for just a trait. The purpose of alloc is to provide the assumption of a global allocator, notably the Heap type and value.

Things are likely to go wrong if I use those collections in the implementation of my allocator, right?

Yes, and to me this is not a reason to separate the crates. This is already a "problem" in the standard library where the "fix" is otherwise too unergonomic to work with.

est31 commented 7 years ago

It gives the std collections crate special capabilities that other collections may not implement themselves - no other collections will be able to achieve feature parity with the conversion this merger is meant to enable.

If cargo had the ability to run with multiple different forks of std see this RFC, this would be much less of an issue.

brson commented 7 years ago

Is this a problem? Linkers gc'ing sections should take care of this? Isn't this equivalent to libcore bringing utf-8 formatting when you otherwise don't need it?

It is a problem. I do not think depending on linker to throw away unwanted code is a reasonable excuse to make unsatisfactory architectural decisions. If this were an out-of-tree project where people have to spend CPU time compiling code, then telling people to just merge your abstraction layers, compile it all, and throw most of it away later, would not be acceptable.

Do you have an example of this?

Yes. The entire motivation for this merge is to follow it up with this PR, which is seemingly not possible without the collections being literally in the alloc crate. No other collections crate can achieve this.

For example the previous incarnation of alloc I don't think made any sense to keep collections separate because it already had smart pointers and such.

This was indeed a weakness of the previous alloc crate. I'd love to pull the smart pointers out if possible, and would love to not make the conflation worse.

If we wanted to remove everything then we should just move the Alloc trait to core in my opinion, not make a whole new crate for just a trait

I don't think this follows. Each layer in the facade adds new capabilities. Alloc is a crucial capability that separates different classes of Rust software.

RalfJung commented 7 years ago

I'd love to pull the smart pointers out if possible, and would love to not make the conflation worse.

So merging alloc and collections happened to be able to impl<T> From<Vec<T>> for Arc<[T]>. If instead Arc (and Rc) moved to collections, the impl would also be fine -- right?

sfackler commented 7 years ago

If the goal is to minimize library code bloat, should we move to libbtree_map, libbtree_set, libvec, libvec_deque, libbinary_heap, etc?

Ericson2314 commented 7 years ago

@brson thanks for reopening discussion, and @japaric thanks for the ping.

Personally I value the decomposition of the standard library into individual reusable components and think the merge is moving in the wrong direction.

Exactly. I have a few reasons why I think combining facade crates relating to allocation is bad, but I want to start positively with why I think breaking facde crates further apart in general is good. Writing this with @joshlf's help.

Small, quickly compiled binaries are nice, but more important is the efficiency of the development process. We want a large, diverse, and flexible no-std ecosystem, because ultimately we are targeting a large diverse space of projects. Small libraries are key to this both because they reduce the cognitive load of everyone developing those libraries, and allow more developers to participate.

For cognitive load, it's important both to be utterly ignorant of the code that doesn't matter---as in, don't know what you don't know---and incrementally learn the code that does. --gc-sections or super lazy incremental compilation strip away the noise after the the fact, but do nothing to help the developer reading the code doing the same task. Small libraries help ensure as little dead code as possible, allowing the developer and tools alike to stay happily ignorant of everything else out there. For incremental learning, it's crucial to be able to digest the code one small bite at a time. Rust libraries allow all sorts of intra-crate (inter-module) cycles meaning that given a big huge unknown library, it can be really hard to know where to begin as first-time reader. But fine-grained dependency graphs on crates enforce acyclicity--both a top-down learning approach (start with the nice abstractions) or bottom-up one (complete understanding every step of the way) have clear curricula.

For distributed development and diverse goals, the benefits are probably more obvious. Lighter coupling means less coordination so more people can scratch their own itches in isolation. But it also allows complexity to be managed by splitting the ecosystem into small, modular components - this allows people to take only what they need, and thus only opt-in to as much complexity as their application requires.

All this begs the question---how far do we go down the road of splitting coarse crates into fine crates? I think quite far. It is my hope that as crates behind the facade and the language mature, more crates will be able to be written in stable (or otherwise trustworthy) code, and be moved outside rust-lang/rust into their own repo. Likewise, std should be able to (immediately and transitively) depend on third party crates just fine---the lockfile keeps this safe. Rustbuild, and my RFC #1133 are our friends here. To put it another way, there should be as little magic in rust-lang/rust as possible because magic cannot be replicated outside of rust-lang/rust. By splitting crates, we decrease the risk that a given piece of near-stable code will be bundled with code that cannot be stabilized, thus preventing the former from ever becoming stabilized and thus "de-magicked" in the sense of becoming eligible for being moved outside of rust-lang/rust. This runs counter to the arguments in favor of the collections/alloc merge.

In concrete terms, I recall there being talk of incorporating lazy_static into std, and I know there was talk of using parking_lot for locks too. lazy_static optionally uses spin and so should parking_lot, but spin probably shouldn't be exposed in std because it's a bad idea in hosted environments outside bootstrapping such core abstractions abstractions. Yet it would also weird if spin was included in the main repo as a dependency, but not exposed:I generally like the idea of shrinking, not enlarging the main repo, but would feel especially weird about including non-reexported functionality in the main repo.

Back to allocation in particular, @joshlf's been doing some great work with object allocation (type-parametric allocators that only allocate objects of a particular type and cache initialized objects for performance), and it would be nice to use the default collections with that: the fact is, most collections only need to allocate a few types of objects, and so could work with object allocators just fine. Now if we were to combine alloc and collections and the object allocator traits in one crate, that would be a very unwieldy crate playing not just double but triple duty.

if we wanted to remove everything then we should just move the Alloc trait to core in my opinion, not make a whole new crate for just a trait.

Besides the technical reasons, as @japaric and I mentioned elsewhere, including anything allocation related in core, even something as harmless as a trait that need not be used, will scare away developers of real-time systems.

OTOH, I like the idea of allocator types being usable without Heap. Heap represents "ambient authority " for allocation, to use the capabilities lingo. As a general principle, ambient authority is suspicious, and in particular this complicates safely booting an OS where one doesn't want to allocate accidentally before any allocator is set up.

Also, there's an analogous problematic cycle to avoid when defining the static allocator even if it is initialized statically: For crates which implement global allocators it's incorrect for them to use Heap, yet so long as it's in the same crate as the alloc traits, there's nothing to prevent them from doing so. Technically the language still wouldn't stop them if it's in another crate, but at least they would have to import something suspicious explicitly. This is just like @japaric's concern with allocators accidentally using collections.

CC @eternaleye

joshlf commented 7 years ago

Given @Ericson2314 's comment, we (again, jointly) would like to make a proposal for how all of these various components might be structured.

We have the following goals:

The Alloc trait should have no dependencies other than core. Its dependencies include other items in its own crate such as collections or the Heap.
collections should have no dependencies other than core and the Alloc trait (e.g., it should not depend on the Heap); it should be possible for a no-std program to define its own allocator and use it to back the standard collections.
On the other hand, it is desirable for ergonomics to provide variants of each of the collections that default to using the Heap.

Thus, we propose the following:

The Alloc trait should either be in core or in its own crate (we'll call it alloc here). The former is simpler, but may be off-putting to embedded/real-time developers, and goes against the original philosophy of core - the bare minimum of nice abstractions to define the always-needed parts of the runtime, not every self-contained abstraction under the sun. The latter involves making a tiny crate whose sole purpose is to hold Alloc and related types (although it would become less tiny if an ObjectAlloc trait were added later). We lean towards a separate crate, but either works.
collections should be its own crate with dependencies only on core and alloc (if it exists). It should be no-std.
Heap can either be defined in its own crate with dependencies only on core and alloc (if it exists) or simply be defined in std.
std will re-export all of the collections with a default Alloc parameter of Heap. There may be other default values that std needs to set as well (e.g., the OS-dependent randomness provider for HashMap).

joshlf commented 7 years ago

@sfackler

If the goal is to minimize library code bloat, should we move to libbtree_map, libbtree_set, libvec, libvec_deque, libbinary_heap, etc?

I don't think that the arguments provided suggest that that should happen. Keeping all of the collections together doesn't increase the difficulty of understanding how things work because, for the most part, collections do not depend on one another, and do not share magic under the hood. From a usability perspective, they're logically related, so it would not be surprising to a developer to find them together in the same crate. The arguments we and others have presented do suggest splitting collections into its own thing - separate from, e.g., Alloc - but do not suggest further splitting collections into multiple crates.

sfackler commented 7 years ago

All this begs the question---how far do we go down the road of splitting coarse crates into fine crates? I think quite far.

@joshlf ^ seems to imply to me that it would suggest that level of granularity.

joshlf commented 7 years ago

Ah, we definitely didn't mean to imply that. @Ericson2314 can weigh in when he gets a chance, but speaking for myself, I would interpret that as "quite far within reason." I don't think that our reasoning provides a good argument for splitting collections up into separate crates, although maybe @Ericson2314 will disagree and will think we should even go that far.

Ericson2314 commented 7 years ago

Well I... wouldn't fight that level of granularity if many thers want it :).

Trying to think of a principle of why it's more important to separate alloc from collections than each collection from each other, I arrived at a sort of tick-tock model where one crate (tick) adds some new capability, and the next (tok) builds a bunch of with the capabilities added so far (it's "marginally pure"). Crates like alloc or a kernel bindings (e.g Libc without the other junk), would be be ticks, while collections and the old libsync tocks. I think it's more important that the ticks be minimal than the tocks.

murarth commented 7 years ago

The entire motivation for this merge is to follow it up with this PR, which is seemingly not possible without the collections being literally in the alloc crate. No other collections crate can achieve this.

@brson: Just a minor correction here: The referenced PR doesn't require collections to be within the alloc crate. It only requires Vec to be visible from the alloc crate, in order to implement From<Vec<T>> for Rc<[T]> and Arc<[T]>. Prior to the merge, this was not possible as it would have meant a cyclical dependency.

An out-of-tree collections crate would be able to make the same impls, as it could have both alloc and collections as dependencies.

joshlf commented 7 years ago

@murarth

Just a minor correction here: The referenced PR doesn't require collections to be within the alloc crate. It only requires Vec to be visible from the alloc crate, in order to implement From<Vec<T>> for Rc<[T]> and Arc<[T]>. Prior to the merge, this was not possible as it would have meant a cyclical dependency.

Presumably the visibility requirement exists because Rc and Arc are defined in alloc? If that's the case, would moving them either to their own crate or into collections solve the problem?

murarth commented 7 years ago

@joshlf: Yes, that's correct.

joshlf commented 7 years ago

In that case, that'd be my suggestion - to either make a separate box crate (or similar name) or to move them into collections.

alexcrichton commented 7 years ago

@brson

I do not think depending on linker to throw away unwanted code is a reasonable excuse to make unsatisfactory architectural decisions. If this were an out-of-tree project where people have to spend CPU time compiling code, then telling people to just merge your abstraction layers, compile it all, and throw most of it away later, would not be acceptable.

I don't think I entirely agree with this. The whole premise of this crate is that we're shipping it in binary form so the compilation time doesn't matter too much. We absolutely rely on gc-sections for so many other features I think the ship has long since sailed on making an optional feature of the linker that we invoke.

I think it's also important to keep this all in perspective and extra concrete. On 2017-06-01 liballoc took 0.3s to compile in release mode and libcollections took 3s. Today (2017-07-11) liballoc takes 3.2s to compile in release mode. This is practically nothing compared to crates in the ecosystem.

Yes. The entire motivation for this merge is to follow it up with this PR, which is seemingly not possible without the collections being literally in the alloc crate. No other collections crate can achieve this.

I think this is too quick an interpretation, though. As @murarth pointed out above we're not empowering std collections with extra abilities. Any collection outside std can have these impls.

I don't think this follows. Each layer in the facade adds new capabilities. Alloc is a crucial capability that separates different classes of Rust software.

I think my main point is that we should not automatically go back to what we were doing before. I believe the separation before didn't make sense, and I believe the current separate makes a more sense. If there's a desire to separate the concept of allocation from the default collections that's fine by me, but I don't think we should blindly attempt to preserve what existed previously which wasn't really all that well thought out (the alloc crate looked basically exactly the same as when I first made it ever so long ago)

I'd personally find it quite useful if we stick to concrete suggestions here. The facade is full of subtle intricacies that make seemingly plausible proposals impossible to implement today and only marginally possible in the future.

One alternative is the title of this issue, "Consider reverting the merge of collections into alloc". I have previously stated why I disagree with this.

Another alternative by @joshlf above relies on defining collection types that don't know about Heap. This is not possible today because you can't add default type parameters after-the-fact. This may be possible one day but it is not possible today.

I also further more disagree with rationale that keeps Alloc out of libcore. There's no technical reason that I know of why it can't be in libcore and I'd personally find it a perfect fit for libcore. I can imagine a good number of contexts that want libcore, not libstd, and still use dynamic allocation.

joshlf commented 7 years ago

@alexcrichton

Another alternative by @joshlf above relies on defining collection types that don't know about Heap. This is not possible today because you can't add default type parameters after-the-fact. This may be possible one day but it is not possible today.

I wasn't thinking that you'd add default type parameters after-the-fact, but rather re-export as a newtype. E.g., in collections:

pub struct Vec<T, A: Alloc> { ... }

And then in std:

use collections::vec;
use heap::Heap;

pub type Vec<T, A: Alloc = Heap> = vec::Vec<T, A>;

I also further more disagree with rationale that keeps Alloc out of libcore. There's no technical reason that I know of why it can't be in libcore and I'd personally find it a perfect fit for libcore. I can imagine a good number of contexts that want libcore, not libstd, and still use dynamic allocation.

That's fine - as we mentioned, keeping Alloc out of core is merely a slight preference, and our proposal still works if Alloc is in core.

Ericson2314 commented 7 years ago

The whole premise of this crate is that we're shipping it in binary form so the compilation time doesn't matter too much.

no-std devs will be recompiling it.

3.2s to compile in release mode

How much longer does it take to build a final binary that depends only on alloc not collections? I suspect that will tell a different story?

As @murarth pointed out above

Huh? The issue is Vec, Arc, and RC need to live in the same crate, but that create need not contain the allocator traits. I'd say we do indeed have a problem and while moving all 3 of those to collections is a good step, there's still a problem because anyone else writing there own vec-like thing runs into the same issue.

I think my main point is that we should not automatically go back to what we were doing before.

I'd personally find it quite useful if we stick to concrete suggestions here.

I think there is some consensus beside you that the first step could be making a smaller alloc than before: with no Rc or Arc, and maybe not Box either? Heap and the Alloc traits would stay in alloc, and then as a second step either the traits would move to core, or heap would move to its own crate.

This is not possible today because you can't add default type parameters after-the-fact. This may be possible one day but it is not possible today.

@joshlf lf beat me to using an alias (or if they fails newtype). CC @Gankro cause HashMap and the hasher is 100% analogous.

Ericson2314 commented 7 years ago

I can imagine a good number of contexts that want libcore, not libstd, and still use dynamic allocation.

So can I, but nobody was saying just put it in libstd! libcore <- liballoc <- {libcollections | liballoc-system} <- libstd: I can see any sub-graph (including libcore) being useful.

libcore alone: hard real-time
libcore, liballoc: soft real time---carefully denying ability to do dynamic allocation to time-critical sections.
libcore, liballoc, liballoc-global: Linux kernel module

alexcrichton commented 7 years ago

@joshlf

I wasn't thinking that you'd add default type parameters after-the-fact, but rather re-export as a newtype. E.g., in collections:

Yes I understand, and I'm re-emphasizing that this does not work today. Whether it will ever work is still up in the air. As a result it's not a viable proposal at this time.

@Ericson2314

How much longer does it take to build a final binary that depends only on alloc not collections? I suspect that will tell a different story?

No, I highly doubt it. Everything in collections is generic which is pay-for-what-you-use.

Huh? The issue is Vec, Arc, and RC need to live in the same crate, but that create need not contain the allocator traits.

This is missing the point. @brson originally though that by moving Vec to liballoc we're empowering the Vec type with a capability not available to any other collection. This is not correct, the only problem was that the Box type could not reference the Vec name, due to how we organized the crates. This point has nothing to do with the allocator traits.

there's still a problem because anyone else writing there own vec-like thing runs into the same issue.

Can you articulate precisely what you think this problem is?

I think there is some consensus beside you that the first step could be making a smaller alloc than before: with no Rc or Arc, and maybe not Box either?

I disagree with two things here. I don't think it makes sense to couple Heap and Alloc. The Alloc trait is a generic form of allocation, assuming nothing. The Heap type assumes there is a global allocator available and that's quite significant. The second part I disagree with is a "smaller alloc", which I think should be called core::heap. The core::heap module would not contain the Heap type.

@joshlf lf beat me to using an alias (or if they fails newtype). CC @Gankro cause HashMap and the hasher is 100% analogous.

Again, this is not possible. I was the one that put HashMap in std because it wouldn't fit in collections. We have not solved this problem, this is not a viable solution today. Whether it's a possibility in some future compiler needs to be explicitly accounted for as it means we cannot take action at this time.

whitequark commented 7 years ago

Let's consider this from another point of view. I see the current crate hierarchy as follows:

libcore: no allocation, no I/O
liballoc: infallible allocation, no I/O
libstd: infallible allocation, hosted I/O

As a no-std/embedded developer, I do not see any practical use in having what's currently in liballoc split into any number of crates. It is permeated by infallibility on every level, from Box to BTreeMap (and we don't even have HashMap on no-std in the first place...) If I have to replace parts of it, I'm going to have to start from Box, and go downhill from there. Otherwise, there's no reason to use only alloc but not collections; Vec is as fundamental as Box.

The savings in terms of code size do not exist because the majority of embedded software is compiled with LTO and at least opt-level 1 even for debugging; without LTO, libcore alone would blow through the entire storage budget, and without opt-level 1, the generated code is all of: too large, too slow, and too hard to read at that special moment you have to read assembly listings.

It seems straightforward and obviously good to put the Alloc trait into libcore, since there is no cost for doing so and then it can become a common interface for any allocators people implement.

joshlf commented 7 years ago

It seems straightforward and obviously good to put the Alloc trait into libcore, since there is no cost for doing so and then it can become a common interface for any allocators people implement.

If you, @whitequark, as an embedded developer, don't mind putting the trait in libcore, then I think your opinion overrides mine and @Ericson2314's since we aren't embedded devs :)

whitequark commented 7 years ago

@joshlf Mostly it's that I don't really understand the argument for keeping it out. The argument for having separate libcore and liballoc goes: libcore's only dependency in terms of linking is libcompiler_builtins, and it introduces no global entities, whereas liballoc introduces linking dependencies to the __rust_allocate, etc, symbols, as well as such global entities as an OOM handler. This presents clear practical argument for keeping them separate.

A trait that isn't used has no cost.

Ericson2314 commented 7 years ago

@alexcrichton

Everything in collections is generic which is pay-for-what-you-use.

Right; my apologies; I forgot about that.

Can you articulate precisely what you think this problem is?

I thought it was a coherence issue. If it's a name-reachability issue with Box, that still counts: the ability to be named in the implementation of Box is a special ability only available to types in the same crate as Box or an upstream crate.

We have not solved this problem, this is not a viable solution today.

Ah there is some naming confusing here because @joshlf used a type alias. A wrapper struct is what I consider a newtype, and that would work, right?. Wrapping every inherent method or using a one-off trait is annoying, but works.

Ericson2314 commented 7 years ago

@whitequark

Yes a unused trait is harmless re what code does and what the final binary looks like. But

I want to be able to blacklist liballoc, and then be sure that no library I link is trying to use the trait behind my back (unlikely but....).
I don't expect real-time devs coming from C to immediately grasp the harmlessness of an unused trait.
Finally, I think everything in core should either be a lang item, or be used (immediately or transitively) by a lang item in core----adding other stuff is just making libcore needlessly big even if those items are harmless when ignored.

Ericson2314 commented 7 years ago

@whitequark I hope that teaching collections to work with an infallible lobal allocator will be far algorithmicly easier than teaching them to work with a fallible (global or local) allocator.

alexcrichton commented 7 years ago

@Ericson2314

A wrapper struct is what I consider a newtype, and that would work, right?

In a technical sense, yes. In a more practial sense we've discussed doing this in the past and have always decided against it because then std::vec::Vec and a hypothetical collections::vec::Vec wouldn't unify, bifrucating the ecosystem.

Yes a unused trait is harmless re what code does and what the final binary looks like. But

I personally do not view it useful to belabor this point about balking at the mere presence of allocation. Do you really think that someone would shun Rust immediately because the Alloc trait exists in core but they would not shun Rust because of:

128-bit integers
float support
atomics
utf-8 decoding/parsing
fallible core::cell types
panicking support

Are any of those really that "less offensive" than an allocator trait? If so, why? I don't think you're designing for the right audience.

whitequark commented 7 years ago

@Ericson2314

I don't expect real-time devs coming from C to immediately grasp the harmlessness of an unused trait.

I think you should give more credit to embedded developers. Aggressive dead code elimination is a baseline requirement in this space. And many vendor board support packages do come with some sort of allocator (usually a bad one). It's not like we're frightened by the word "alloc".

I hope that teaching collections to work with an infallible lobal allocator will be far algorithmicly easier than teaching them to work with a fallible (global or local) allocator.

This makes no real difference to embedded development. For simple, non-mission-critcal things using a single common heap is usually fine enough, and for more serious applications infallible allocators aren't cutting it. For that matter, any mature library would have to consider allocation failures, and yes, it means that right now it's nearly impossible to implement a mature Rust library that uses dynamic allocation.

@alexcrichton

Let's go through this list.

float support

Actually, higher-end microcontrollers come with floating-point support; e.g. the Cortex-M4F series. We're using one right now in ionpak to implement a PID controller, and we similarly use soft-float in ARTIQ. There's nothing thorny about libcore's floats, not even the fact that you cannot turn them off; the marginal usefulness of getting a slightly more intelligible error is well offset by all downstream code being infected by #[cfg(float)] or something.
atomics

On small single-core systems the largest supported atomic size is also the largest register size, so atomic memory accesses lower into normal ones. On symmetric or even asymmetric multiprocessor systems, assuming the microarchitecture includes cache coherency and the silicon vendor didn't screw it up (lucky!), atomics are absolutely necessary.
utf-8 decoding/parsing

This is a perfectly reasonable library routine that would be stripped by DCE.
fallible core::cell types

This is actually a serious issue. For example, the soft-CPU we use in ARTIQ doesn't have a debug port; even if it had, the amount of legwork required to get JTAG working with a soft-CPU inside an FPGA is inordinate; and even if I got that working, or used a gdb stub (which I also tried), gdb support for the architecture is broken beyond belief and I could never get it to do anything. There's no lldb support. And I had to patch binutils just to get everything to link, because it resolved a relocation wrong. Good times...

You say "what a horror", for me it's Tuesday. I ended up writing a borrow_mut! macro that embeds the file:line into every site where RefCell::borrow_mut() is called, and relying more on Cell as well as other ways of managing state.
panicking support

There is no problem with panicking per se. It's just that every embedded dev out there permanently sets panic = abort and forgets about it. Even if I was nuts enough to implement stack unwinding for an embedded platform (in fact, I did, but that was for other purposes), it's far too slow and finicky to be used as a mechanism of error recovery. You need caches for the DWARF walking code to become fast, and enabling the unwinder caches makes your timings unpredictable. It's better to let the watchdog reset the entire thing, just to be sure.
128-bit integers

Oh boy this one is bad. It's particularly contentious because right now it means Rust cannot be used on a large number of architectures with less-tested LLVM backends that break on sight of an arithmetic op on i128, and eager codegen for libcore is basically stress-testing the backend. At M-Labs this means we cannot upgrade the Rust compiler used in ARTIQ for the time being, for example.

alexcrichton commented 7 years ago

@whitequark oh I think we're in agreement, what you mentioned above mirrors what I'm thinking. @Ericson2314 seems to be claiming that allocation scares away embedded devs, so I was pointing out a number of other features already in libcore that could hypothetically be construed as scaring away embedded devs. I personally believe that none of those is an impediment to embedded development in Rust.

joshlf commented 7 years ago

OK, so it seems like here's where we are:

We agree that the Alloc trait (and friends) can go in core
We haven't yet agreed on whether collections should have a dependency on Heap

Does that sound accurate?

alexcrichton commented 7 years ago

@joshlf on the first point yes, on the second point I believe we don't have an option. We can't have a type alias which adds a default (not implemented language feature) and having a wrapper struct introduces two types that don't unify. At this point in time I think the only answer is "yes, Vec must mention Heap"

whitequark commented 7 years ago

We haven't yet agreed on whether collections should have a dependency on Heap

Speaking of this, why can't the libcollections collections have an alternative constructor that takes a &Alloc? This gives us infallible local allocators without the backwards compatibility nightmare. Sure it is one more indirection, but is it really that expensive?

joshlf commented 7 years ago

@alexcrichton

We can't have a type alias which adds a default (not implemented language feature) and having a wrapper struct introduces two types that don't unify. At this point in time I think the only answer is "yes, Vec must mention Heap"

I see the reasoning here, but I worry that if we do that, then we are hamstringing ourselves moving forwards. In particular, if collections have a default Alloc parameter of Heap, and then in the future we gain the ability to have the type alias trick actually work, we won't be able to drop the default parameter without breaking code that relies on it.

@whitequark

Speaking of this, why can't the libcollections collections have an alternative constructor that takes a &Alloc? This gives us infallible local allocators without the backwards compatibility nightmare. Sure it is one more indirection, but is it really that expensive?

I'm not sure I follow - what problem is this trying to solve, and how?

whitequark commented 7 years ago

I'm not sure I follow - what problem is this trying to solve, and how?

Nevermind, what I thought of wouldn't work anyway.

Ericson2314 commented 7 years ago

@alexcrichton

In a more practial sense we've discussed doing this in the past and have always decided against it because then std::vec::Vec and a hypothetical collections::vec::Vec wouldn't unify, bifrucating the ecosystem.

Well, if it's a temporary measure (resolved before crates are stable), then I think that's perfectly fine.

Also, I forget about doing in std type Vec<T> = Vec<T, GlobalAlloc>;. It might adversely impact type inference, but otherwise should be fine, right? Not exposing any custom allocator at all to std is not a regression.

We could just wait and hope the chalk work injects some new energy into this, but I rather get an earlier start---working cleaning up the crates behind the facade has long been stalled on things like this.

Do you really think that someone would shun Rust immediately because the Alloc trait exists in core but they would not shun Rust because of: ...

I would like to put those behind feature gates, but fair point Alloc and friends could be behind one in core too. The important thing is a crate can declare which of these it uses, and be held to that standard.

joshlf commented 7 years ago

Also, I forget about doing in std type Vec<T> = Vec<T, GlobalAlloc>;. It might adversely impact type inference, but otherwise should be fine, right? Not exposing any custom allocator at all to std is not a regression.

I totally hadn't thought of that! That works. Also, I don't think it affects type inference. At the very least, this toy example works fine: https://is.gd/2KN7aC

Anyways, given this realization, I put my support firmly in this camp. I think this addresses all of our (my and @Ericson2314's) original concerns, and addresses (or comes close to addressing) @alexcrichton's concerns about re-exporting in std.

I would like to put those behind feature gates, but fair point Alloc and friends could be behind one in core too. The important thing is a crate can declare which of these it uses, and be held to that standard.

Are you suggesting a separate feature gate beyond the one we're already using for the alloc API (allocator_api)?

sfackler commented 7 years ago

It would be pretty bizarre to have to pull in libcollections and explicitly import Vec from there whenever you wanted to switch out the allocator.

whitequark commented 7 years ago

The important thing is a crate can declare which of these it uses, and be held to that standard.

This doesn't seem useful to me. A crate can allocate memory without ever touching liballoc or libstd or even libcore. It could call out to mmap directly or drag in its own allocator, however that works.

Ericson2314 commented 7 years ago

@sfackler it's a stop gap. If we officially give up on default parameters in aliases, then we can reconsider.

alexcrichton commented 7 years ago

I'm sort of losing track of how this discussion is related to this issue. https://github.com/rust-lang/rust/issues/42774 is the tracking issue for dealing with custom allocators in standard collections. This issue is instead about reverting the change to fuse liballoc/libcollections.

Ericson2314 commented 7 years ago

@alexcrichton Sorry for increasing the scope of this, but they do overlap in that before the merge HEAP and collections did not live in the same crate, and yet they still would if we just move Alloc out into core.

To move towards finishing up this issue. I now suggest:

Your idea: move Alloc and friends to libcore. I don't really like it alone as the end result, but I think we all agree it is still better than the status quo.
Moving everything remaining but HEAP out of liballoc into libcollections, so liballoc just deals with global allocation. Even if collections needs HEAP, we don't want collections to have special powers from being in the same crate as HEAP. @japaric likes this I think, @whitequark seems to think it's pointless as the larger issue is Rust is unsafe/unergomic at fallible allocation (I agree that is a big issue), @brson I'm guessing would like it. You I'm guessing wouldn't.
In #42774 decide whether collections need depend on the new liballoc containing just HEAP.
I'll open a new issue for either feature-gating Alloc in libcore or moving it to it's own crate.

@joshlf I meant Cargo features, not language for those parts of libcore which may not be relevant/supported on all platforms. The portability lint might also work.

alexcrichton commented 7 years ago

I don't think your point (2) is actionable with an absence of discussion about point (3), it seems to assume a particular outcome. This is also notably missing feedback from @japaric and @brson

Ericson2314 commented 7 years ago

Right I don't want don't want to put words in their mouths. I'm guessing/hoping based on the rest of the thread.

(2) Doesn't depend on (3)---there would be "libcollections -> liballoc (just with heap) -> libcore (with Alloc trait)" dependencies to begin with, which would only be changed if (3) went through. I still prefer more little libraries, and the deferred-global tricks used implement Heap are exactly the sort of tricks I'd expect to defer the stabilization of collections were we not to do this. Conversely, with the split crates, collections may find itself just using stable features before those tricks are stabilized.

tarcieri commented 7 years ago

I may be biased coming from the no operating system except for a heap perspective of no_std usage, but to me things like Box and Vec are foundational and one of my biggest pain points as a no_std user has been trying to use them from no_std environments. There are crates I want to leverage from the greater Rust ecosystem with features that depend on Box and Vec, and right now that's not easy.

My singular goal getting allocators working in my code was always to unlock Box, Vec, and String (and in some cases things like BTreeMap), and in that regard I thought the original liballoc/libcollections merger made sense. Having a "batteries included" set of heap allocated types that come free with an allocator is much more important to me than being able to craft my own artisanal data structures, although I could see how that might be different for someone writing a kernel.

whitequark commented 7 years ago

I concur with @tarcieri, but add one more issue: IO traits. These depend on String being available, and lack of any way of using it is seriously harming embedded libraries. Right now there are several crates attempting to provide Read/Write, they're incompatible and some of them are rather limited.

tarcieri commented 7 years ago

Yeah, I've also been bitten by IO traits. Also: std::error::Error

whitequark commented 7 years ago

Why on earth is std::error::Error not in core?

tarcieri commented 7 years ago

It used to be, however my understanding is it has some entanglements with Box. There are some comments about it here too:

https://github.com/rust-lang/rust/blob/master/src/libstd/error.rs#L45

We discussed moving it into liballoc at the error-chain meeting.

rust-lang / rust

Consider reverting the merge of collections into alloc #43112