rust-lang / rfcs

RFCs for changes to Rust
https://rust-lang.github.io/rfcs/
Apache License 2.0

Implicit Garbage Collection #3701

Open mrgzi opened 1 month ago

mrgzi commented 1 month ago

Although I know this idea will get downvoted, I still wanted to share my thoughts. After exploring some crates, I don’t feel that using GC<T> provides a seamless experience for garbage collection in Rust.

I believe that garbage collection should work implicitly. My suggestion is to introduce a *let syntax that allows developers to declare variables managed by the garbage collector automatically:

*let val: String = "example".to_string();

This GC feature could be enabled with a simple configuration in Cargo.toml:

[package]
gc = true

For those who prefer to avoid using GC, Cargo could include a warning system to notify users if any crates rely on garbage collection. This way, developers can make an informed choice when adding dependencies.

I don’t think this idea conflicts with Rust's philosophy of "empowering everyone to build reliable and efficient software". But as it stands, Rust isn’t quite for "everyone"—there's still room to make it more approachable for different use cases.

If this could integrate with the existing borrow checker system, it would be great to see it in Rust.

ChayimFriedman2 commented 1 month ago

This request is useless without describing some way to build that GC support.

If you can come up with an idea that doesn't require to annotate GC'd references and is zero-cost, great, you can suggest it. If it's not zero cost, it's a no-go and definitely conflicts with the goals of Rust. If it requires annotation for references, it's easy (and fun fact: Rust used to have GC'd references pre-1.0) but it raises the questions whether this is better for library, whether it is worth the (implementation and language) complexity, etc., and it will likely require extensive discussion before being accepted, if ever.

petar-dambovaliev commented 1 month ago

Although I know this idea will get downvoted, I still wanted to share my thoughts. After exploring some crates, I don’t feel that using GC<T> provides a seamless experience for garbage collection in Rust.

It depends on whose idea of seamless we are talking about.

I believe that garbage collection should work implicitly. My suggestion is to introduce a *let syntax that allows developers to declare variables managed by the garbage collector automatically:

Usually, introducing a feature requires a significant crowd waiting for it, besides a technical justification. I don't see either.

*let val: String = "example".to_string();

This GC feature could be enabled with a simple configuration in Cargo.toml:

[package]
gc = true

For those who prefer to avoid using GC, Cargo could include a warning system to notify users if any crates rely on garbage collection. This way, developers can make an informed choice when adding dependencies.

I don’t think this idea conflicts with Rust's philosophy of "empowering everyone to build reliable and efficient software". But as it stands, Rust isn’t quite for "everyone"—there's still room to make it more approachable for different use cases.

Yes, it actually stands against Rust's philosophy, in my opinion. Rust isn't for everyone, yes. There is no language that is for everyone.

If this could integrate with the existing borrow checker system, it would be great to see it in Rust.

I think this is a lot of work for negative value. Rust takes time to learn and offers certain guarantees as a reward for your investment. There are already good languages with runtimes that have a GC, and they can't offer these guarantees for the same reason. I don't see any point in turning Rust into one of those languages. I think it is safe to say that this will never happen.

mrgzi commented 1 month ago

@ChayimFriedman2

If you can come up with an idea that doesn't require to annotate GC'd references and is zero-cost, great, you can suggest it.

I didn't want to think about this because I still want to think about memory management while coding.

If it's not zero cost, it's a no-go and definitely conflicts with the goals of Rust

I do not understand this. Is Rust zero-cost in every case? For example, if you use Rc, you will have runtime cost. If you define gc = false in the Cargo.toml file, Rust will still be zero-cost, won't it?
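To make the Rc point concrete, here is a minimal sketch using only the standard library. It shows that Rc maintains a reference count that is updated at runtime on every clone and drop. The function name `rc_counts` is made up for this illustration.

```rust
use std::rc::Rc;

/// Returns the strong count observed right after creation, after one
/// clone, and after that clone has been dropped.
fn rc_counts() -> (usize, usize, usize) {
    let first = Rc::new(String::from("example"));
    let initial = Rc::strong_count(&first);

    // Cloning an Rc does not copy the String; it increments a counter
    // stored next to the value on the heap.
    let second = Rc::clone(&first);
    let after_clone = Rc::strong_count(&first);

    // Dropping a handle decrements the counter again.
    drop(second);
    let after_drop = Rc::strong_count(&first);

    (initial, after_clone, after_drop)
}
```

The bookkeeping is paid only by code that uses Rc. Code that sticks to plain ownership and borrows never touches a counter.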

but it raises the questions whether this is better for library

That's why I suggest that Cargo should include a warning system to notify developers if any crates rely on garbage collection. But if you think that crate authors will start to use this and there will be fewer crates that don't use GC, doesn't that also mean that developers like to use it? Also, if a crate uses GC, you can suggest they remove it, similar to requesting the removal of unsafe from code.

@petar-dambovaliev

Usually, introducing a feature requires a significant crowd waiting for it, besides a technical justification.

You are right, this issue was just a general idea in my head, and I just wanted to understand why people don't want this.

Rust isn't for everyone, yes. There is no language that is for everyone.

"A language empowering everyone to build reliable and efficient software." This is the definition of the Rust language on the Rust website. Maybe they should change it to: "A language empowering system engineers to build reliable and efficient software."


In general, I still haven't been convinced why garbage collection with implicit annotation is a bad idea.

ChayimFriedman2 commented 1 month ago

Is Rust zero-cost in every case?

It is a principle of Rust that abstractions should be zero-cost, where zero-cost means, as per Bjarne Stroustrup, "you don't pay for what you don't use (and further, what you do use you couldn't hand-code any better)". Garbage collection, if it degrades the performance of everything else, including non-GC'ed references, does not adhere to this principle.
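A common illustration of the second half of that principle is iterator adapters versus a hand-written loop. The sketch below uses two hypothetical functions that compute the same value both ways. It only checks that they agree. It does not measure performance, although the compiler is generally expected to produce comparable code for both.

```rust
/// Sum of the squares of the even values, written with iterator adapters.
fn sum_even_squares_iter(values: &[u64]) -> u64 {
    values
        .iter()
        .copied()
        .filter(|v| v % 2 == 0)
        .map(|v| v * v)
        .sum()
}

/// The same computation written as an explicit loop.
fn sum_even_squares_loop(values: &[u64]) -> u64 {
    let mut total = 0;
    for &v in values {
        if v % 2 == 0 {
            total += v * v;
        }
    }
    total
}
```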

If you define gc = false in the Cargo.toml file, Rust will still be zero-cost, won't it?

"Disabling" the cost of GC by a configuration option is a moot point IMO. You just fork Rust to have a similar language with GC. So why not really fork it? It also does not scale. Do libraries get a chance to allow GC? If yes, it's not zero-cost, because by including a library you can suddenly make your program (not using GC at all) slower, and furthermore it splits the ecosystem. If not, that means only the top-level application code can use GC, which makes it almost useless.

but it raises the questions whether this is better for library

That's why I suggest that Cargo should include a warning system to notify developers if any crates rely on garbage collection. But if you think that crate authors will start to use this and there will be fewer crates that don't use GC, doesn't that also mean that developers like to use it? Also, if a crate uses GC, you can suggest they remove it, similar to requesting the removal of unsafe from code.

Should've been "than library", sorry. I meant that if you have to annotate every GC reference, it's no better (Edit: It can be better, I just mean you have to explain why) than using a library that provides Gc<> type.

tesuji commented 1 month ago

Perhaps D-lang would meet your expectations. D has a garbage collector and an optional borrow checker.

Changing Rust to be a GC language is a non-starter. Rust is expected to be in the same usability levels as C/C++, from micro-controllers to supercomputers. One would never dare to ask for adding an implicit GC for C/C++.

Could someone close this issue? There's nothing to discuss here.

ChayimFriedman2 commented 1 month ago

@tesuji, please respect the Code of Conduct. In particular,

Respect that people have differences of opinion and that every design or implementation choice carries a trade-off and numerous costs. There is seldom a right answer.

Also, while adding a mandatory GC is indeed out of the question, and an optional "toggleable" GC is likely so as well, adding a GC with annotated references is very much possible, but will require a very strong justification and discussion.

petar-dambovaliev commented 1 month ago

Is Rust zero-cost in every case?

It is a principle of Rust that abstractions should be zero-cost, where zero-cost means, as per Bjarne Stroustrup, "you don't pay for what you don't use (and further, what you do use you couldn't hand-code any better)". Garbage collection, if it degrades the performance of everything else, including non-GC'ed references, does not adhere to this principle.

If you define gc = false in the Cargo.toml file, Rust will still be zero-cost, won't it?

"Disabling" the cost of GC by a configuration option is a moot point IMO. You just fork Rust to have a similar language with GC. So why not really fork it? It also does not scale. Do libraries get a chance to allow GC? If yes, it's not zero-cost, because by including a library you can suddenly make your program (not using GC at all) slower, and furthermore it splits the ecosystem. If not, that means only the top-level application code can use GC, which makes it almost useless.

but it raises the questions whether this is better for library

That's why I suggest that Cargo should include a warning system to notify developers if any crates rely on garbage collection. But if you think that crate authors will start to use this and there will be fewer crates that don't use GC, doesn't that also mean that developers like to use it? Also, if a crate uses GC, you can suggest they remove it, similar to requesting the removal of unsafe from code.

Should've been "than library", sorry. I meant that if you have to annotate every GC reference, it's no better (Edit: It can be better, I just mean you have to explain why) than using a library that provides Gc<> type.

If he doesn't want to use a GC crate, he can just use the Boehm GC, and with 2-3 lines in the main file you get an implicit GC.

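use bdwgc_alloc::Allocator;

// Route every heap allocation in the program through the Boehm collector.
#[global_allocator]
static GLOBAL_ALLOCATOR: Allocator = Allocator;

fn main() {
    // Initialize the collector before anything else allocates.
    unsafe { Allocator::initialize() }
}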

This is easy enough, yes?

You can also set finalizers and all that good stuff.

bjorn3 commented 1 month ago

Boehm GC doesn't handle freeing non-memory resources like open files. It also violates one of the Pin invariants (no pinned memory is reused without dropping it first).

As for the implicit garbage collection that OP wants: how would tracing be implemented? In the presence of collection types implemented using unsafe (e.g. Box, Vec or HashMap), there is no way for rustc to implicitly generate a correct trace implementation. And requiring all types to explicitly implement it would be a huge breaking change and a big ergonomic hit.

Also, how would you handle collecting cycles once you have successfully traced all live objects? Many types depend on all their fields not having been dropped yet when their Drop impl runs, but to collect a cycle you have to break it somehow by dropping a field of a value before dropping the value itself. Many GC implementations handle this by splitting Drop into a finalize phase and a dealloc phase. Finalize has to keep the value in a state which is safe to access by arbitrary code, while dealloc can't access gc'able fields. Doing this in Rust would be an even bigger breaking change than manual tracing. In other words, implicit GC is completely unfeasible within Rust.
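The tracing problem can be sketched with a hypothetical `Trace` trait. Neither `Trace` nor `GcId` exists in std or rustc; both are invented here for illustration. Every type has to report the GC handles it holds. For a container such as Vec, whose elements sit behind a raw pointer, the impl has to be written by hand by someone who knows the layout.

```rust
/// Hypothetical handle to a GC-managed object (just an index here).
#[derive(Clone, Copy, Debug, PartialEq)]
struct GcId(usize);

/// Hypothetical trait: report every GC handle reachable from `self`.
trait Trace {
    fn trace(&self, visit: &mut dyn FnMut(GcId));
}

impl Trace for GcId {
    fn trace(&self, visit: &mut dyn FnMut(GcId)) {
        visit(*self);
    }
}

// Vec keeps its elements behind a raw pointer, so this impl cannot be
// inferred from the struct's fields; it is written manually.
impl<T: Trace> Trace for Vec<T> {
    fn trace(&self, visit: &mut dyn FnMut(GcId)) {
        for item in self {
            item.trace(visit);
        }
    }
}

struct Node {
    parent: Option<GcId>,
    children: Vec<GcId>,
    label: String, // holds no GC handles, so it is not traced
}

// Every user-defined type needs an impl like this one.
impl Trace for Node {
    fn trace(&self, visit: &mut dyn FnMut(GcId)) {
        if let Some(parent) = self.parent {
            visit(parent);
        }
        self.children.trace(visit);
    }
}

/// Collects the handles a node reports, in visiting order.
fn reachable_from(node: &Node) -> Vec<GcId> {
    let mut found = Vec::new();
    node.trace(&mut |id| found.push(id));
    found
}
```

Forgetting a field in one of these impls would let a collector free an object that is still in use, which is why a missing or wrong impl is a soundness problem and not just an inconvenience.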

mrgzi commented 1 month ago

@bjorn3 After considering what people have said in this issue, I started to rethink the idea, especially after @ChayimFriedman2 mentioned that it could split the ecosystem.

Now, though, I'm wondering: would it be possible to create a smart pointer like GC<T>, similar to Rc and Arc, that could be implicitly specified with something like the *let keyword? Is it possible to avoid automatic cleanup and pauses by manually triggering cleanup with a method like GC::clean() instead?
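For a sense of what manually triggered cleanup could look like as a library, here is a toy mark-and-sweep arena. All names (`Heap`, `Handle`, `collect`) are hypothetical and do not correspond to any existing crate. `collect` plays the role of the `GC::clean()` call mentioned above, and the caller has to pass the roots explicitly.

```rust
/// Index handle into the arena; hypothetical, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Handle(usize);

struct Object {
    value: i32,
    edges: Vec<Handle>, // outgoing references to other objects
}

/// A tiny arena with explicit mark-and-sweep collection.
struct Heap {
    slots: Vec<Option<Object>>,
}

impl Heap {
    fn new() -> Self {
        Heap { slots: Vec::new() }
    }

    fn alloc(&mut self, value: i32) -> Handle {
        self.slots.push(Some(Object { value, edges: Vec::new() }));
        Handle(self.slots.len() - 1)
    }

    fn link(&mut self, from: Handle, to: Handle) {
        if let Some(obj) = self.slots[from.0].as_mut() {
            obj.edges.push(to);
        }
    }

    /// Returns the stored value, or None if the object was collected.
    fn get(&self, h: Handle) -> Option<i32> {
        self.slots.get(h.0)?.as_ref().map(|o| o.value)
    }

    /// Frees everything unreachable from `roots`. Nothing is freed until
    /// the caller invokes this. Returns the number of freed objects.
    fn collect(&mut self, roots: &[Handle]) -> usize {
        // Mark phase: walk the graph starting from the roots.
        let mut marked = vec![false; self.slots.len()];
        let mut stack: Vec<Handle> = roots.to_vec();
        while let Some(h) = stack.pop() {
            if marked[h.0] {
                continue;
            }
            marked[h.0] = true;
            if let Some(obj) = &self.slots[h.0] {
                stack.extend(obj.edges.iter().copied());
            }
        }
        // Sweep phase: clear every live slot that was not marked.
        let mut freed = 0;
        for (i, slot) in self.slots.iter_mut().enumerate() {
            if !marked[i] && slot.is_some() {
                *slot = None;
                freed += 1;
            }
        }
        freed
    }
}
```

Slots are never reused in this sketch, so a stale handle yields None instead of pointing at a different object. The sketch also sidesteps the hard parts: the objects hold plain integers with no Drop logic, and the roots are supplied by hand instead of being discovered on the stack.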

Diggsey commented 1 month ago

@bjorn3 I don't think that's the case?

You can use GC_register_finalizer to ensure that both Drop implementations are run before memory is reused, and that any non-memory resources are freed.

bjorn3 commented 1 month ago

You can use GC_register_finalizer to ensure that both Drop implementations are run before memory is reused, and that any non-memory resources are freed.

In the case of cycles that is unsound, as one of the Drop impls will see a field in an invalid state. And a value would be allowed to hold a reference to the stack if, without GC, it would be impossible to deallocate it, e.g. because of an Rc cycle, mem::forget or ManuallyDrop.
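The Rc cycle case can be demonstrated with the standard library alone. In the sketch below (the names `Node`, `DROPS` and `drops_after_scope` are invented for illustration), two nodes that point at each other are never dropped, so their Drop impl never runs. Existing Rust code is allowed to rely on that, which is what a collector that frees such values would break.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many `Node` values have had their `Drop` impl run.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Builds two nodes, optionally linked into a cycle, lets the local
/// handles go out of scope, and returns how many nodes were dropped.
fn drops_after_scope(make_cycle: bool) -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    {
        let a = Rc::new(Node { next: RefCell::new(None) });
        let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
        if make_cycle {
            // a -> b -> a: each node now keeps the other alive.
            *a.next.borrow_mut() = Some(Rc::clone(&b));
        }
    } // both local handles go out of scope here
    DROPS.load(Ordering::SeqCst) - before
}
```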

Also, I just came up with another way a conservative GC can cause UB in Rust: take a Box, call into_raw, do a ptr2int cast and add a large number to the integer. If the GC runs, it will deallocate the Box. However, if you then subtract the large number from the integer, do an int2ptr cast and call Box::from_raw, without GC you get a completely valid Box you can use like normal. With a conservative GC, however, you get a use-after-free.
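The sequence described above can be written out as follows. This is a sketch of the non-GC case only, where the round trip is valid. The offset value and the function name are arbitrary choices for illustration.

```rust
/// Arbitrary offset used to disguise the address.
const OFFSET: usize = 0x1000_0000;

fn hide_and_restore() -> i32 {
    let raw: *mut i32 = Box::into_raw(Box::new(42));

    // ptr2int cast, then disguise the address. After this line the
    // program no longer needs the real address anywhere.
    let hidden: usize = (raw as usize).wrapping_add(OFFSET);

    // A conservative collector scanning for pointer-like words at this
    // point might find none that refer to the allocation and free it.

    // Undo the arithmetic and do the int2ptr cast.
    let restored = hidden.wrapping_sub(OFFSET) as *mut i32;

    // SAFETY: `restored` has the same address as `raw`, which came from
    // `Box::into_raw` and has not been freed (there is no GC here).
    let boxed = unsafe { Box::from_raw(restored) };
    *boxed
}
```

Under the global allocator this returns the stored value. With a conservative collector installed as the allocator, the same code could read freed memory if a collection happened while only `hidden` was live.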

Cr0a3 commented 1 week ago

Although I know this idea will get downvoted, I still wanted to share my thoughts. After exploring some crates, I don’t feel that using GC<T> provides a seamless experience for garbage collection in Rust.

I like the idea