kyren / gc-arena

Incremental garbage collection from safe Rust
Creative Commons Zero v1.0 Universal
438 stars 36 forks source link

generic arenas #26

Closed dragazo closed 1 year ago

dragazo commented 1 year ago

I ran into an issue where a generic function needs to take a type T as input and internally use an arena based on T. But rust doesn't allow items to reference type parameters of the parent scope, so (I believe) this is impossible currently. Easy solution: allow arenas to be generic. This just adds a little syntax to the make_arena! macro that lets you introduce and use generic parameters by just dumping the entire contents of a parenthesis-wrapped token tree into generic param brackets. So we can now do things like this:

#[derive(Collect)]
#[collect(no_drop)]
struct Test<'gc, T: 'gc + Collect>(Gc<'gc, T>);
make_arena!(TestArena(T), Test(T)); // introduce and then use a type param T

fn test<T: 'static + Collect + Copy>(v: T) -> T {
    let arena = TestArena::new(Default::default(), |mc| Test(Gc::allocate(mc, v)));
    arena.mutate(|_, arena| *arena.0)
}

assert_eq!(test(34), 34);
assert_eq!(test(false), false);

The reason for taking token trees like ( $($t:tt)* ) rather than a comma-separated ident list like < $($t:ident),* > was for two reasons: 1) it allows the left (introducing) side to have const generics like Foo(const N: usize) and 2) it allows the right to have non-trivial type expressions like Bar(<T as Thingy>::Output). It also makes it really easy to parse with macro_rules! since parenthesis mark the boundaries of token trees. There's probably some token munching magic we could do to allow angle brackets instead of parenthesis for this, but I don't think it's worth the effort in this case as long as we document it well.

This also renames the 'a lifetime in the dyn trait decl to 'gc to match existing standards since it is exposed to the root type generic parameters: e.g., you can write something like make_arena!(Foo(T), Bar(Gc<'gc, T>).

I also removed the macro overload without $vis:vis because apparently :vis matches empty string and in that case is equivalent to pub(self) anyway.

kyren commented 1 year ago

Yeah I'm not crazy about the (T) syntax but I completely understand why, and I think this is an important thing to allow.

Is there a "standard" for the format here? Do you happen to know of another crate that has to do roughly the same thing? If we're copying an existing technique I'll feel much better about picking a syntax. I know it's a pretty minor thing..

kyren commented 1 year ago

I also removed the macro overload without $vis:vis because apparently :vis matches empty string and in that case is equivalent to pub(self) anyway.

TIL!

dragazo commented 1 year ago

Ok, well in that case I decided to take the time and write a proper tt muncher for it, so now we use angle brackets instead like normal generics. It's probably more prone to edge case issues. For instance, I didn't build in support for writing trait bounds on the introduced generics, but generic type aliases don't allow those anyway - we'll just need to document that fact. I'll add some doc info about it after we settle on any other style changes. One I was considering was making it more obvious this was a special type alias (which have different generic behaviors) by doing something like make_arena!(pub type MyArena<...> = MyRoot<...>) or similar (as an alternative syntax and deprecating the old one).

dragazo commented 1 year ago

Another thing to consider is whether we want to continue doing the auto-insertion of 'gc in the root type. I think we should keep that for backwards compatability if the root type doesn't have an explicit generic param list, but we could make it be explicit otherwise. So make_arena!(MyArena, MyRoot) and make_arena!(MyArena, MyRoot<'gc>) would be equivalent. That would just clear things up in the case of actually needing to pass params, cause at first glance make_arena!(MyArena<T>, MyRoot<T>) could be confusing since MyRoot is actually defined as MyRoot<'gc, T>. If we want to do that, it would be a trivial change to the tt muncher.

kyren commented 1 year ago

Oh shoot, thank you for doing all that work writing the muncher but I actually think I like the simpler solution more :sob: I should have been more clear before! I'd MUCH rather have something simpler that works, I just wanted to know if there was any standard practice because I'm just really unfamiliar with it. I'm sorry to make you do a bunch of work without explaining it better!

I want to merge this because I think having this ability is important, but I need to take a step back and actually think about whether this macro needs to exist at all. I originally wanted to replace the macro with a trait and GATs, but @Aaron1011, who knows a lot more about this than me (they are a rustc dev), said that it was more compatible to avoid doing this and use a different technique.

There's another PR I want to merge too, #28, which also uses a lifetime transmute on the root type which is what was pointed out is not technically allowed in the current version of the compiler[1], if I'm understanding this right (I'm not, it doesn't apply to references / pointers ofc, I was being stupid). I want to revisit this decision for a couple of reasons, I might not even mind relying on technically disallowed behavior or behavior only defined in nightly compiler depending on what is actually required. I don't want to do these things, but I have something I'd like to do soon that might require.. something like this, I'm not sure yet. Getting rid of this macro or simplifying it might be an important part of it. Let me explain...


Okay, this is a bunch more context for these decisions, this is not as important, feel free to skip, I just want to put this down in writing somewhere while I work on it...

I originally was ready to hand maintenance of this crate over to @Aaron1011 or anyone else who more actively used it, because I haven't used this crate in ages. I still think I should share maintenance and I'm glad I gave them maintenance privileges on the crate so it doesn't have a bus factor of 1, and I would even be willing to bring on more maintainers. Tbh I'm not good at open source stuff at all, so any help is appreciated. I'm a game dev by trade, I'm not even used to PR based development and code reviews really! And speaking of that...

In the last few days I had what might be a Very Good Idea:tm: about how to rework the gc-sequence crate and I want to do a deep dive on it this weekend. I don't know what all will be involved or if I will run into an unsolvable problem, but I think I might have a way to use async functions with await as gc points in a way that actually works today. This idea is not new.. I might have been the first person to think of it but I don't know for sure, but I thought of it way back in late 2018 / early 2019 or something and just sat on the idea for 3+ years because I couldn't manage to make it work.

I stopped updating the luster crate because it was so much of a nightmare to try to make work with combinators, so I think this might actually be a very good thing to explore. It's SO difficult to work with combinators that it made me rethink the entire "branded lifetime" gc technique and is part of the original reason I just lost interest. (Also I started a new game project and had other things going on in my life at the time and there were other reasons, but this was a big part of it.)

I think I might have a way to make all this work, but it will require making many different types root-able, and it will also maybe require having an actual Root trait.. I think it will in fact, so my plans for what to do here matter for how the make_arena macro works or even if it's necessary at all.

My idea for how to make this work is to have a linked list chain of Root-able structs that are basically the reified "stack" of an async function, and I think I can design the API to use this to not be completely insane. If I can make this work I actually might pick luster back up, because I kinda even want to use it in my current project (at least as an option).

I'm gonna try to do this in a branch but it might involve.. I dunno what it will involve, probably taking the gc-sequence crate entirely apart and rebuilding it, and maybe at least parts of the gc-arena crate. This is what I mean when I say "I'm not always so good at open source".. I can't even really figure out how to split this into individual PRs yet, if I get it to work I'm sure I can manage to make it piecemeal.


[1] https://github.com/rust-lang/rust/pull/101520#issuecomment-1252016235

Apparently transmuting lifetimes of types other than simple pointers / references is not technically allowed by the compiler until the current nightly: 1.67, but 1) it worked in this crate before now, and 2) it will work in this crate officially as of 1.67 anyway.

Originally when I wrote this, I actually looked into it and I could have sworn this was already at least tacitly considered to be the way forward, that lifetime parameters would have no effect on codegen, but apparently it was less settled than I had imagined. IF a Root trait is required for the idea I have AND there's not a way to express the Root trait using the dyn trick that @Aaron1011 used, I at least am comfortable relying on this behavior, especially if the Root trait requires or is made vastly simpler by GATs. Personally I'm vastly more concerned about simplicity than I am about compatibility, but I understand that is a tradeoff.

Obviously I am open to discussion about all of this in any case.

kyren commented 1 year ago

I think if you just go back to the very simple syntax, even if it's not ideal, I would be okay with merging this. That way no matter what ends up happening with the experiment I want to try, there will at least be a way to support generics. If everybody hates the simple syntax we can try the more complex tt muncher strategy but I am not thrilled about needing to do something like that, I'd much rather have syntax that I don't love than the additional complication to make a slightly better syntax work, I can live with weird syntax.

dragazo commented 1 year ago

I just reverted back to the parenthesis notation. The more complex one is left in the commit history for reference in case we ever want something like that down the line. The only other issue is whether or not to auto-insert the first 'gc generic param when explicit root type params are specified. Either way is easy (compared to the token muncher), but this way (auto-inserting it regardless) is simplest cause explicit would need one extra (but trivial) macro overload.

kyren commented 1 year ago

I am the literal worst! I was planning on letting you know that there's a PR that is almost ready to merge that supersedes this, #32, and I didn't actually do that when I should have.

The new syntax for the macro is actually vastly simpler and avoids the problem entirely: https://github.com/kyren/gc-arena/blob/39f75f8b5c1ec8c7ed1453e1acb4d8b39ac109b3/src/gc-arena/src/arena.rs#L109

I'm sorry for giving you the runaround, it was not my intention :sob:

BUT if you wanna go take a look at #32 I'm interested in your opinion, particularly anything that might make using the API easier.

If for any reason #32 gets delayed or there are any problems, I will definitely merge this PR, otherwise I'll merge #32 and close this one.

kyren commented 1 year ago

Okay, this should be resolved by #32 now.