kyren / gc-arena

Incremental garbage collection from safe Rust
Creative Commons Zero v1.0 Universal
438 stars 36 forks source link

add derive bound option #23

Closed dragazo closed 1 year ago

dragazo commented 1 year ago

Closes #21. @kyren from further inspection I saw the trait bounds causing issues were not added by the proc macro when the S param was absent, since synstructure doesn't produce where clauses for structs/enums when everything is known (no generics). To fix this, I took your suggestion and added a bound = "<code>" option to #[collect] that overrides the default bounds added by AddBounds::Fields.

My original problem just uses the S param as a container for type aliases that already implement Collect, so using #[collect(bound = "")] solves my problem and i can successfully build again.

Also, when I first started using gc-arena, I found it a bit hard to get started just because of a lack of detail about how to actually derive Collect (since it needs the additional #[collect] attribute, which I had to look up in source for usage). So I also added some details about how to derive it to the Collect docs entry (including its ability to be used on fields, which I had no idea about prior to delving into this issue but is actually really useful for my library).

kyren commented 1 year ago

Should we have the automatic bounds be AddBounds::Generics instead of AddBounds::Fields to match what things like serde do?

Also, I don't know where to put the documentation for the proc macro Collect but I don't think putting the documentation on the Collect trait itself is correct. bytemuck manages to have separate documentation for the derive proc macros and the actual trait, does it work if we just... put the doc comments above this line? https://github.com/kyren/gc-arena/blob/543113426e52dcde72c297903b8a021c713e7e76/src/gc-arena-derive/src/lib.rs#L226

dragazo commented 1 year ago

I'm not sure why there's the convention of putting trait bounds on generics that aren't used tbh. Like in my case S is not Collect, but that's not a problem since we never store a value of type S, just stuff like S::Key which itself is Collect. So maybe that's a good default (though idk the reasoning people do it), but AddBounds::Generics breaks my code so I'd need another #[collect] option to override that as well. But if that's something we want to do, I can add another option for that - it's better to add it now rather than later so we avoid breaking changes for anyone that uses it like I am.

Sure, I just moved the doc to the collect derive macro. Apparently it has to go right before collect_derive. Added an inter-crate link to it when we mention deriving it in the Collect macro (but all other info is moved to the derive crate).

kyren commented 1 year ago

AddBounds::Generics breaks my code so I'd need another #[collect] option to override that as well

It would be the same override option right? We would only add bounds for generics by default, but #[collect(bound = "")] would override it, so you'd be in the same situation you are now, right? My reasoning for changing to AddBounds::Generics is that the code that synstructure generates for that seems way less complex and less likely to cause the compiler to have fits? Basically I'm suggesting your PR exactly but just change Fields -> Generics. If you have a weird bound that you need to specify beyond the default generic ones, just use the manual bounds? I really don't know what best practices are here.

dragazo commented 1 year ago

Oh, so you mean replace AddBounds::Fields with AddBounds::Generics for the no-override case? Yeah, that should be fine for my use case. Although I just tried that out and it produces an error in the gc-sequence crate, so it seems like that would probably be a breaking change for other crates as well. It looks like there are things like Then<S, F>: Collect, but F doesn't have F: Collect so the generic bound breaks the Then: Collect derive. But technically that's ok if we update semver accordingly (and fix the issue in gc-sequence).

kyren commented 1 year ago

I looked more into the AddBounds::Fields functionality from synstructure, here's the link to the original issue requesting it: https://github.com/mystor/synstructure/issues/10

dtolnay chimes in and says that bounds on field types is almost never what you want, and links another issue in syn giving a list of issues that experimenting with this technique caused in serde 0.7: https://github.com/dtolnay/syn/issues/370#issuecomment-371881224

I believe the current serde heuristic is to add a generic bound on the impl for every type parameter in every field which is actually going to be serialized / deserialized, and then allows you to ofc override the bounds when the heuristic fails. It allows this both for the type as a whole, and interestingly also on each field / variant, replacing any bounds produced by applying the heuristic to that field: https://serde.rs/attr-bound.html

Based on my understanding so far, I still think using AddBounds::Generics is the way to go? It seems like automatically adding bounds for each field is more trouble than it's worth. My vote currently is to change it, fix gc-sequence and move on from there.

dragazo commented 1 year ago

Ok, just switched over to AddBounds::Generics and fixed the issues in gc-sequence. It looks like the generic-only default trait bounds were almost right in every case, except that we needed F: 'static instead of F: Collect since we only need StaticCollect<F>: Collect (equiv to F: 'static). Only more complex cases were things like Sequence<'gc>::Output: Collect, but that only happened in two places. So they all have explicit #[collect(bound = "...")], but it works. Also, I tested new derive on my project and everything works fine, so that's at least one non-trivial crate that works without any changes required.

kyren commented 1 year ago

It's a shame that we need more bounds in so many places like that, but I guess this is still the best option? I played with serde a bit just to make sure I understood how it handles this and it seems to be the same way, but there are a lot of fun heuristics on top of it https://gist.github.com/77f765c79c13afb3899c376d51e2eb77

Did you know that if a type is named literally 'PhantomData' it will not generate a serialize bound on generics automatically? I didn't!

Anyway I still think this is the right move it's just a tad more annoying than I would have hoped. I guess we can always get funky with heuristics in the future.

kyren commented 1 year ago

I think other than the small nit it looks good now, thank you for going through all the back and forth here.

dragazo commented 1 year ago

Oh yeah, any time heuristics have to be used for code generation like this it's gonna get really detailed really quickly lol But as long as we're doing what other popular crates do, it's probably fine. Worst case is users have to explicitly set bounds, but that's not so bad either. The big benefit of the macro is making deriving Collect safe, so I at least would be willing to manually input generic bounds for that safety tradeoff (since even if I get the bounds wrong it's not unsound).

Sure, I just unified those error messages under the new usage_error! macro. I also fixed an issue where multiple modes/bounds could be specified (and it would just keep the last one for each). It now properly errors out in those cases and explains with usage info.

kyren commented 1 year ago

Yeah, the error messages are much better now, thank you! Merging!