PoignardAzur / venial

"A very small syn"
MIT License
192 stars 9 forks source link

Parse `impl` blocks, `type` and `const` declarations #28

Closed Bromeon closed 2 years ago

Bromeon commented 2 years ago

This pull request takes #27 as a base.

Supports parsing of two impl block schemes:

Each block can contain members of three different categories:

Examples:

// Inherent impl
impl<'a, T: Clone, const N: i8> MyStruct<'a, T, N> {}

// Trait impl
impl MyTrait for MyStruct {
    pub type MyType = std::string::String;

    fn new(i: i32, b: bool) -> Self {
        Self {}
    }

    #[attr]
    const fn set_value(&mut self, s: String) {}

    const CONSTANT: i8 = 24 + 7;
}

There are a few open questions on my side:

  1. I plan to also add support for traits. Regarding some of the new API types I added, there's the option to reuse them (forcing users to unwrap options) or to make dedicated, type-safe separate ones for trait and impl contexts. What is your preference? This would also affect the naming (e.g. I'm not happy with TypeDefinition). Examples:

    • Trait: const CONSTANT: i8; Impl: const CONSTANT: i8 = value; either we have 1 type representing both, with optional = and value, or 2 separate types.
    • Trait: type AssocTy: Clone; -- can have bounds, but (at the moment) no default type initializers Impl: type AssocTy = MyType; -- must have type initializers, no bounds
  2. I was thinking to add GenericArg/GenericArgList to represent <'a, T, U, N> -- in addition to GenericParam/GenericParamList for <'a: 'static, T, U: Clone, const N: usize>. Would that make sense? If yes, we could probably remove InlineGenericArgs and directly allow to convert from parameters (unless you say we can't afford that copy).

  3. Is there a reason why the Debug impl for GenericParam does not include the tk_prefix? Might be good to check if it's a lifetime, a const or a type. https://github.com/PoignardAzur/venial/blob/b490dd70226cb2b46d63849154bd06a35ee566d3/src/types.rs#L486-L493

  4. There might be some room to make the naming more consistent (e.g. FunctionParameter vs. GenericParam) as well as other smaller refactorings, but it probably makes more sense to work out a proposal to parse trait first 🙂

Bromeon commented 2 years ago

I realized that the previous structures couldn't represent some syntax; concretely:

So I just used TyExpr. Since that just leaves verbatim tokens, I added a convenience function TyExpr::as_path() that can (partially) parse path::to::Type<other::Arg, 32, 'a>. We don't yet have a type representing generic arguments, so the argument list is left verbatim.

After some feedback I can also clean up the commits a bit.

PoignardAzur commented 2 years ago

I plan to also add support for traits. Regarding some of the new API types I added, there's the option to reuse them (forcing users to unwrap options)

I'd lean towards re-using them.

Note that venial aims to cover anything that is syntactically valid, even if it's not semantically valid. For instance, this compiles:

#[cfg(FALSE)]
const X: i32;

I was thinking to add GenericArg/GenericArgList to represent <'a, T, U, N> -- in addition to GenericParam/GenericParamList for <'a: 'static, T, U: Clone, const N: usize>. Would that make sense? If yes, we could probably remove InlineGenericArgs and directly allow to convert from parameters (unless you say we can't afford that copy).

I guess? I haven't really considered it.

I think we'd want names that differentiate between the two less ambiguously. I like InlineGenericArgs, but I'm open to suggestions.

Is there a reason why the Debug impl for GenericParam does not include the tk_prefix? Might be good to check if it's a lifetime, a const or a type.

I don't remember. Probably not.

Bromeon commented 2 years ago

@PoignardAzur Sorry for the delay. In my initial implementation, a lot of stuff was "half-baked". By doing things a bit more properly, I realized that several things were missing and took the chance to add quite a bit of functionality in the process, while also addressing your feedback 😬 I apologize if this PR got a bit too big.

New Features in this PR:

Also added tests for everything.

I left the commits split, so it's easier for you to follow the changes. I can squash them to only a handful if you prefer.

PoignardAzur commented 2 years ago

Will take a look soon.

PoignardAzur commented 2 years ago

FYI, I'm going to vacation on wednesday (well, kind of, but the gist is I'll have less time for open source) but I'd like to review this before.

Ping me tomorrow if I haven't gone back to you by then.

Bromeon commented 2 years ago

@PoignardAzur the day is not over yet, but here's a ping, to not push it too far into the evening 😉

Bromeon commented 2 years ago

The biggest problems are the incomplete parsers for type and const declarations, but it's always better than no implementation at all, so I wouldn't mind merging it now.

I agree here, but maybe I should highlight missing parts (e.g. as // TODO) so they could potentially be added later (or at least they document the limitations). I can do this for some of the inline code comments, which require bigger changes and would further delay this PR.

Do you want me to give you write access to the repository? Since I don't intend to maintain the project in the coming months, you might want a tighter feedback loop. And while I made a lot of small nitpicks, overall the quality of your contributions seems high enough I don't need to double-check everything.

Thanks a lot for the compliment and the trust, it's appreciated 🙂

Write access might be nice for quicker integration, thanks for offering. Even though, I have to say your feedback in PRs has also been very valuable; I still need to understand some of the conventions and practices of venial. I don't know if you plan on reviewing things after some months, but I guess you could also directly clean up the main branch whenever you see fit (for smaller nitpicks, it's often faster than expressing your thoughts into words, letting me implement them and reviewing them again).

Maybe more generally, would that be more of a hiatus for a couple of months, or are you generally not sure about long-term plans? Are there more people involved or potentially interested in venial's maintenance? With godot-rust and a couple other open-source projects, I could probably implement some things every now and then, but realistically I don't see how I can spend enough time to help review PRs etc. I do see a lot of potential for a lightweight parser though, and know some people who think similarly.

PoignardAzur commented 2 years ago

I agree here, but maybe I should highlight missing parts (e.g. as // TODO) so they could potentially be added later (or at least they document the limitations).

Yeah, that's good practice.

I don't know if you plan on reviewing things after some months,

I still plan on reviewing incoming PRs (there's few enough it's not exactly a drag).

but I guess you could also directly clean up the main branch whenever you see fit (for smaller nitpicks, it's often faster than expressing your thoughts into words, letting me implement them and reviewing them again).

That's the idea. :grin:

Maybe more generally, would that be more of a hiatus for a couple of months, or are you generally not sure about long-term plans?

I don't really have long-term plans for venial. As I've said elsewhere, the initial benchmarks have been somewhat disappointing (it's faster than syn, but not so much that I'd expect mass adoption). I mostly see the project as an informative experiment.

My plan is to make a 1.0 release at some point this year, with an attached post-mortem, and then I'll only fix issues and merge PRs.

Are there more people involved or potentially interested in venial's maintenance? With godot-rust and a couple other open-source projects, I could probably implement some things every now and then, but realistically I don't see how I can spend enough time to help review PRs etc. I do see a lot of potential for a lightweight parser though, and know some people who think similarly.

I'm not trying to dump the project in your lap ^^.

There aren't too many people involved in the project (as you can see from the commit history). If you know people who'd be interested in getting involved, sure, that would be helpful, but I don't really expect it.

Again, I don't plan on archiving the project. If people submit PRs, I'll review them. I just don't plan to have a very active role in the future.

(And honestly, given this was only side project I tried to do in a week to distract myself from Druid, I'm pretty impressed with how far it has come)

Bromeon commented 2 years ago

I'm not trying to dump the project in your lap ^^.

No worries, that's not how I understood it, just wanted to make sure you're not disappointed if my contributions decrease in frequency over time 😉

I don't really have long-term plans for venial. As I've said elsewhere, the initial benchmarks have been somewhat disappointing (it's faster than syn, but not so much that I'd expect mass adoption). I mostly see the project as an informative experiment. [...] (And honestly, given this was only side project I tried to do in a week to distract myself from Druid, I'm pretty impressed with how far it has come)

In godot-rust, syn is a major compile time offender (total compile time 206s fresh, of which syn is 18%):

grafik

Now I won't rewrite that all to venial, so it's hard to tell how much difference there really is. But what I am going to try is use venial for the next version of godot-rust (GDExtension binding) and see how far I can push it. I'm a huge fan of lightweight libraries -- could be that I won't reach my goals in the end, but I think venial deserves to get a chance in a larger project 🚀

syn is an extremely powerful and versatile library, but two things bug me in particular:

  1. Libraries tend to be overly liberal in using it and enabling many crate features, so a dependency tree often has almost all features enabled. This in turn means that even libraries that only use the very basic syn configuration pay for the complex parsing.
  2. Most of the times, what's needed is parsing "headers" (attributes, struct fields, function signatures, etc) and not function bodies or expressions. syn however parses everything (see above point). I can imagine that the extra parsing time and all the intermediate allocations to store symbols can add up quite a bit over many crates.

A syn 2.0 could possibly address 1) by modularizing the crates, and 2) by providing lazy evaluation for rarely used items. But this is a mostly speculative idea without benchmarking. venial is a good trade-off here.

Bromeon commented 2 years ago

I addressed your feedback:

The TyDefinition I could extend in another PR, which would add support for traits.

Another thought -- once venial supports a lot more cases than just basic structs and some functions, should we advertise it a bit (Twitter, Reddit etc.)? Maybe after a 0.5 release or so...

PoignardAzur commented 2 years ago

Another thought -- once venial supports a lot more cases than just basic structs and some functions, should we advertise it a bit (Twitter, Reddit etc.)? Maybe after a 0.5 release or so...

Yeah, that's definitely the plan!

Bromeon commented 2 years ago

Do you think we can merge this PR as it is now? 🙂

PoignardAzur commented 2 years ago

Done.

PoignardAzur commented 2 years ago

That comment you added makes me think it might be worth adding an ARCHITECTURE.md file to the project.

Bromeon commented 2 years ago

That sounds like a good idea! I haven't seen the ARCHITECTURE.md convention before, maybe CONTRIBUTING.md is more common as that's the file that contributors will likely read second (after the ReadMe)? Or a link to that file would also work.

PoignardAzur commented 2 years ago

I haven't seen it in the wild too much (or at all?), I'm referring to the kind of documentation described in this post: https://matklad.github.io/2021/02/06/ARCHITECTURE.md.html

I see it as being a bit different from CONTRIBUTING.md (one is more of a checklist, the other is a high-level view of the project) but it's not that important.