gui-cs / Terminal.Gui

Cross Platform Terminal UI toolkit for .NET
MIT License

Create source generator to pimp our enums #3443

Open dodexahedron opened 2 weeks ago

dodexahedron commented 2 weeks ago

Prologue

So... As I've griped about before, enums in c# and dotnet in general just suck.

Unfortunately, they're what we have, though, in the language/SDK, and creating something with the same ease of definition, ease of use, and the same behavior is not a simple task and has plenty of potential pitfalls - not to mention it takes a ton of time and effort to do.

[Roslyn enters stage right]

Roslyn: Hi, folks! I'm here to solve all your problems and replace them with different ones!

[End scene]

So yeah...

EpiPrologue

We have plenty of enums in the project code base.

And don't get me wrong - That's not bad or wrong on the part of anyone who has made or used one, since there aren't many alternatives and the ones that do exist aren't particularly well-known (BitVector32, for example), and aren't drop-in replacements, due to other limitations or just realities of the language and SDK.

For enums as a whole, I'm not going to re-hash all the bullets I've mentioned elsewhere about what is unfortunate and insidious about them, except to say that a better way certainly is possible, especially in the areas of run-time performance and design-time quality of life.

The first generator I put in, in #3438, is a step in that direction, specifically for performance, and there are additional things I will likely still add to it, as well, unless it turns out that this obsoletes that, which it very well may do, ultimately, depending on how much I can make this one do without too much extra work.

But, Flags enums.....

Non-critical section - Expand to see complaints about enums

Flags enums, in particular, are something I very much want to make better in a lot of ways, due to runtime costs (at least without really cumbersome means of avoiding them), as well as other unfortunate stuff like:

  • Extension methods are common, and often for similar or the same purposes, but are also often inconsistent, and are not easily discoverable because they live in separate, sometimes multi-purpose, classes.
  • Conditional constructs involving Flags enums are super cumbersome, which often leads to extension methods being written to make the code easier to deal with. Those end up making their already bad performance even worse in many cases, usually by at least 2X (we're still talking about fast stuff though - but it all adds up).
  • Explicit values involving the MSB being set to 1 are really unintuitive unless the enum is unsigned...which then leads to them being defined as unsigned. And that's not bad, per se, but it does lead to inconsistency, which can be annoying in some cases and require work-arounds or result in deceptively expensive run-time behavior.
  • Certain rules and guidelines around design and use of enums - especially Flags enums - are not always appropriately followed, such as the definition of the default (0) value and what it means in use.
  • They're not actually constants in most situations.
  • Flag combinations are easy to mess up and hard to read, and often not in any particular order.
  • While extension methods make them better to use, it's still quite a bit of effort to do nice things, like fluent API implementations (such as all that work @tig did with Key-related stuff to make working with that so much better than it used to be), and that and other things often result in multiple types that each do some of what we want but aren't totally fungible, without a heap of work that is unreasonable to ask of people.
  • Since enums can't have methods (or conversion operators) defined on them, they can't be cast to or from types that we do not have direct control over, nor (at least directly) to other enum types. That also ends up leading to methods to deal with those conversions, be they private instance methods in a class that needs them or static/extension methods somewhere.
  • Those issues certainly can all be solved, if someone puts in an inordinate amount of effort to design an entire type and accompanying tests and documentation that not only does what is needed, but is robust, efficient, consistent with other types, and still actually as easy to use and maintain as an enum.

    So, I want to basically do that, but do it once in a source generator.
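To make a few of the complaints above concrete, here is a minimal sketch. The `Sides` enum and the `HasAll` helper are hypothetical names made up for illustration; they are not Terminal.Gui types. It shows three spellings of the same flag test (raw operators, `Enum.HasFlag`, and a hand-rolled extension method living in a separate class) and why a flag in the most significant bit tends to push the enum to an unsigned underlying type.

```csharp
using System;

// Hypothetical flags enum, for illustration only (not a Terminal.Gui type).
[Flags]
public enum Sides : uint
{
    None   = 0,
    Left   = 1u << 0,
    Right  = 1u << 1,
    Top    = 1u << 2,
    // The most significant bit is only straightforward to express because the
    // underlying type is uint; with the default int, `1 << 31` is negative.
    Marker = 1u << 31,
}

public static class SidesExtensions
{
    // A typical hand-written helper: readable at the call site, but it lives in
    // a separate class and is easy to duplicate inconsistently across a code base.
    public static bool HasAll(this Sides value, Sides flags) => (value & flags) == flags;
}

public static class FlagsDemo
{
    public static void Run()
    {
        Sides s = Sides.Left | Sides.Top;

        // Three ways to ask the same question.
        bool viaOperators = (s & Sides.Top) == Sides.Top;
        bool viaHasFlag   = s.HasFlag(Sides.Top);
        bool viaExtension = s.HasAll(Sides.Top);
        Console.WriteLine($"{viaOperators} {viaHasFlag} {viaExtension}");

        // With a signed underlying type, the top bit is a negative number (int.MinValue).
        Console.WriteLine(1 << 31);
        Console.WriteLine((uint)Sides.Marker);
    }
}
```

Calling `FlagsDemo.Run()` prints the three (identical) test results and the signed and unsigned views of the top bit. Nothing here measures performance; it only illustrates the readability and consistency points.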

    UX design goals/intentions

    The basic design goals I have, for the developer experience, are simple:

    TG Build/Design-Time Technical requirements / expectations / limits (rough spec)

    This is a list of the environment I'm writing this with the expectation of a user or system who is building/developing Terminal.Gui itself to have. These are likely to significantly loosen by the time I'm done, but shouldn't matter anyway, especially for a TG dev, and I know for a fact that the three of us meet these requirements.

    High-level use/behavior design goals (rough spec)

    How do I envision the generator being told what to do, where to do it, and how?

    By making it use actual enums as its input.

    The design I have in mind, basically has the following workflow/behavior (remember - I want it to be basically seamless and 0-effort on the developer's part, without you having to remember to use it):

    So, in short, you literally would have to do zero extra work to cause these new types to be created for Flags enums, and those types would be immediately available for use even for existing code the first time you load the solution with the new analyzers/generator implemented.

    And, since it's all source gen, it'll be trim-friendly and not result in additional dependencies for Terminal.Gui consumers.
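As a rough idea of what "an enum goes in, a richer type comes out" could look like, here is a hand-written sketch. Everything in it is an assumption for illustration: the `BorderSides` enum, the `BorderSidesFlags` name, and the member set are hypothetical and are not the actual generator output or spec. It only shows the general shape: a partial value type wrapping the enum's underlying value, with conversions to and from the source enum so existing code keeps working.

```csharp
using System;

// Hypothetical source enum: the only thing the developer would write.
[Flags]
public enum BorderSides : uint { None = 0, Left = 1, Right = 2, Top = 4, Bottom = 8 }

// Hypothetical shape of a generated companion type (hand-written here).
// It is partial, so a developer could add members in another file.
public readonly partial record struct BorderSidesFlags
{
    private readonly uint _value;

    private BorderSidesFlags(uint value) => _value = value;

    // Conversions to and from the source enum keep existing call sites working.
    public static implicit operator BorderSidesFlags(BorderSides value) => new((uint)value);
    public static implicit operator BorderSides(BorderSidesFlags value) => (BorderSides)value._value;
    public static explicit operator uint(BorderSidesFlags value) => value._value;

    public static BorderSidesFlags operator |(BorderSidesFlags left, BorderSidesFlags right)
        => new(left._value | right._value);

    public static BorderSidesFlags operator &(BorderSidesFlags left, BorderSidesFlags right)
        => new(left._value & right._value);

    // Discoverable instance methods instead of scattered extension methods.
    public bool HasAll(BorderSidesFlags flags) => (_value & flags._value) == flags._value;
    public bool HasAny(BorderSidesFlags flags) => (_value & flags._value) != 0;
}
```

With this in place, `BorderSidesFlags f = BorderSides.Left | BorderSides.Top;` compiles through the implicit conversion, and `f.HasAll(BorderSides.Top)` reads naturally at the call site.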

    Structure/Capabilities/Design of generated code (rough spec)

    This part is big, because it's a combination of just a brain dump of my current ideas for implementation as well as intended to be at least a partial spec for the actual generator.

    Lame humor, notes, and general intro to the big list ahead

    Go take a bathroom break (or I guess read this on your phone? You do you, yo), get a drink, take a nap, or otherwise get comfy however you get comfy, before proceeding. 😝

    So...
    What do I plan for the generated code to do/be/have? Here's at least a partial list.
    And remember, this is for ***Flags*** enums, at the moment, which is very relevant to several of the items in the list, even if not stated in-line with them...

    The top-level bullets are mostly broader concepts of how I envision it working, with sub-lists being excessive detail, explanations of my thought process/reasoning behind them, gripes about enums, justifications, or additional specs, as unnecessary.

    There are of course finer and further details that are open questions right now, and most of the above is also open for comment, suggestion, debate, feedback, etc., and such is welcome and encouraged.

    Epilogue[^OMGHeWontShutUp]

    So yeah... That's what I'm thinking.[^MoreLameHumor]

    Where are they now?[^DudeJustStopAlready]

    I am currently in the design phase for a prototype struct to use as my basis for the generated code.

    It is supposed to have a significant portion of the above partial spec (and currently does, though most of it is just NotImplementedExceptions, so I can compile and test things against the general structure).

    That currently lives in a clean project that isn't part of our repo or solution, to ensure I am starting from a place of zero dependencies and to have a fully-working type as a model for comparison for the generator. Once I'm ready to start making the actual analyzers and generator, I'll stick that code in both the Terminal.Gui.Analyzers.Internal.Debugging and Terminal.Gui.Analyzers.Internal.Tests projects.

    Ok, actually done now.

    Time to do a little work and stop hurting your eyeballs.[^SheeshItsAboutTime]

    [^AssignmentOnly]: Forms such as this are only relevant to and only possible in assignment situations, and can avoid an additional cast. T1 may include TEnum, TBacking, int, and/or uint.
    [^FoShoTypeArg]: A concrete type argument means all potential forms will always have this type in this position.
    [^TBackingDef]: TBacking means the "underlying type" (official terminology) of the enum. That means int, by default, but could also be uint, if specified. Other types aren't planned to be supported at least initially.
    [^TSelfDef]: TSelf means the generated type.
    [^TEnumDef]: TEnum means the source enum for the generated type.
    [^AllTypeArgs]: For these interfaces, T1 and T2 mean, in both orders (T1,T2 and T2,T1), TSelf as well as, where appropriate and feasible, possibly others like TEnum, TBacking, int, and/or uint for example.
    [^IUtf8Note]: These IUtf8* interfaces bring a bunch of other interfaces with them, so those will happen as well.
    [^TooMuch]: I might create a simple analyzer to check for a small set of obvious cases of bad values being assigned. There are basically infinite possibilities there, though, so I probably won't bother and will just leave it up to exceptions for a developer to fix their mistake in debugging, since that's already better than an enum or, if I do it, it'll probably only cover things like direct assignments of compile-time constants or something like that.
    [^MoreLameHumor]: Bet you didn't think I could write something that short, huh? Wait... Does this ruin that?
    [^RecordEquality]: Note that record types implicitly implement IEqualityOperators<TSelf,TSelf,bool>, and those operators cannot be overridden, so that will be there anyway and will be as defined by the language (value equality of all fields).
    [^SheeshItsAboutTime]: Yeah, that ruined it... Sorry 😅
    [^BigEndianIsCompensating]: Most modern machines are little-endian, anyway, but some ARM variants can be configured as big-endian, and I just don't promise, at this time, that the generators or generated code will be cool with different byte orders. Terminal.Gui already doesn't do that anyway and nobody has complained. I'm actually not even sure if .net 8 can even be installed on big-endian machines any more, unless they can operate in little-endian compatibility modes.
    [^DudeJustStopAlready]: Maybe this should have been Epilogue.
    [^OMGHeWontShutUp]: ProEpilogue? 🤔

    dodexahedron commented 2 weeks ago

    Also, for the record, I'm actually writing the core implementation of this anyway, for work projects, so I'm spending that effort no matter what. May as well let us benefit from it here, too (plus, I use TG as well, so it's at least triple-dipping and dogfooding 😅).

    Point is, though, that it is my highest priority here, right now, not only because I already felt the need for it, but because it's also a priority for me outside of this project. 😃

    tig commented 2 weeks ago

    Do you know of another OSS project using a similar technique to success?

    I love all of this, but reading the above, it feels more like an experiment than a tried-tested methodology. I'm leery of TG being a playground for fundamental science experiments.

    dodexahedron commented 2 weeks ago

    First of all, let me say it is great that you voice these kinds of concerns.
    Please, as always, my first and most important request is for free and open communication, and I'm REALLY hard to offend. XD

    But fear not! I've written another bedtime novel below!

    For your first question, short answer is: "OMG very yes and have you seen---." whoops. Not short. Ok, I'll just do long like you know I will anyway....

    There are some pretty decent ones from reputable people, yes. Plenty of non-free licensed and/or paid ones, as well, as people bank what they can before everyone finally starts doing it in every application of any consequence.

    I actually originally wanted to use some of the stuff from this, which is one of many high-quality projects from a pretty prolific MVP (with a great technical blog, BTW), and I actually do make use of some of his stuff in the generator project already.

    The enum functionality, though, didn't do as much as I wanted, and also added some build dependencies that I wanted to keep nice and clean.

    Some of the conceptual basis for some of the functionality is or was learned or inspired by some of Gérald's articles and/or projects, as well (also MIT licensed, so we're all compatible), and I actually switched to his polyfill library just a day or two ago, which allows me to ditch the polyfills I had been manually writing as the needs arose. It's also a generator and therefore not a runtime dependency, which is beautiful and saved me some boring and annoying work.

    At a much higher level, though, and for some history: Roslyn has been around for well over a decade (it's how most language features are implemented nowadays, if they don't need new binary behaviors). Source generators themselves went through a public preview before shipping with C# 9 and .NET 5, and the newer incremental generator API followed in .NET 6. But Roslyn is THE compiler for .NET, and Visual Studio as you know it wouldn't be nearly as powerful as it is, in so many things we take for granted, without it.

    I've been writing analyzers and generators for various things for a long time, as well, including during the preview period. On top of that, again, it's THE compiler you've been using for quite some time (and is all on github, BTW). So, personally, I'm more than comfortable using it and find it to be one of the most powerful force multipliers I have at my disposal, for software development.

    If you understand the language and you even 10% understand visual studio and msbuild, you've got all you need, aside from the usual learning specifics of an API and all that.

    But yeah, even have a look on NuGet or GitHub if you like. Search for Generator or Roslyn. Tons of stuff out there, in the usual range of "wow that's amazing" to "how is this person allowed to use a computer?"

    Specifically for this particular generator, there's not really anything new, per se, that isn't or wouldn't otherwise be under the general umbrella of "business as usual" for software development, for me. You know - stuff like hunting up the APIs and documentation you need, applying them, writing code, debugging that code, rinse, repeat. 🤷‍♂️

    But it is something new to this specific project/organization, so I'm being extra verbose and specific, with the intent of providing as much of a 1-stop shop for others who may not have experience with it to pick it up and run with it without having to do all the research and whatnot I've done over the years. Have you seen the sheer volume of comments in the first generator, for example? I basically turned one of the core methods in it into an article all to itself, explaining what and why stuff is happening, all the way down to at least one spot where there's even mid-expression comments in a method chain. 😆

    As for the verbosity of this issue post, that's also mostly for everyone else's benefit, but also trying to inject a bit of a formal spec into the process, because I've been writing software for 30 years and I long ago came to value explicit clarity and potential redundancy over brevity in that sort of thing for soooo many reasons.

    But, also, it's meant as a showcase of the value and power that this sort of approach can have, when we wield the tools to more of their potential.

    Plus it's a written log of thoughts and things that have been considered and whatnot.

    Most importantly, it's intended to be the basis for further design talks, brainstorming, etc., even not necessarily related directly to this specific component, because there are plenty of other places we could benefit from (and that goes for pretty much every project out there).

    But also, realizing that not everyone has experience with the nuts and bolts of the generators themselves, I'm putting some extra effort into making the first few big ones be as simple to use (and therefore also more likely TO be used) for everyone, without them ever having to touch Roslyn directly, if they don't want to, but still reaping the benefits (just like installing a plugin in VS or a nuget package).

    But yeah... Roslyn is mature. .Net since core 3 wouldn't be what it is without it.

    The flow of things in the API is really the "hardest" thing to get a handle on, IMO, but the actual concepts aren't any more difficult to grok than the language itself, because you're just writing code to...well...write code! :)

    But I'm also, as I pointed out a few times in the big spec post, not designing the generators for a first year CS student who never wrote a line of code in their life. It's designed for us, first and foremost, so it's going to hold your hand in ways I may assume are helpful, but isn't going to baby you on things that are basic enough to be unacceptable in production code anyway and which unit tests can trivially catch long before that.

    In (kinda/relative to above) short:

    dodexahedron commented 2 weeks ago

    Or, for one of the central points, but as the "kids these days" might say it...

    #NoFilter #LegitLitBasedNoCap #ITotallyShipRoslyn #SlayItGirl

    dodexahedron commented 2 weeks ago

    Also, as a simple matter of personal and professional ethics, I wouldn't perform or make any experimental or otherwise dangerous acts or contributions to any project, on their turf or my own, without making it unambiguously clear at the very minimum, and ensuring acknowledgement of it.

    I've stuck to that my entire career, always have, always will, and wouldn't have it any other way and one of my biggest peeves is unethical behavior in any context. You can put a stamp on that. 🙂

    After all... I even sign my commits. XD

    dodexahedron commented 2 weeks ago

    Here's a bit of a peek inside what motivations I have, aside from the simple utilitarian fact that I use and like Terminal.Gui, and that I like to do nerdy stuff:

    So, playing devil's advocate, a bit, and being really broad with the meaning of "experimental," I suppose a loose argument could be made that the sheer level of simplicity I'm intending to create this particular generator with is more than I usually would do, maybe.

    But, that's entirely because 1) I know exactly who and what will consume stuff I write for internal consumption, and can and do skip some of the frills, if they're not worth my time, and 2) Because it's my hope that spending that effort on at least one slightly more interesting one of these here (in addition to the heavy documentation of the intro one I added earlier) can serve as a resource to help others - including yes, us, but also beyond just this project. After all, this is the internet, and these PRs and issues already show up in some Google searches for related things which made me chuckle.

    There are some scattered articles out there on the topic. Some of them are good and I wish they were around years ago. Approachable, complete, up-to-date, or even correct articles on topics like this one are rare as hen's teeth and don't often rank high because their authors write good stuff, rather than spend their effort on SEO, like the Mediums and Quoras of the world, so it's a gap that can be served while also making material project contributions. 🙂

    And I think it's really valuable when a "real" open source project like this one has more than basic API documentation for more advanced features and concepts, because that is almost completely non-existent out there, and people want to learn from real stuff that real people get real value out of, which is next to impossible if all that exists is even relatively good code and just API docs. All the stuff in between is valuable and important, too, even though it may be rote or mundane to most of us who have been doing this for years or when inevitable project fatigue starts to set in and people just wanna get things DONE. 😅

    In fact, directly to that point, I see and answer questions almost every day - mostly from like 18-22 year olds but also from people who picked up programming at any point in life and want to learn more - of the sometimes annoyingly repetitive "where can I find open source projects to learn from?" nature. And I really empathize with that (once I stop rolling my eyes that they couldn't see the same question posted 2 posts below them....), since even the level of documentation that we already have in Terminal.Gui is not common and was even less common when I was starting out.

    So, it's one of my ways of giving back to the community at large in a way that doesn't get done very often since it's not fun. I don't find it fun, either, but I also don't mind doing it. If just one person sees it and a light bulb turns on, it was time well spent.

    Anyway, there's you a small slice of how I operate. 🙂

    dodexahedron commented 2 weeks ago

    And those were all written on my phone, so apologies for the random typos. 😅

    tig commented 2 weeks ago

    Great stuff @dodexahedron. I'm really valuing your contributions... not just to the project but my brain. Hugs.

    dodexahedron commented 2 weeks ago

    Can't believe I forgot to link this in my original post along with Roslyn's speaking part.

    dodexahedron commented 2 weeks ago

    I have a couple of revisions for the spec after reading it over now that a couple days have passed.

    Nothing major really. Just some redundancies and a couple of conflicts I noticed, such as some tweaks to the interface list.

    I'll update that in a few minutes, once I'm at the machine I started that branch on.

    Any thoughts/requests as of yet?

    dodexahedron commented 2 weeks ago

    I am tracking actual work on this in this project

    dodexahedron commented 2 weeks ago

    I had a thought related to some of the interfaces I initially listed above, now that I'm writing the prototype/template struct...

    What does it really mean to say that one Flags enum is greater than another? Is it even relevant? And, since they are backed by an integral type, what does it mean to compare two different enums? Is that even relevant? Could either of these potentially be helpful?

    There are a few ways I can think of for comparing two enums, whether they're the same type or not, when talking about IComparable, whose CompareTo returns a negative value, zero, or a positive value for less than, equal to, and greater than, and which the runtime uses for built-in sorting methods and such:

    Only the second or MAYBE third case really seems useful at all, to me, so I'm actually tempted to drop IComparable from the spec. Especially since anything but ordering by numeric value is non-standard behavior insofar as IComparable is typically expected to work/mean.

    Alternatively, to actually make it useful, in an opt-in sort of way, an attribute could be defined for relative ordering of the enum members, which would then get incorporated into the IComparable method implementations created by the generator. But I still don't see that being something that would get enough use to be worth it. Plus, I'm generating types as partial, so you could just add that yourself if you wanted it. Or a future generator can be written to do exactly that. 🤷‍♂️

    Any thoughts/opinions on that?

    I'm leaning nuke it.
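For anyone weighing in on the IComparable question, here is a small sketch of why the "obvious" ordering is questionable for Flags enums. The `Access` enum and both comparers are hypothetical and only illustrate two possible interpretations; they are not necessarily the cases from the list above and not part of the spec. Ordering by numeric value ranks a single high bit above a combination of lower bits, while ordering by the number of set flags ranks them the other way around.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical flags enum, for illustration only.
[Flags]
public enum Access : uint { None = 0, Read = 1, Write = 2, Execute = 4 }

public static class AccessComparers
{
    // Interpretation A: order by the raw numeric value of the underlying type.
    public static readonly Comparer<Access> ByValue =
        Comparer<Access>.Create((a, b) => ((uint)a).CompareTo((uint)b));

    // Interpretation B (an assumption, not standard behavior): order by how many flags are set.
    public static readonly Comparer<Access> ByFlagCount =
        Comparer<Access>.Create((a, b) =>
            BitOperations.PopCount((uint)a).CompareTo(BitOperations.PopCount((uint)b)));
}
```

Comparing `Access.Execute` (4) with `Access.Read | Access.Write` (3) gives opposite answers under the two comparers, and neither is obviously the "right" one. Since comparers like these can live outside the type (or in another part of a partial type), leaving IComparable off the generated type does not block anyone who needs an ordering.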

    dodexahedron commented 1 week ago

    I had another thought as I was writing some skeleton code for the exemplar generated struct around string formatting and parsing.

    To provide full functionality in AoT and trimmed situations, whether built by us or by a consumer of the library, at least some source generation around enums is necessary for full compatibility, too, because parsing enums from a string value, as well as formatting them as strings, relies on reflection: the methods called to get the values and labels depend on it.

    Without hard-coded alternatives, the no-reflection mode of AoT just can't do that, and aggressively trimmed assemblies would lose it, too.

    The other generator I wrote was intended to gain that functionality at some point, as well, though this one will have it from the start, so the other may just get obsoleted by this one altogether.

    In any case, I'm being very conscious of avoiding reflection of any kind at all for the generated code, to be one less blocker for full AoT/trimming friendliness of Terminal.Gui.
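As an illustration of what "hard-coded alternatives" could mean, here is a minimal hand-written sketch of formatting and parsing without going through `Enum.ToString` or `Enum.Parse`. The `Edges` enum and the `EdgesText` helper are hypothetical names; a generator would presumably emit something along these lines for each enum, but the exact shape is an assumption.

```csharp
using System;

// Hypothetical flags enum, for illustration only.
[Flags]
public enum Edges : uint { None = 0, Left = 1, Right = 2 }

public static class EdgesText
{
    // Every known value and combination is spelled out at compile time,
    // so no metadata lookup is needed at run time.
    public static string Format(Edges value) => value switch
    {
        Edges.None => nameof(Edges.None),
        Edges.Left => nameof(Edges.Left),
        Edges.Right => nameof(Edges.Right),
        Edges.Left | Edges.Right => "Left, Right",
        _ => ((uint)value).ToString(), // unknown bits: fall back to the number
    };

    // Parses a comma-separated list of member names into a combined value.
    // Note: in this sketch an empty string parses successfully as None.
    public static bool TryParse(string text, out Edges result)
    {
        result = Edges.None;
        string[] names = text.Split(
            ',', StringSplitOptions.TrimEntries | StringSplitOptions.RemoveEmptyEntries);

        foreach (string name in names)
        {
            switch (name)
            {
                case nameof(Edges.None): break;
                case nameof(Edges.Left): result |= Edges.Left; break;
                case nameof(Edges.Right): result |= Edges.Right; break;
                default:
                    result = Edges.None;
                    return false; // unknown member name
            }
        }

        return true;
    }
}
```

For example, `EdgesText.Format(Edges.Left | Edges.Right)` returns `"Left, Right"`, and `EdgesText.TryParse("Left, Right", out var e)` succeeds with `e` equal to `Edges.Left | Edges.Right`. The trade-off is that the switch arms must be regenerated whenever the enum changes, which is exactly the kind of chore a source generator is suited for.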