gircore / gir.core

A C# binding generator for GObject based libraries providing a C# friendly API surface
https://gircore.github.io/
MIT License

Generator rewrite #100

Closed badcel closed 3 years ago

badcel commented 3 years ago

If the generator is rewritten the following features should be thought of:

mjakeman commented 3 years ago

I have a few ideas for this. I think the main point is to give us access to richer information about GObject types by processing the data beforehand (which is necessary for e.g. prefixing interfaces).

Preferably, we'd split the generator up into a few passes:

The first three steps would be done synchronously, while the last step could use async/await so we're not waiting on File I/O.

I think you mentioned wanting to make this a dotnet tool, and I like this idea a lot. If we rewrite the generator, we should implement this at the same time. Would we also merge the build tool into the generator in that case?

badcel commented 3 years ago

> I think you mentioned wanting to make this a dotnet tool, and I like this idea a lot. If we rewrite the generator, we should implement this at the same time. Would we also merge the build tool into the generator in that case?

I don't think we should merge the build project. The dotnet tool generates only one binding per invocation. Since we support several libraries, we need to call the tool several times, which is probably the job of the Build project. Third-party library authors may want to build their projects in a different way, so the generator should just emit source files and not force our way of building projects on anyone.

mjakeman commented 3 years ago

Generator wishlist/Adding for future reference:

mjakeman commented 3 years ago

From #127:

GstMessage and many other structs depend on support for custom Value types in the generator. This will allow us to use ref struct rather than the complicated and unsustainable (maintenance-wise) marshalling I'm using at the moment. This can reduce method bodies from >20 lines to sub-5, as we won't need to explicitly marshal anymore. It's probably much safer this way as well, as there is little chance of memory corruption from forgetting to free or update something.

badcel commented 3 years ago

The generator would need to take care of reference handling (see #78). We can probably have some helper methods which take the information on how to handle the ownership transfer as a bool parameter, with the generator emitting the bool value directly from the GIR into the call of the helper method.
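A minimal sketch of that idea, written in Python rather than the project's C# purely for illustration. It reads the `transfer-ownership` attribute that GIR files carry on `return-value` elements and bakes it into the emitted call as a bool literal. The helper name `WrapHandle`, its `ownedRef` parameter, and the sample GIR fragment are all hypothetical; treating `container` as not owned is a simplifying assumption.

```python
import xml.etree.ElementTree as ET

# XML namespace used by GIR core elements.
GIR_NS = {"gi": "http://www.gtk.org/introspection/core/1.0"}

# Hand-written, simplified fragment for illustration only.
SAMPLE_GIR = """\
<repository xmlns="http://www.gtk.org/introspection/core/1.0" version="1.2">
  <namespace name="Gtk">
    <class name="Button">
      <method name="get_child">
        <return-value transfer-ownership="none"/>
      </method>
      <constructor name="new">
        <return-value transfer-ownership="full"/>
      </constructor>
    </class>
  </namespace>
</repository>
"""


def owned_ref_literal(return_value):
    """Map the GIR ownership annotation to a C# bool literal."""
    # "full": the caller receives ownership. "none": the value is borrowed.
    # "container" is treated as not owned here (simplifying assumption).
    transfer = return_value.get("transfer-ownership", "none")
    return "true" if transfer == "full" else "false"


def emit_wrapper_calls(gir_xml):
    """Emit one hypothetical helper call per member that has a return value."""
    root = ET.fromstring(gir_xml)
    lines = []
    for cls in root.iterfind(".//gi:class", GIR_NS):
        for member in cls:
            ret = member.find("gi:return-value", GIR_NS)
            if ret is None:
                continue
            # WrapHandle and ownedRef are made-up names for the helper method.
            lines.append(
                f"{cls.get('name')}.{member.get('name')}: "
                f"WrapHandle(ret, ownedRef: {owned_ref_literal(ret)});"
            )
    return lines
```

With this shape the generated method bodies never decide ownership themselves; the decision is made once, at generation time, from the introspection data.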

badcel commented 3 years ago

I just tried to hide the class structs from GObjects (e.g. the ButtonClass struct) and move them into the corresponding class (e.g. Button). This results in the problem that the references inside the GIR file no longer match, because in the GIR the class structs are top-level elements in the corresponding namespace (e.g. Gtk) and are referenced like Gtk.ButtonClass and not Gtk.Button.ButtonClass.

I wonder if we want to move the elements inside the GIR to other locations. This would probably make a lot of things more complicated, much like renaming things, because the integrity of the references between the GIR files would no longer be guaranteed.

The benefit could be a cleaner API. But I wonder if it is worth the effort as we would diverge from the original API just to hide the class structs. Currently I would probably not change the location in the first public version. Perhaps this is even more of a feature than a flaw.

mjakeman commented 3 years ago

I've hit a dead end with delegates (#119) under the current generator. Since we don't have rich type information, marshalling from an arbitrary type to managed and vice-versa isn't really possible.

I've started prototyping a staged generator (complete rewrite) on branch feature/generator-v2. Early results seem quite promising, as I've already got interface prefixing implemented. I'm having a look at how reparenting could work (probably just a simple rename from ButtonClass to Button.ButtonClass).

I still need to settle on a pattern for how data processing should work, since introspection data uses strings for references. I'm thinking we either:

I've implemented the third approach, as it's the most flexible. It essentially has a namespaced type dictionary which can take in either a qualified ('Gtk.Application') or unqualified name ('Application'), resolve it based on the active namespace, and make it accessible to services which implement the generation logic. It works well enough, but I'm not sure how scalable it is (and string-based APIs aren't nice to use).
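A rough sketch of such a namespaced type dictionary, written in Python rather than the project's C# purely for illustration. The class name `TypeDictionary` and its methods are hypothetical and not taken from the feature/generator-v2 branch.

```python
class TypeDictionary:
    """Maps (namespace, name) pairs to symbol data (illustrative only)."""

    def __init__(self):
        self._types = {}

    def add(self, namespace, name, symbol):
        self._types[(namespace, name)] = symbol

    def resolve(self, reference, active_namespace):
        """Resolve 'Gtk.Application' directly, or 'Application' relative
        to the namespace currently being processed."""
        namespace, sep, name = reference.rpartition(".")
        if not sep:
            # Unqualified reference: fall back to the active namespace.
            namespace, name = active_namespace, reference
        try:
            return self._types[(namespace, name)]
        except KeyError:
            raise KeyError(f"Unresolved type reference: {namespace}.{name}") from None


types = TypeDictionary()
types.add("Gtk", "Application", "symbol for Gtk.Application")
types.add("Gio", "Application", "symbol for Gio.Application")

# Unqualified names resolve against the active namespace,
# qualified names ignore it.
in_gtk = types.resolve("Application", active_namespace="Gtk")
from_gio = types.resolve("Gio.Application", active_namespace="Gtk")
```

The string-based lookup is what makes this flexible, and also what makes it awkward: every consumer has to know the active namespace and handle unresolved references itself.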

I'm thinking it might be worth combining the first and third approach to get something like libgirepository (but written in C#). It means data will effectively be duplicated, but one of them will be raw serialised data, and the other will be processed resolved data, so perhaps it makes sense. The string references would be resolved by the type dict during processing rather than generation, so we can fully decouple generation from earlier stages.
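To make the raw-versus-resolved split concrete, here is a small sketch in Python (not the project's C#) under assumed names: `RawClass` stands in for the serialised data with string references, `ResolvedClass` for the processed model with object references, and `process` for the pass that links them. None of these names come from the actual branch, and the sample hierarchy is simplified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RawClass:
    """Serialised form: the parent is still a string, as in the GIR."""
    namespace: str
    name: str
    parent: Optional[str]  # e.g. "Widget" or "GObject.Object"


@dataclass
class ResolvedClass:
    """Processed form: the parent is a direct object reference."""
    namespace: str
    name: str
    parent: Optional["ResolvedClass"] = None


def process(raw_classes):
    """Resolve every string reference once, before generation starts."""
    # First pass: create a resolved node for every raw class.
    resolved = {
        (raw.namespace, raw.name): ResolvedClass(raw.namespace, raw.name)
        for raw in raw_classes
    }
    # Second pass: link parents. Unqualified names are looked up in the
    # namespace of the class that references them.
    for raw in raw_classes:
        if raw.parent is None:
            continue
        namespace, sep, name = raw.parent.rpartition(".")
        key = (namespace, name) if sep else (raw.namespace, name)
        resolved[(raw.namespace, raw.name)].parent = resolved[key]
    return resolved


raw = [
    RawClass("GObject", "Object", None),
    RawClass("Gtk", "Widget", "GObject.Object"),  # simplified hierarchy
    RawClass("Gtk", "Button", "Widget"),
]
model = process(raw)
button = model[("Gtk", "Button")]
```

After `process` runs, generation code can walk `button.parent.parent` without touching strings or namespaces, which is the decoupling described above.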

I'll keep looking into this a bit more, but I think the staged generator is the right way to go, given how easy it makes things like renaming and reparenting.

badcel commented 3 years ago

This looks very good! I did not take a super deep look into the code, but I like what I see and the general direction of your code looks very promising.

I have 2 remarks:

Regarding "how data processing should work": I agree with you that we should start with the string-based references. If we see the need, we can move to something more advanced. But I think this simple approach is well suited to the task and, if done right, will be very maintainable. Anything else would probably inflate the net lines of code, as I see no simple way to parse the GIR files into some predefined structure.

mjakeman commented 3 years ago

> I don't like the ServiceManager as it is some kind of Anti-Pattern. For now this is not relevant as it can be changed / refactored later. I think even if we are done with the generator there will be quite a bit of refactoring, so I would not bother with it. We can fix it then.

Yeah, this is temporary. I want the services to be stateless (static) in the long term. This is an easy way of keeping track of state while I iterate over various ideas.

> The 2nd thing is the logger. I think there are enough logging frameworks available. We should probably use an existing framework rather than maintain those 500 lines of code. I'm quite sure we will find something good. It doesn't have to be one of those very well-known "beasts"; perhaps there is some neat NuGet package which fulfills our needs. I don't have a particular framework in mind. But let's keep to our basic mission: create a generator. The same applies here as for the first point: let's fix it at the end, as it is a minor point and should not be very hard to do.

I looked for a logging library at first but everything I found was unnecessarily complex when all I wanted was Console.WriteLine with colours. If we find something nice, it should be a simple find + replace to switch libraries, so no big deal.


I've been having a look at gtk-rs's gir tool (specifically library.rs and parser.rs) and they also have a separation between serialised and processed data. Based on this, I'm experimenting with separating the Generator into two projects:

It's not that much larger in lines of code, and greatly improves the readability of the Generator. It lets us effectively make a data-driven generator and not have to worry about the GIR file format. I'm a bit worried that by using the serialised classes directly (as we do now), any change in GIR will break the entire generator.

badcel commented 3 years ago

> I'm a bit worried that by using the serialised classes directly (as we do now), any change in GIR will break the entire generator.

I'm not strictly against abstracting away the GIR file. I think it makes sense from an architectural point of view. But it involves some kind of intermediate structure which needs to be maintained. I believe that the GIR format is well maintained and stable (there are deprecated annotations and possible future annotations). So I assume that the GIR will not change in general, but may gain some new features. I'm going to verify this in #introspection.

> It's not that much larger in lines of code, and greatly improves the readability of the Generator. It lets us effectively make a data-driven generator and not have to worry about the GIR file format.

This is always a killer feature for me. If we can improve readability, it is a must-have. But then I wonder why the same readability is not possible while using the GIR classes. If you have some example code available which abstracts away the GIR classes, please ping me for a quick check; perhaps we can work something out.


Edit: Regarding the GIR file format stability

> If we ever change it, it'll be versioned differently. Right now, it's version 1.2. And there are no backward-incompatible changes planned.

badcel commented 3 years ago

Before reimplementing the services as static, let's discuss / review this, as static services could potentially be as problematic as a static service manager.

Sorry for interfering; I just want to make sure there is as little duplicated work as possible.

badcel commented 3 years ago

The new PR for this is #274