spk121 / guile-gi

Bindings for GObject Introspection and libgirepository for Guile

GNU General Public License v3.0


Guix, grafts, gee… #96

Open LordYuuma opened 3 years ago

LordYuuma commented 3 years ago

One of our design goals seems to be building against one version of GLib/GObject/GI while allowing users to load any version. Today I am here to tell you that this is extremely broken.

Why Guix?

Guix is just the messenger, but there is a stronger reason behind why GI appears rather broken in any Guix package, and that is grafts.

What are grafts?

Grafts are Guix's way of not rebuilding the world when an important security update is rolled out. Basically, they allow you to build and link against an old version of a library while running the program against a new one. Traditional distros do that all the time and you don't even notice, but on Guix you actually have two versions of that library still lying around: the ungrafted one and the grafted one.

Why is this an issue?

Because it is possible to get those two mixed up, e.g. in guix environment. I am not sure which use cases are affected, but that one surely is. To see the difference, run:

./configure --without-gir-hacks
make clean check

once inside a guix environment with grafts, and once in one without them. If you want to use Guix environments to prototype your applications, that means you'll have to use --no-grafts to work around these types of issues for now.

What to do from now on?

It is pretty clear to me that the main culprit here is that the version of GLib linked into Guile-GI differs from the one that gets loaded through Guile-GI. To fix that, we'll probably have to overhaul our entire bootstrapping procedure, starting at GTypes, and we'll likely have to preload some version of GObject before defining them. Much fun.

Workaround

If you are working on Guile-GI code inside guix environment and do not wish to be haunted by this issue or by the question of how to resolve it, add --with-gir-hacks to your invocation of ./configure for the time being. If you are experiencing similar issues in your own GI-based projects, consider patching your GIRs in a manner similar to what we do.

LordYuuma commented 2 years ago

I'm not sure whether we can replace the typelibs with their XML counterparts. At the very least, that'd be difficult w.r.t. environment variables. What we could do, OTOH, is move from GIRepository's internal representation to the one we actually require (which we could describe in XML, JSON, what have you). First, a GI-based library would write that representation internally, with the rest of our libraries consuming it; later, it would be generated by a pure C/Scheme implementation of said library. WDYT?

spk121 commented 2 years ago

Well, this branch I've been playing with, https://github.com/spk121/guile-gi/tree/split-parse-runtime, is trying to create an intermediate representation in the hopes of splitting the C library in twain. The branch is totally broken at the moment, but it runs well enough to generate and then parse intermediate code like the following:

(require "GLib" "2.0" ("libgobject-2.0.so.0" "libglib-2.0.so.0"))
(type "GArray")
(type-info "%GAsciiType" flags ((alnum . 1) (alpha . 2) (cntrl . 4) (digit . 8) (graph . 16) (lower . 32) (print . 64) (punct . 128) (space . 256) (upper . 512) (xdigit . 1024)) ())
(flag-conversion "AsciiType" #f "%GAsciiType")
(type-info "%GBookmarkFileError" enum ((invalid-uri . 0) (invalid-value . 1) (app-not-registered . 2) (uri-not-found . 3) (read . 4) (unknown-encoding . 5) (write . 6) (file-not-found . 7)) ())
(enum-conversion "BookmarkFileError" #f "%GBookmarkFileError")
(type "GByteArray")
($function "byte-array:free" "g_byte_array_free" 
  ((name . "byte-array:free") (s-input-req . 2) (c-input-len . 2) 
    (pdata
     ((name . "array") (meta (arg-type . GByteArray) (flags ptr in) (transfer . nothing) (params ((arg-type . uint8) (item-size . 1) (transfer . nothing)))) (s-direction . input) (tuple . singleton) (presence . required) (i . 0) (c-input-pos . 0) (s-input-pos . 0)) 
     ((name . "free_segment") (meta (arg-type . gboolean) (flags in) (transfer . nothing)) (s-direction . input) (tuple . singleton) (presence . required) (i . 1) (c-input-pos . 1) (s-input-pos . 1))) 
    (return-val (name . "%return") (meta (arg-type . uint8) (flags ptr out) (transfer . nothing)) (s-direction . output) (tuple . singleton) (presence . required) (i . 0))))
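Since the sample consists of ordinary s-expressions, it can in principle be consumed with nothing more than Guile's read. The following is a minimal sketch under that assumption. The helpers read-forms and required-libraries are hypothetical and not part of guile-gi, and the embedded text is a shortened copy of the sample above.

```scheme
(use-modules (ice-9 match)
             (srfi srfi-1))

;; A shortened copy of the sample above; real output is much longer.
(define sample-il
  "(require \"GLib\" \"2.0\" (\"libgobject-2.0.so.0\" \"libglib-2.0.so.0\"))
(type \"GArray\")
(type-info \"%GBookmarkFileError\" enum ((invalid-uri . 0) (invalid-value . 1)) ())
(enum-conversion \"BookmarkFileError\" #f \"%GBookmarkFileError\")
(type \"GByteArray\")")

;; Read every top-level form from PORT into a list, in order.
(define (read-forms port)
  (let loop ((forms '()))
    (let ((form (read port)))
      (if (eof-object? form)
          (reverse forms)
          (loop (cons form forms))))))

;; Hypothetical helper: collect the shared libraries named by `require' forms.
(define (required-libraries forms)
  (append-map
   (match-lambda
     (('require namespace version (libs ...)) libs)
     (_ '()))
   forms))

(define forms (call-with-input-string sample-il read-forms))

;; Print one library name per line.
(for-each (lambda (lib) (display lib) (newline))
          (required-libraries forms))
```

A consumer that knows which libraries a namespace requires could open exactly those before touching any type, which is the ordering the rest of this thread is concerned with.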

spk121 commented 2 years ago

OK, at this point, the split-parse-runtime branch has split libguile-gi in twain: a libguile-giparse and a libguile-gi. Anything having to do with gobject-introspection or libgirepository is in the former, which removes the dependency on libgirepository from the latter.

At the moment, the parser calls the runtime directly. By calling set-il-output-port with some port, however, you can capture a list of function calls that you could later feed to the runtime to (theoretically) load all the types and functions without having to link to girepository or parse the typelib.

From here, there are just ~140 GObject/GLib calls remaining on the runtime side. These are all present for SCM-to-C conversion of function arguments, or for GType-to-SCM class conversion. It should be a rote task to dlopen/dlsym those at runtime after loading the user's chosen version of GObject/GLib.
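The late binding described here would be done in C inside libguile-gi, but the idea can be sketched with Guile's own FFI from (system foreign). The sketch assumes nothing about guile-gi's actual code. The helper late-bound is hypothetical, and libc's strlen, resolved through the handle of the running process, stands in for a GLib function so that the example does not depend on which GLib is installed.

```scheme
(use-modules (system foreign))

;; Hypothetical helper: resolve NAME in LIB at run time and wrap it as a
;; Scheme procedure with the given C return and argument types.
(define (late-bound lib name return-type arg-types)
  (pointer->procedure return-type (dynamic-func name lib) arg-types))

;; With no argument, dynamic-link returns a handle on the running process,
;; which is enough to find libc symbols.  A real runtime would instead open
;; the GLib/GObject libraries that belong to the typelib being loaded.
(define self (dynamic-link))

;; size_t strlen(const char *), looked up by name rather than linked.
(define c-strlen (late-bound self "strlen" size_t '(*)))

(display (c-strlen (string->pointer "guile-gi")))
(newline)
```

Nothing is resolved until late-bound runs, so the choice of library can wait until the user's chosen version has been loaded.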

spk121 commented 2 years ago

OK. The latest commit at https://github.com/spk121/guile-gi/tree/split-parse-runtime sketches out a solution to this bug. It is very rough, but the outline is all there. It passes most of make check.

libguile-giparse and (gi parser) can convert typelibs into Scheme modules using a sort of intermediate language. This library has to link to GIR, GObject, and GLib.
the gi-parse guild command can use libguile-giparse to make standard Guile modules for a typelib
libguile-gi and (gi runtime) can load these Scheme modules, parse this intermediate language, and make the binding happen at runtime. libguile-gi uses its own FFI, and the hundred or so GLib/GObject functions it does use are loaded dynamically from the same version of GObject/GLib associated with the typelib it is loading, not from the version of GObject/GLib that was present when it was built.

You can still use use-typelibs like in v0.3.2, I think. I'm not 100% sure whether using both the parser and the runtime at the same time, such as with use-typelibs, creates the separation that Guix needs, but I have high hopes that splitting into separate parse and runtime steps should work.

But that tree is a huge mess. It has some ideas I started and later abandoned. I'm going to rewind, rework, and make a sensible patchset in a new branch.

From there, a few problems remain:

see what can be done about the compile time of these huge generated bindings files. Is it slow because of libguile-gi internals, or because the Guile compiler bogs down at a certain size? On my old laptop, compiling the Scheme libraries for a full Gtk stack takes 30 minutes to an hour. Loading the full Gtk stack when a script starts up can take 10 seconds.
apply more thought to the case of when parent classes of a given class come from different typelib namespaces
and the dozens of bugs I probably created along the way

LordYuuma commented 2 years ago

I think we might still be duplicating some work here, in that we need to read and write data to disk a few times rather than handling things in memory. The Guile language modules provide what is needed to build a compiler tower. We could hook into that and provide a language specification for gi-scheme, which compiles to either Scheme or Tree-IL. This would correspond to what (gi runtime) is currently doing. (gi parser) and gi-parse should probably also sit on that tower, with a compilation to gi-scheme being defined. The gi-parse command would then be a simple wrapper around Guile's compile[-file].
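To make the proposal concrete, here is a minimal sketch of such a language specification, built with define-language from (system base language). Everything on the gi-scheme side is an assumption. Only the type and type-info forms from the earlier sample are handled, and the generated Scheme merely returns a description where a real compiler would emit calls into (gi runtime).

```scheme
(use-modules (system base language)
             (system base compile)
             (ice-9 match))

;; Translate one gi-scheme form into a Scheme expression.  A compiler pass
;; takes (exp env opts) and returns three values: the compiled expression,
;; the environment, and the continuation environment.
(define (compile-scheme form env opts)
  (values
   (match form
     (('type name)
      `(list 'declare-type ,name))
     (('type-info name kind members _)
      `(list 'declare ',kind ,name ',members))
     (other
      `(quote (unhandled ,other))))
   env
   env))

;; Hypothetical language object.  A real one would live in a module such as
;; (language gi-scheme spec) so that Guile can look it up by name.
(define-language gi-scheme
  #:title "GI intermediate language (sketch)"
  #:reader (lambda (port env) (read port))
  #:compilers `((scheme . ,compile-scheme))
  #:printer write)

;; Compile one form all the way down and evaluate it in the current module.
(define result
  (compile '(type "GArray")
           #:from gi-scheme #:to 'value #:env (current-module)))

(write result)
(newline)
```

Because gi-scheme lowers to Scheme, the rest of the tower (Tree-IL, bytecode) comes for free, and the same specification could be handed to compile-file.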

You are right in that use-typelibs itself would not provide this separation on its own. However, I hazard a guess that with (gi runtime) being built on just FFI, you could define a build process in which you first generate your necessary module descriptions and then compile everything to .go. That would work in Guix by adding Guile-GI as both native and regular input. As the gi-scheme descriptions themselves are hopefully architecture-independent, we could thus effectively work around that issue.

Long term however, it would be better to bring everything back into one compilation tower, with the gi-parse side implemented in pseudo-pure Scheme.

spk121 commented 2 years ago

> I think we might still be duplicating some work here, in that we need to read and write data to disk a few times rather than handling things in memory. The Guile language modules provide what is needed to build a compiler tower. We could hook into that and provide a language specification for gi-scheme, which compiles to either Scheme or Tree-IL. This would correspond to what (gi runtime) is currently doing. (gi parser) and gi-parse should probably also sit on that tower, with a compilation to gi-scheme being defined. The gi-parse command would then be a simple wrapper around Guile's compile[-file].

When I experimented, I found that compile-file makes a valid .go, but saving the output of compile (to bytecode) does not. So reading/writing to file in multiple steps may be necessary. The idea of using the language modules is intriguing.

> You are right in that use-typelibs itself would not provide this separation on its own. However, I hazard a guess that with (gi runtime) being built on just FFI, you could define a build process in which you first generate your necessary module descriptions and then compile everything to .go. That would work in Guix by adding Guile-GI as both native and regular input. As the gi-scheme descriptions themselves are hopefully architecture-independent, we could thus effectively work around that issue.

This makes sense. I wonder if there are 32-bit/64-bit differences in typelib files. I don't know.

> Long term however, it would be better to bring everything back into one compilation tower, with the gi-parse side implemented in pseudo-pure Scheme.

One could be quite meta, and use the current gi-parse and (gi parser) to bootstrap a gi-scheme for GIRepository-2.0 and its dependencies, and then reprogram the whole of guile-gi's parser using Guile bindings to GIRepository.

LordYuuma commented 2 years ago

> When I experimented, I found that compile-file makes a valid .go, but saving the output of compile (to bytecode) does not. So reading/writing to file in multiple steps may be necessary. The idea of using the language modules is intriguing.

Note that compile-file uses the language printer of the target language and passes #:to-file? #t.

> You are right in that use-typelibs itself would not provide this separation on its own. However, I hazard a guess that with (gi runtime) being built on just FFI, you could define a build process in which you first generate your necessary module descriptions and then compile everything to .go. That would work in Guix by adding Guile-GI as both native and regular input. As the gi-scheme descriptions themselves are hopefully architecture-independent, we could thus effectively work around that issue.

> This makes sense. I wonder if there are 32-bit/64-bit differences in typelib files. I don't know.

They do actually describe their file format [1,2]. Only the endianness appears to make a difference, and in a cross-compilation setup that ought to be the target endianness.

> Long term however, it would be better to bring everything back into one compilation tower, with the gi-parse side implemented in pseudo-pure Scheme.

> One could be quite meta, and use the current gi-parse and (gi parser) to bootstrap a gi-scheme for GIRepository-2.0 and its dependencies, and then reprogram the whole of guile-gi's parser using Guile bindings to GIRepository.

I don't quite know how to interpret this. Do you mean we'd only implement enough GI parsing to load GIRepository and then hand things off from there (similar to format, which only supports a smaller number of features until (ice-9 format) is loaded)? If so, I'm unsure if there is such a thing as a convenient, mostly incomplete bootstrap core. I'd rather go with a mostly complete side implementation instead.

But before we tack features upon features, I think it is time to refactor and make what we currently have work in the way we want. This would at the very least also include a lot of (shell) tests for the gi-parse part. Integration tests would also be nice, but I don't think we could put those into CI, can we?

[1] https://developer-old.gnome.org/gi/unstable/gi-GITypelib-Internals.html
[2] https://gnome.pages.gitlab.gnome.org/gobject-introspection/girepository/gi-GITypelib-Internals.html