Open LordYuuma opened 3 years ago
I've only followed this repo on the side and I don't really understand much about it. I'm only interested in building nice GTK apps using GNU Guile at some point. That said, since you already started explaining, let me ask some uninformed, possibly dumb questions : )
--no-grafts
?--no-grafts
solves the problem already, why would you need to overhaul the entire bootstrapping procedure?--no-grafts
does not cause any rebuilds, you're stuck with the old library version.--no-grafts
is a fix. If development environments are the only thing this bug affects, then fine, but having packaged some GI applications for Guix I am not sure what exactly happens there.Wait, shouldn't Guix grafting also graft guile-gi (it should actually edit the shared objects in the derivation, too)? Then everything would be fine.
Does this problem actually happen when using an installed/dependent guix guile-gi package?
Right now I'm always using a manual git checkout of guile-gi, so that can't be grafted of course (because guix doesn't know that that checkout exists). Also, I recompile it very rarely, even as I update guix (read: it's stale).
I don't want to be the one responsible for a lot of unnecessary rework. Please make sure it's actually required and I've not been making the problem seem worse than it is.
You are not the only one responsible for this. I use Guix myself as a basis for developing this library and am very weirded out by having to resort to such hacks.
There is so far no precedence for Guile-GI packages in Guix, which might also have to do with the fact, that the guile-gi recipe on Guix was rather broken for a long time (is it fixed now? I don't remember). I would assume some weird workaround would be required to get them to run as with all the PyGI and GJS packages.
$ cat run
#!/bin/sh
exec ${HOME}/src/guile-gi/guile-gi-dannym/guile-gi/tools/uninstalled-env guix repl -L . "$@"
#exec guix environment -l ${HOME}/src/guile-gi/guile-gi-dannym/guile-gi/guix.scm --ad-hoc guile gdk-pixbuf adwaita-icon-theme shared-mime-info -- "$@" ${HOME}/src/guile-gi/guile-gi-dannym/guile-gi/tools/uninstalled-env guix repl a.scm
dannym@dayas ~/src/guix-gui$ strace -f ./run main.scm 2>&1 |grep open |grep glib |grep 'libglib.*\.so' |grep -v -- '-1'
[pid 5665] openat(AT_FDCWD, "/gnu/store/dp5l10lbgh66ap4idqvmkfms1qgjsj4r-profile/lib/libglib-2.0.so.0", O_RDONLY|O_CLOEXEC) = 15
[pid 5665] openat(AT_FDCWD, "/gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libglib-2.0.so.0", O_RDONLY|O_CLOEXEC) = 16
In order to debug this, I'd LD_PRELOAD
something that overwrites open
and prevents one of them from opening. That way, hopefully the requestor will fail and then we know who it is:
#define _GNU_SOURCE
#include <dlfcn.h>
#include <string.h>
#include <stdio.h>
typedef void *(*dlopen_t)(const char *filename, int flags);
void *dlopen(const char *filename, int flags) {
void* result;
dlopen_t dlopen = dlsym(RTLD_NEXT, "dlopen");
if (filename && strstr(filename, "xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libglib-2.0")) {
fprintf(stderr, "dlopen %s\n", filename);
return NULL;
}
result = dlopen(filename, flags);
return result;
}
I'm sorry to say that, but I don't get any meaningful results (or even results at all) from adding this to LD_PRELOAD. It doesn't even appear to execute at all.
I do. Result: dlopen is sometimes called without full path (for example: libcairo-gobject.so.2
)! if that is a string literal in some executable file, that is not good--because those references won't be found by the grafter.
New version (inside ~/src/guile-gi/guile-gi-dannym/guile-gi
):
Create block-open.c
:
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
typedef void *(*dlopen_t)(const char *filename, int flags);
void *dlopen(const char *filename, int flags) {
void* result;
dlopen_t dlopen = dlsym(RTLD_NEXT, "dlopen");
fprintf(stderr, "dlopen %s\n", filename);
if (filename && strstr(filename, "/") == NULL && strstr(filename, "libcairo-gobject.so")) {
fprintf(stderr, "dlopen blocked %s\n", filename);
result = dlopen(filename, flags);
if (result) {
fprintf(stderr, "and would have been found! Aborting!\n");
int* x = 0;
*x = 5;
abort();
}
return NULL;
}
result = dlopen(filename, flags);
return result;
}
Then compile it via gcc -fPIC -shared -o block-open.so block-open.c
.
Then create run
scipt:
#!/bin/sh -e
LD_PRELOAD="$PWD/block-open.so" guix environment --preserve=LD_PRELOAD -l guix.scm --ad-hoc guile gdk-pixbuf adwaita-icon-theme shared-mime-info -- make tools/uninstalled-env tools/run-guile "$@"
I also edited tools/run-guile
bottom to say:
exec ${top_builddir}/tools/uninstalled-env ${top_builddir}/libtool --mode=execute \
-dlopen ${top_builddir}/libguile-gi.la \
gdb --args "/gnu/store/0l5a4vx5w8xv6xwq7a6s7hc4r1790lvl-profile/bin/guile" "$@"
Then LD_PRELOAD=$PWD/block-open.so ./run examples/button1.scm
.
Then r
.
I got this:
[...]
dlopen libcairo-gobject.so.2
dlopen blocked libcairo-gobject.so.2
and would have been found! Aborting!
(gdb) bt
#0 0x00007ffff7fc724c in dlopen () from /home/dannym/src/guile-gi/guile-gi-dannym/guile-gi/block-open.so
#1 0x00007ffff3a4d7e9 in g_module_open () from /gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libgmodule-2.0.so.0
#2 0x00007ffff3beaac1 in g_typelib_symbol () from /gnu/store/dp5l10lbgh66ap4idqvmkfms1qgjsj4r-profile/lib/libgirepository-1.0.so.1
#3 0x00007ffff3be4475 in g_registered_type_info_get_g_type () from /gnu/store/dp5l10lbgh66ap4idqvmkfms1qgjsj4r-profile/lib/libgirepository-1.0.so.1
#4 0x00007ffff3c1f350 in gig_type_meta_init_from_type_info (meta=meta@entry=0x4ccdf0, type_info=type_info@entry=0x4c7540) at src/gig_data_type.c:231
#5 0x00007ffff3c1f72a in gig_type_meta_init_from_arg_info (meta=0x4ccdf0, ai=0x4c6cf0) at src/gig_data_type.c:33
#6 0x00007ffff3c23375 in arg_map_apply_function_info (func_info=0x4cb370, amap=<optimized out>) at src/gig_arg_map.c:108
#7 gig_amap_new (name=name@entry=0x4cc6c0 "container:propagate-draw", function_info=function_info@entry=0x4cb370) at src/gig_arg_map.c:69
#8 0x00007ffff3c26d83 in create_gsubr (specializers=0x7fffffffcb28, formals=0x7fffffffcb20, optional_input_count=0x7fffffffcb1c, required_input_count=0x7fffffffcb18, self_type=0x7ffff353d580, name=0x4cc6c0 "container:propagate-draw", function_info=0x4cb370) at src/gig_function.c:377
Reading the source code of gobject-introspection, they do _g_typelib_do_dlopen
in order to actually dlopen (that was inlined).
Aaaand that was patched by Guix.
/* 'gobject-introspection' doesn't store the path of shared
libraries into '.typelib' and '.gir' files. Shared
libraries are searched for in the dynamic linker search
path. In Guix we patch 'gobject-introspection' such that
it stores the absolute path of shared libraries in
'.typelib' and '.gir' files. Here, in order to minimize
side effects, we make sure that if the library is not
found at the indicated path location, we try with just
the basename and the system dynamic library
infrastructure, as per default behaviour of the
library. */
module = load_one_shared_library (shlibs[i]);
if (module == NULL && g_path_is_absolute (shlibs[i]))
{
module = load_one_shared_library (g_basename(shlibs[i]));
}
I suspect it gets into the if
block body.
A kingdom for a g_debug
in there...
I think I have something simpler:
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
typedef void *(*dlopen_t) (const char *filename, int flags);
#ifndef DLOPEN_BREAK_KEY
#define DLOPEN_BREAK_KEY "gobject"
#endif
void *dlopen(const char *filename, int flags) {
fprintf (stderr, "dlopen %s", filename);
dlopen_t dlopen = dlsym (RTLD_NEXT, "dlopen");
if (strstr (filename, DLOPEN_BREAK_KEY)) asm volatile ("int $03");
return dlopen (filename, args);
}
That allows you to GDB into the dlopens of any particular shared library simply by defining DLOPEN_BREAK_KEY
. And as expected, it gets called via g_module_open
in the first load-info
of test/insanity.scm
.
Good idea!
I've read through gobject-introspection source code by now and it seems that gobject-introspection upstream take it upon themselves to provide gir files for a few other libraries (like cairo--see gir/cairo-1.0.gir.in
in gobject-introspection
.
But if I understand the patch that Guix did to gobject-introspection's build system (not pasted above) correctly, then they only pick up libraries in the output of the package currently built. Well, currently we are building gobject-introspection
, not cairo
. So it won't pick up cairo. That's why it's in there with a relative path (a fallback of the guix patch).
The reason why it then doesn't fail nicely at startup of guile-gi as it should is because I have cairo in my profile (it was probably propagated by something--I can't remove it).
Other gir files provided in gobject-introspection are: DBus DBusGLib fontconfig freetype2 gio GL libxml2 Vulkan win32 xfixes xft xlib xrandr
--so those will cause trouble eventually if they refer to glib or any other gobjects (some of those definitely do. If such a package is additionally propagated into a profile, you are gonna have a hell of a time finding the problem--as we did).
It's possible to manually specify --fallback-library-path=
(presumably on g-ir-scanner), so it could be a workaround to make gobject-introspection depend on an union of the respective packages (see above), and then specify --fallback-library-path
manually.
In any case, I think this is a Guix bug (at least additionally).
This is somewhat off-topic, but I think the issue becomes clear if you do the following:
$ ls -l $GUIX_ENVIRONMENT/lib/libgobject-2.0.so
$ grep libgobject $GUIX_ENVIRONMENT/share/gir-1.0/GObject-2.0.gir
Perhaps Guix fails to graft typelibs at all? In that case, we might as well file an upstream bug, but the fundamental issue remains, that we can't load any GObject other than the one we're linking against.
Interestingly guix build glib
and guix build glib --no-grafts
also return different results from the two paths listed above. There is probably a glib-minimal
for building packages, so even if we were to fix the grafting issue, the general issue still remains.
What does
readlink $GUIX_ENVIRONMENT/share/gir-1.0/GObject-2.0.gir
say?
FWIW, for me, there seem to be references to absolute paths of libgobject in the typelib
s (and the gir
s):
~/x/gobject-introspection-1.62.0/compile/gir$ strings GObject-2.0.typelib |grep libgob
/gnu/store/xkfc1275h55ynpgfr3wwmzy9707nblwc-glib-2.62.6/lib/libgobject-2.0.so.0
$ strings `guix build gobject-introspection`/lib/girepository-1.0/GObject-2.0.typelib |grep libgobject
/gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libgobject-2.0.so.0
I'm 100% sure I found how the weird reference got in. The problem is making Guix find the problem automatically.
It's because:
(1) I don't update my user profile often (2) guix environment "updates" a lot (3) gobject-introspection refers to cairo by relative name (to dlopen); reason: cairo is not a dependency to gobject-introspection; and even if it was, the Guix patch to gobject-introspection as it is now wouldn't pick it up anyway. (4) cairo uses gobject too nowadays (!!!!!), and thus refers to (another) glib (5) cairo was propagated into my user profile ages ago, and is stale (6) There's a cairo reference and thus gobject-introspection loads the cairo library (with the RELATIVE name, see above). It gets the one from (5). That refers to ANOTHER glib. See above.
Next I'm trying to fix Guix's gobject-introspection to FAIL instead of embedding relative references if built inside a guix build container.
Hmm, it appears the gir inside $GUIX_ENVIROMNENT
is already grafted and grafting does actually change stuff, but it's still a different gobject shlib.
Your venture into the depths of cairo is nice and all, but keep things simple and use test/insanity.scm
.
The point is that the "depth of cairo" loaded another version of glib. So that's where another version of glib comes from.
I'm trying test/insanity.scm
now--I only now saw the new file. Thanks.
The thing is, you don't need cairo to load a different version of GLib. It already happens with GObject alone.
I agree.
But what guix's gobject-introspection does is a wrong thing in general, and libcairo-gobject is being loaded by guile-gi right now (note: I did not directly use cairo anywhere). That is not going to end well.
Fixing it might fix this problem and other potential problems, too.
(I tried making gobject-introspection depend on cairo by now. That causes a circular reference. Sigh.)
In any case, trying insanity.scm now.
Ohhh, libguile-gi links to libgobject at compile time, too. Well, that's gonna be a problem if that library is different to the one dlopen'ed...
You'll get cairo through GTK, Pango, and many other graphical stuff and the recipe seems to contain gtk+ even though it is afaik not strictly necessary.
I highly doubt you'll find a fix to this in Guix. It is likelier that whatever change you make breaks something in PyGI or GJS instead, so please be cautious. A patch to dynamically link libguile-gi against GObject etc. would be very welcome on the other hand.
I highly doubt you'll find a fix to this in Guix.
Right now I'll settle for a procedure to flag this problem in Guix--nevermind fixing it.
A patch to dynamically link libguile-gi against GObject etc. would be very welcome on the other hand.
I don't know at all how something like that would look.
gobject-introspection seems to require glib (see below). Things it loads then also depend on glib, but not necessarily on the same version. That is not going to end well.
ldd /gnu/store/64xq4j8b181s6yz7gpg4w8ny3i6r6irk-gobject-introspection-1.62.0/lib/libgirepository-1.0.so |grep glib-
libglib-2.0.so.0 => /gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libglib-2.0.so.0 (0x00007fb5b5c8f000)
libgobject-2.0.so.0 => /gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libgobject-2.0.so.0 (0x00007fb5b5c30000)
libgmodule-2.0.so.0 => /gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libgmodule-2.0.so.0 (0x00007fb5b5c29000)
libgio-2.0.so.0 => /gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libgio-2.0.so.0 (0x00007fb5b5a5b000)
I don't think that this can be fixed in guile-gi. It can be fixed in gobject-introspection (by changing its architecture), I guess.
ldd
shows dynamic links, or is this a static link? In that case, having libguile-gi linked statically against GLib is a problem. Though either way dynamic linking would just be a crutch. The real issue is that we as libguile-gi mix our GLib into the GLib that the user actually wants to load and that's wrong.
ldd shows what would be loaded when loading this so
. It can only show things that are known without actually executing user code of that so
. That means it shows things that are in the header of that so
. But the loader ld.so
will load those using dlopen when loading that so
, in the course of running an executable. Guix uses ld
's rpath
option in order to make sure those headers always contain full paths.
The real issue is that we as libguile-gi mix our GLib into the GLib that the user actually wants to load and that's wrong.
So does gobject-introspection, and that's just as wrong.
In the end this is the usual recursive definition problem. I don't get why people always do that--that has to lead to problems sooner or later.
For example having a compiler written in the language of that compiler, or xslt (which is a language for transforming xml) specs itself written in XML etc.
In this case, gobject-introspection is supposed to make glib usable in target languages other than C.
In a sane world that would mean that a binding generator doesn't use glib as a non-native input. Or if it absolutely has to (it really shouldn't [1]), then at least it wouldn't expose that glib or any of its contents to target language users (because philosophically, that's just wrong--even if it happens to work sometimes), i.e. it should be a native-input.
But no, gobject-introspection has glib as a regular input. (Trying to move gobject-introspection to native-inputs, I get build system meson does not support cross-compilation
--see https://issues.guix.gnu.org/44244 )
If it has to do stuff like that, you'd think at least it would have two levels: a meta-level where it mangles definitions of glib, and that just happens to use glib for the mangling internally (but not ever expose that glib or any of its objects directly to the user), and another normal level where it actually provides glib to the target language. But no :(
Personally I'd write a replacement for libgirepository that just parses the girs or typelibs on its own for guile-gi (not using glib
or girepository.so
to do it). Then it's much simpler--both practically and philosophically.
Because what gir files do is describe glib at a C level. The interface of the glib is gobjects, but the implementation of glib is C (Pascal got this right in 1970, only to be ignored by almost everyone). So there's no reason for a binding generator to depend on glib at all--it would have to use the gobject interface if it did.
Of course there'll be some "syntactic sugar" provided for the target language depending on the library when it is loaded--but that's pretty much it. I remember pygtk doing this right ("override" files) a long long time ago.
An easy way to catch those problems early-on is to try to cross-compile your program. If it tries the self-reference outlined above, that would cause a failure (which is what you want) because the architectures of the parts don't match.
[1] because eventually it will become part of glib, and what then?
I'm not sure if this adds any useful information to this discussion, but, on Linux, you can pull the results of dlopening in the following fashion. Compile this with -ldl
#define _GNU_SOURCE
#include <link.h>
#include <stdlib.h>
#include <stdio.h>
static int
callback(struct dl_phdr_info *info, size_t size, void *data)
{
printf("Name: \"%s\" (%d segments)\n", info->dlpi_name,
info->dlpi_phnum);
return 0;
}
int
main(int argc, char *argv[])
{
void *handle;
handle = dlopen("libgtk-3.so", RTLD_LAZY);
dl_iterate_phdr(callback, handle);
exit(EXIT_SUCCESS);
}
Personally I'd write a replacement for libgirepository that just parses the girs or typelibs on its own for guile-gi (not using
glib
orgirepository.so
to do it). Then it's much simpler--both practically and philosophically.
To guess the level of effort, I built libgirepository
without linking to glib. I get 250 glib procedures that would need stubs. But guile-gi
doesn't use the whole of libgirepository
, so that's an overbound.
But say we circumvented linking to glib, would the underlying libffi
dependency cause similar problems?
My rough estimate was done this way, after removing glib from meson.build
ninja 2>&1 | awk -F '`' '{print $2}' | awk -F "'" '{print $1}' | sort | uniq | grep g_ | wc
Not as far as I know, but I'm personally not convinced that this is the right move here. Philosophically speaking, it wouldn't be much of an introspection if one was doing it from the outside, would it?
I've had a short look at G-Golf and they seem to be doing stuff similar to us in that they partly export the base GLib that they get, but I assume this is fine for them, since they use dynamic links for GObject. That being said, I still don't have any experience with using G-Golf as a library, but there seem to be projects built on it in Guix, so it appears likely to work out that way.
TL;DR: Dynamic linking would probably be at least a short-term solution. For the long term we should think about what "introspecting your twin" really means.
@daym Do you count GIBaseInfo
being a registered struct type as "exposing GLib to the user"? Because that's pretty much the only thing I can think of that fits your description here.
Philosophically speaking, it wouldn't be much of an introspection if one was doing it from the outside, would it?
It's doing it from the outside anyway because C has no introspection (and neither does glib, generally--except for small islands). They can be faking it, but that doesn't change this fundamental fact.
But I know what you mean: gobject-introspection itself wants to be a gobject.
But if gobject-introspection itself wants to be a gobject, then it should be made impossible to load another glib using gobject-introspection (making the rest of glib similar to what compiler or shell builtins would be).
@daym Do you count GIBaseInfo being a registered struct type as "exposing GLib to the user"? Because that's pretty much the only thing I can think of that fits your description here.
Yeah.
In the end I don't see how gobject-introspection can work reliably on Guix like that--nevermind guile-gi for the time being.
The easiest way to find out in detail what is what would be to remove glib from the dependencies of gobject-introspection (and their header files) entirely and stub the things below like written below (also remove #include <glib.h>
and #include <glib-object.h>
from the gobject-introspection public interface). Ideally, it should still build and install. Does it?
If not, add glib back to the package dependencies but remove #include <glib.h>
and #include <glib-object.h>
from the gobject-introspection public interface. That should definitely work (but would still be pretty bad). If not, that's definitely very bad.
Object-like glib interface types that even gitypelib-internal.h uses:
typedef struct _GMappedFile GMappedFile;
typedef struct _GList GList;
typedef struct _GITypelib GITypelib;
typedef int GQuark; // this one is even returned as a NON-pointer
typedef struct _GError GError;
typedef struct _GIBaseInfo GIBaseInfo;
And the public interface of gobject-introspection has:
GType g_base_info_gtype_get_type (void) G_GNUC_CONST; // sigh...
Primitive glib interface types which are used and fine to use since they have obvious definitions and shouldn't change (and could be just be defined manually):
#define G_BEGIN_DECLS
#define G_END_DECLS
#define GI_AVAILABLE_IN_ALL
typedef char gchar;
typedef unsigned char guchar;
typedef unsigned char guint8;
typedef unsigned short guint16;
typedef unsigned int guint32;
typedef unsigned int guint;
typedef int gint; // for gboolean
typedef signed char gint8;
typedef int gint32;
typedef unsigned long gsize;
typedef gint gboolean;
I'm all for dynamically loading stuff from glib in guile-gi, but I just want to make sure first that this actually fixes the entire problem. Otherwise the problem will be back, one library over there.
(Also, the "cairo" problem won't vanish: gobject-introspection totally can load yet another glib version when traversing through to cairo, even after all those fixes. And it does traverse there. That could also happen with other libraries like gnome highlevel libs etc. I don't want to single glib and cairo out--it's just an example)
All in all, I wonder if GNOME can be made to work reliably like Guix wants it to at all. After all, gobject.so has a central type registry and thus having two gobject.so loaded (even indirectly) in the same executable is going to make things weird...
Would requiring guile-gi
to statically link to a static version of libglib help?
I think it wouldn't help.
If you did do it (and also did it in gobject-introspection), care would have to be taken not to pass any gobject-introspection-internal things (like GITypeInfo) as gobjects to the scheme user or to any of the dlopened libraries at runtime (it would always have to be wrapped and unwrapped so it's doesn't scare them :P).
But guile-gi then still shouldn't use any static glib type conversion functions to resolve dynamically-loaded objects (for example to convert to gobject interfaces)--so it wouldn't help much.
If anything it would just hide the problem and now make it impossible to debug even with a LD_PRELOAD
.
I think that the goal should be to either
(a) remove the explicit glib depdendency from both gobject-introspection and guile-gi, or (b) make the glib dependency built-in to both gobject-introspection and guile-gi (i.e. don't dynamically-load another glib even if instructed to do so)
I think that we need help from Guix on how this is supposed to be fixed.
Manual require
and load
depends on being able to export GIBaseInfo
s and dealing with them in Scheme code. And since typelib->module
is implemented on top of them… you get the idea.
Also statically linking more stuff is ethically wrong, as Guix would require us to statically link less. Not to mention, that static linking also has security implications on other distros.
But if gobject-introspection itself wants to be a gobject, then it should be made impossible to load another glib using gobject-introspection (making the rest of glib similar to what compiler or shell builtins would be).
Just because we aren't doing it right, doesn't mean that it's impossible. We just have to stop mixing our (not theirs) internal GObject into the stuff we return. If anything is missing in order to do that, it probably some macro, that hasn't been introspected.
In the end I don't see how gobject-introspection can work reliably on Guix like that--nevermind guile-gi for the time being.
You load GObject.Object
et al. through require
and load
instead of referring to G_TYPE_OBJECT
.
The same applies to GIRepository itself, but we can just load that internally from C code without batting an eye.
All in all, I wonder if GNOME can be made to work reliably like Guix wants it to at all. After all, gobject.so has a central type registry and thus having two gobject.so loaded (even indirectly) in the same executable is going to make things weird...
Weird maybe, but definitely manageable. Especially in Guile, where you could use e.g. (@@ (gi Typelib-Version) <GValue>)
.
Also GI does not allow instantiating different versions of the same typelib. So if you ever end up with glib-x.y and glib-a.b both loaded through GI (not once through GI and once through the wrapper as we have it here), then either someone maliciously renamed one of them forming "TotallyNotGLib-2.0.gir" or you've found a pretty major bug in GI.
(b) make the glib dependency built-in to both gobject-introspection and guile-gi (i.e. don't dynamically-load another glib even if instructed to do so)
That's a rather interesting twist on my suggestion to dynamically link them to ensure, that they are the same under most circumstances. (Dynamic linkage against 2.0 and requiring 2.0 through typelib should yield the same result unless your system is broken beyond repair, the problem would then still manifest in 2.0 vs 3.0).
Weird maybe, but definitely manageable. Especially in Guile, where you could use e.g.
(@@ (gi Typelib-Version) <GValue>)
. Also GI does not allow instantiating different versions of the same typelib. So if you ever end up with glib-x.y and glib-a.b both loaded through GI (not once through GI and once through the wrapper as we have it here), then either someone maliciously renamed one of them forming "TotallyNotGLib-2.0.gir" or you've found a pretty major bug in GI.(b) make the glib dependency built-in to both gobject-introspection and guile-gi (i.e. don't dynamically-load another glib even if instructed to do so)
That's a rather interesting twist on my suggestion to dynamically link them to ensure, that they are the same under most circumstances. (Dynamic linkage against 2.0 and requiring 2.0 through typelib should yield the same result unless your system is broken beyond repair, the problem would then still manifest in 2.0 vs 3.0).
So here's another half-baked idea. ;-) If dynamic linking and requiring through typelib yields the same results, another brute force approach could be to actually replace any GLib/ GObject calls found in guile-gi itself with function pointers found (lazily) at runtime using dlopen
and dlsym
calls. And then allow a guile-gi client the option to pass in a path to the version of the GLib/GObject library that is desired as a run-time initialization parameter for dlopen
.
In essence each function call would become
GSList *(*_g_slist_append)(GSList *lst, void *data) = NULL;
GSList *g_slist_append(GSlist *lst, void *data)
{
if (_g_slist_append == NULL)
_g_slist_append = dlsym(libglib, "g_slist_append");
return _g_slist_append(list, data);
}
That is roughly what I had in mind, yes. There are a few caveats, however:
G_TYPE_VALUE
.require
already serves that purpose. Since require
has no return value, it should be safe to call during bootstrapping. (gi oop)
and (gi types)
. In fact, rather than exposing types from there, I think the correct method should be loading them through require
/load-by-name
.Rather than manually calling dlopen
, it might also be possible to do that through GI, ensuring that there won't be any discrepancy. Dynamic linking would in my opinion just be a workaround until we have reworked our bootstrapping in such a manner.
Yes, making it so our interface into gobject stuff is only gobject-introspection's user-level interface would make this work a lot more. However, not everything will work:
I've just found another way this bites us: Using (make <GtkButtonBox>)
and putting a (button:new-with-mnemonic)
into it . Because the former is done by libgobject
directly (using g_object_new
), but the latter is done by gtk
directly, which then uses (another) libgobject
to do it. Hence the objects are not compatible because they are in another gobject registry, respectively.
In order to find things like that, when loading a gi typelib, I would suggest to check the transitive closure of shared objects we want to load. If the transitive closure contains a library basename that is already loaded, for now, log an error. Later on, whatever deduplication/translation we could do would go there, I guess.
Weird maybe, but definitely manageable. Especially in Guile, where you could use e.g. (@@ (gi Typelib-Version)
).
I think the thing above can make even libgtk confused--nevermind guile-gi.
This would be fixed by using libgobject
indirectly. (We can get the shlib through g_irepository_get_shared_library
, if it is NULL
simply assume everything is fine?)
After #90 we don't really need transitive closures for anything. If you mess up your require
/load
order, that's an error on your part, and if you use typelib->module
everything should be fine. And yes, Gtk+ gets confused if you load the wrong GObject, it won't even start then.
And yes, Gtk+ gets confused if you load the wrong GObject, it won't even start then.
It totally can (and should) happen that if a library A has an input Q, and library B has the same input Q, and you use A and B in your program, and then you (guix-) update just B to use Q' (a newer version of Q) and recompile your program, then you have both Q and Q' in your process address space! Q is an internal dependency of A, and by coincidence it's an internal dependency of B, too. Just because the internal dependency of B changed shouldn't change the internal depenency of A--that's what "internal" means. Otherwise the modules aren't really modular (because they then don't isolate from each other).
The question is whether gtk people ever considered this case. I think they didn't (otherwise there wouldn't be the type registry as a global variable in glib--they clearly think that the class names are unique), and that that won't work.
I think Guix should be made to check whether this is the case and error out.
Unfortunately, the profile does not contain the gir files, only the typelibs. That would mean that a profile hook would have to load the typelibs using gobject-introspection in order to find out. Sigh...
While there are many profile hooks for glib stuff, there is none for "gi-repository" typelibs yet.
It is not reasonable to expect the end user to care about those things (programmer, fine), so automated functionality to support this case is in order (probably in guix; maybe in profile hooks, maybe on the host side--who knows).
I wonder how much dlopen
can be made to do stuff that a regular linker does, though. Isn't most of the loading inside ld.so
at runtime anyway? I take it that the duplicate global variable checking isn't, right?
I think Guix should be made to check whether this is the case in a profile hook.
I don't think this can be meaningfully delegated to Guix. However, I do think Guix should provide meaningful typelib handling inside glib-or-gtk-build-system
.
It totally can (and should) happen that if a library A has an input B, and library Q has the same input B and you use Q and A in your program, and then you update just Q to use B' (a newer version of B) and recompile your program, then you have both B and B' in your process address space! This is a kind of internal dependency.
Again, not with GI. In GI, (A, V) = (A, V)
should hold over the duration of a program. Dependencies between typelibs are not internal by their very nature.
It is not reasonable to expect the end user to care about those things.
On the Guix side, the package maintainer would have to ensure, that the typelib paths are set up correctly, but there are different styles of doing that, some serving slightly different purposes. There are no clear answers to this either. My personal opinion is that everything in /bin
should get hardcoded =
typelib paths, but that of course only works for applications, which themselves do not need to expose typelib paths for others.
Do you know whether libdl
can be configured to check this? I mean the linker ld
does duplicate global checking for regular linking--can't it do that for dlopen
, too? Then the check would be automatic and also would work on non-guix.
Because libgobject
has global variables, and if ld.so
loads two different libgobject
s into the same executable at runtime, why doesn't the dlopen
fail because of duplicate globals?
I don't think "globals" still have that meaning when it comes to dynamic linkage. Remember our dlopen
experiments from earlier? Those wouldn't work if we were not allowed to overwrite dlopen
in the first place.
Ok, after more thinking and experimenting, I am convinced that this, for me, was me using guix environment wrong.
This https://gitlab.com/daym/guix-gui/-/commit/410e04f54da874cc89eea216ff0ae082bcf8c28b fixed everything for me!
There were many things conspiring to make this happen, among those:
Some packages in guix have a user-facing package and a package actually used in package definitions in dependencies. In guix environment
, it will pick the user-facing package even if the dependency graph already contains the other one. There will be NO warning emitted.
Not using --pure
and having an old cairo in the user profile is a very bad idea.
If I use a normal guix.scm
all this works JUST FINE!
I want to reiterate to please not unlaterally change guile-gi to dlsym all your glib functions. gobject-introspection doesn't do that either, so it won't help.
And it's unnecessary. When actually building or installing a package, then there (according to Leo Prikler on guix-devel) are never two different libgobjects in the dependency graph.
In my opinion what should be done, if possible, is to change gobject-introspection (probably upstream!) to make dlopen fail on duplicate globals. That would make the lives of people encountering this in the future much easier.
If you still experience problems, please tell us on guix-devel, though.
This https://gitlab.com/daym/guix-gui/-/commit/410e04f54da874cc89eea216ff0ae082bcf8c28b fixed everything for me!
- Some packages in guix have a user-facing package and a package actually used in package definitions in dependencies. In
guix environment
, it will pick the user-facing package even if the dependency graph already contains the other one. There will be NO warning emitted.
This is perhaps just a personal nitpick, but you should probably describe your development environment in scheme. Just create a guix-dev.scm
, which adds a specific version of guile etc. to the package's inputs.
- Not using
--pure
and having an old cairo in the user profile is a very bad idea.
I too am very confused whenever I see Gtk tests miraculously passing even if gtk+ is not an input to Guile-GI. Turns out the terminal has GI_TYPELIB_PATH
set.
I want to reiterate to please not unlaterally change guile-gi to dlsym all your glib functions. gobject-introspection doesn't do that either, so it won't help.
I get where you're coming from, but this is a discussion to be had regardless of Guix. Still, waiting until there is an actually conflicting version of GObject to load through typelibs gives us a lot of time to discuss it.
And it's unnecessary. When actually building or installing a package, then there (according to Leo Prikler on guix-devel) are never two different libgobjects in the dependency graph.
Note, that they're "assuming correct grafting". This is not the case when building Guile-GI itself in a Guix environment (GLib and the GLib used by GI have different inputs and get grafted differently and I have no idea why), hence why you need --no-grafts
to pass insanity.scm
. For the record, this does work on your end with guile-gi as input to your package, does it?
In my opinion what should be done, if possible, is to change gobject-introspection (probably upstream!) to make dlopen fail on duplicate globals. That would make the lives of people encountering this in the future much easier.
As Leo Prikler pointed out, consumers of GI can check whether a typelib tries to add duplicate entries into the type registry. I tried consolidating those entries, but that doesn't turn out too well.
This is perhaps just a personal nitpick, but you should probably describe your development environment in scheme. Just create a guix-dev.scm, which adds a specific version of guile etc. to the package's inputs.
Thanks! That makes sense.
I don't think a regular developer that just happens to use Guix would create even guix.scm
though.
guix environment
command line has some weird design decisions.
I too am very confused whenever I see Gtk tests miraculously passing even if gtk+ is not an input to Guile-GI. Turns out the terminal has GI_TYPELIB_PATH set.
Aha! That should not happen!
Filed bug 44414 in Guix about that.
For the record, this does work on your end with guile-gi as input to your package, does it?
It seems to work just fine. Note that I'm using ./run guix-gui.in
, so I DO NOT even do guix package -i guix-gui
or guix build guix-gui
.
As Leo Prikler pointed out, consumers of GI can check whether a typelib tries to add duplicate entries into the type registry. I tried consolidating those entries, but that doesn't turn out too well.
If two libgobjects are loaded, the type registry is also duplicated. If loading libgobjects is unsupported, something should prevent it. Since it can also happen as an indirect dependency of a dlopen of something else, the right one to prevent it is the loader in the OS that dlopen uses.
This is perhaps just a personal nitpick, but you should probably describe your development environment in scheme. Just create a guix-dev.scm, which adds a specific version of guile etc. to the package's inputs.
Yeah, seems to be much better.
I don't think a regular developer that just happens to use Guix would do that, though.
As an irregular developer, I discovered that option when I needed to use some rather unconventional Emacs packages for my development environments. ;)
guix environment
command line has some weird design decisions.
No comment.
Aha! That should not happen!
Filed bug in Guix about that.
I don't think there's much Guix can do about that, since gnome-terminal
is implemented using (I believe) python-gobject
. That's just why --pure
is even more important!
For the record, this does work on your end with guile-gi as input to your package, does it?
It seems to work just fine. Note that I'm using
./run guix-gui.in
, so I DO NOT even doguix package -i guix-gui
orguix build guix-gui
.
That seems to severely limit this issue then. Please do inform us once you at least try guix environment --ad-hoc guix-gui
.
As Leo Prikler pointed out, consumers of GI can check whether a typelib tries to add duplicate entries into the type registry. I tried consolidating those entries, but that doesn't turn out too well.
If two libgobjects are loaded, the type registry is also duplicated. If loading libgobjects is unsupported, something should prevent it. Since it can also happen as an indirect dependency of a dlopen of something else, the right one to prevent it is the loader that dlopen uses.
That's a cosmetic issue at best. If the expected GType according to your registry is different from what you get, you don't care what others out there say, do you?
That seems to severely limit this issue then. Please do inform us once you at least try guix environment --ad-hoc guix-gui.
Sure, but I don't have a package guix-gui
inside guix master repository. It will be a long time until I do.
The ./run
uses something like guix environment --pure -l guix.scm
, though.
That's a cosmetic issue at best. If the expected GType according to your registry is different from what you get, you don't care what others out there say, do you?
I have to think about it, but I think it's still a problem if two other libraries use a respective different glib. Then one registers in one and the other in the other and no matter which of the libgobjects your program uses to read a repo it won't be able to use both libraries it wanted to in the same process.
And to add to the previous topic:
guix could check and warn on same-name different-package inputs in the same profile on the host side declaratively. I've suggested that to Guix--let's see.
That seems to severely limit this issue then. Please do inform us once you at least try guix environment --ad-hoc guix-gui.
Sure, but I don't have a package
guix-gui
inside guix master repository. It will be a long time until I do.The
./run
uses something likeguix environment --pure -l guix.scm
, though.
You can also do --pure --ad-hoc -l guix.scm
, for the record. That should build guix-gui as Guix package and add it to your environment. Not sure if you have a proper launcher yet, but it's not a problem on the Guix side.
That's a cosmetic issue at best. If the expected GType according to your registry is different from what you get, you don't care what others out there say, do you?
I have to think about it, but I think it's still a problem if two other libraries use a respective different glib. Then one registers in one and the other in the other and no matter which of the libgobjects your program uses to read a repo it won't be able to use both libraries it wanted to in the same process.
That is correct, you can't mix the two. But for the purpose of Guile-GI, that does not matter. We would "simply" ignore anything in our local registry and use the one provided through GI via the methods GI provides.
And to add to the previous topic:
guix could check and warn on same-name different-package inputs in the same profile on the host side declaratively. I've suggested that to Guix--let's see.
Guix can actually already do wilder things. During union-builds – including profile generation – it checks for duplicate files and resolves them somehow. You just need to crank up its verbosity if you want to get the warnings, that arise from that.
For the record, I tried testing using a custom Gir for tests and it passes most of the cases, but bind-property
seems to have some type confusion going on. Not quite sure if that comes from us or GI directly.
I think I just proved this to be our fault.
Well, this has been sitting here for a while. As someone who rarely uses Guix, I have a simplistic understanding, but, I gather that a step toward improvement could be something like this:
That would be a big step towards improvement, as far as I can see, yes. The open question (whether libgirepository allows us to load a GObject typelib other than the one it was linked against) does remain, but there's no way of answering that other than trying to implement such a thing.
Well, I did do some hacking in a branch called type_init_noodle2
It doesn't solve the problem but it does take a couple of small steps in that direction
require
. There could be a two-step initialization: those classes that don't depend on GLib/GObject versions could happen first, the rest could happen later.It does appear that to solve the problem, the right approach was as previously suggested: reimplementing GIRepository itself.
As a stylistic choice, I think we ought to avoid "_pub.h" and instead put private implementation details into a _private.h, as is the normal GNOME convention. I haven't had too much of a look at the internals, but structurally speaking, this might be a improvement. Yet, as far as linking is concerned $(GOBJECT_INTROSPECTION_LIBS) $(GLIB_LIBS) $(GOBJECT_LIBS)
means we're still at square 1.
OK. From here, split the libguile-gi.so into two separate shared libraries: a preprocessor of typelib, and a runtime.
The preprocessor will link to libgirepository
, parse typelibs, and output intermediate information for gig_type_define
, gig_function_define
, etc. This intermediate information is Scheme. I'll borrow the term IL
for it. Later, this whole process could be replaced a GIR XML parser.
The runtime will only depend on libguile
and libffi
at link time, but use GLib and GObject via dlopen
. It will load the IL, and use dlopen
and libffi
to construct all the necessary. This would mean
require
call.dlopen
and dlsym
to get the set of GLib/GObject functions needed for SCM to C argument conversionGType
numerical values are not safe, the IL would have the GType names instead.The number of GLib functions required for SCM/C arg conversion stands at 140 in my hacking branch, down from 179 in version 0.3.2.
nm --undefined-only .libs/libguile-gi.so | grep ' g_' | grep -v '_info_' | grep -v 'irepository'
This would require replacement of libgirepository functions
GigTypeMeta
's use of GICallableInfo
and GIEnumInfo
, and GigFunction
's use of GIFunctionInfo
Part of our design goals seems to be building against one version of GLib/GObject/GI, while allowing users to load any version. Today I am here to tell you, that this is extremely broken.
Why Guix?
Guix is just the messenger, but there is a stronger reason behind why GI appears rather broken in any Guix package, and that is grafts.
What are grafts?
Grafts are Guix' way of not rebuilding the world when an important security is rolled out. Basically, they allow you to build and link against old versions of a library while running the program against a new one. Traditional distros do that all the time and you don't even notice, but on Guix you actually have two versions of that library still lying around. The ungrafted one and the grafted one.
Why is this an issue?
Because it is possible to get those two mixed up, e.g. in
guix environment
. I am not sure, which use cases are affected, but that one surely is. To see the difference, runonce inside a guix environment with grafts, and once in one without them. If you want to use Guix environments to prototype your applications, that means you'll have to use
--no-grafts
to work around these types of issues for now.What to do from now on?
It is pretty clear to me, that the main culprit here is a different version of GLib being linked to Guile-GI than the one that should be loaded through Guile-GI. To fix that, we'll probably have to overhaul our entire bootstrapping procedure starting at GTypes. And we'll likely have to preload some version of GObject before defining them. Much fun.
Workaround
If you are working on Guile-GI code inside
guix environment
and do not wish to be haunted by this issue and how to perhaps resolve it, for the time being add--with-gir-hacks
to your invocation of./configure
. If you are experiencing similar issues in your own GI-based projects, consider patching your GIRs in a manner similar to what we do.