lgi-devs / lgi

Dynamic Lua binding to GObject libraries using GObject-Introspection
MIT License

Segmentation fault with Vte.Terminal.spawn_async #262

Open sodomon2 opened 3 years ago

sodomon2 commented 3 years ago

Hello

I have written a terminal in Lua and wanted to add tabs, but I cannot, because Vte's spawn_async does not work with lgi. I did some tests and found that the call to spawn_async is what makes the program crash with a segmentation fault.

I just wanted to know if there is any way to solve it :)

Thanks for your work on LGI ❤️ Thank You!

psychon commented 3 years ago

Since you apparently already understand the vte API: Can you provide some self-contained example reproducing the crash? Otherwise, I'd have to write such an example myself and I never used vte before.

Edit: A quick look at vte's API docs says you could be referring to either https://developer.gnome.org/vte/unstable/vte-Vte-PTY.html#vte-pty-spawn-async or https://developer.gnome.org/vte/unstable/VteTerminal.html#vte-terminal-spawn-async. I'll definitely need more information.

sodomon2 commented 3 years ago

@psychon I am referring to this one: https://developer.gnome.org/vte/unstable/VteTerminal.html#vte-terminal-spawn-async

Here is the example:

#!/usr/bin/env lua
local lgi   = require("lgi")
local Gtk   = lgi.require('Gtk', '3.0')
local Gdk   = lgi.require('Gdk', '3.0')
local Vte   = lgi.require('Vte','2.91')
local GLib  = lgi.require('GLib', '2.0')

local app   = Gtk.Application()
local term  = Vte.Terminal()

main_window = Gtk.Window {
    width_request   = 600,
    height_request  = 400,
    Gtk.ScrolledWindow{ id = 'scroll' }
}

function app:on_activate()
    local font = term:get_font()
    font:set_size(font:get_size() * 1.1)
    term:spawn_async(
        Vte.PtyFlags.DEFAULT,                   -- pty_flags
        nil,                                    -- working_directory
        { '/bin/bash' },                        -- argv
        nil,                                    -- envv
        GLib.SpawnFlags.DEFAULT,                -- spawn_flags
        1,                                      -- child_setup
        nil,                                    -- child_setup_data
        nil,                                    -- child_setup_data_destroy
        1000,                                   -- timeout
        nil,                                    -- cancellable
        function() print('Hello World!') end    -- callback
    )
    main_window.child.scroll:add(term)
    main_window:show_all()
    self:add_window(main_window)
end

app:run()

psychon commented 3 years ago

Hm. According to gdb, the arguments are somehow not the expected ones. You specify a timeout of 1000, but the function is called with 1:

(gdb) frame 5
#5  0x00007ffff580ef71 in vte_terminal_spawn_async (terminal=<optimized out>, pty_flags=<optimized out>, 
    working_directory=<optimized out>, argv=<optimized out>, envv=<optimized out>, spawn_flags=<optimized out>, 
    child_setup=0x7ffff7fb4160, child_setup_data=0x7ffff7fb41d0, child_setup_data_destroy=0x5555555892a0, timeout=1, 
    cancellable=0x0, callback=0x7ffff7fb4240, user_data=0x7ffff7fb42b0) at ../src/vtegtk.cc:3513
3513    ../src/vtegtk.cc: No such file or directory.

I will need to figure out what the actual API of this function is in gobject-introspection and Lua. It might be that it has fewer arguments than the C function (e.g. child_setup and child_setup_data (and perhaps also child_setup_data_destroy) could be magically turned into one callback function).
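A rough way to picture that guess is the plain-Lua sketch below. The parameter names are taken from the gdb frame above; the closure_of and destroy_of fields and the lua_visible helper are made up for illustration and are not lgi's real marshalling code. Whether lgi actually folds the arguments this way is exactly what still needs checking.

```lua
-- Hypothetical model, NOT lgi's real marshalling code.
-- C parameters of vte_terminal_spawn_async after `terminal`, in the order
-- gdb prints them. `closure_of` / `destroy_of` mark parameters that a
-- binding could hide because they only carry data for another callback.
local c_params = {
  { name = "pty_flags" },
  { name = "working_directory" },
  { name = "argv" },
  { name = "envv" },
  { name = "spawn_flags" },
  { name = "child_setup" },
  { name = "child_setup_data",         closure_of = "child_setup" },
  { name = "child_setup_data_destroy", destroy_of = "child_setup" },
  { name = "timeout" },
  { name = "cancellable" },
  { name = "callback" },
  { name = "user_data",                closure_of = "callback" },
}

-- Return the names a Lua caller would have to supply if every closure and
-- destroy-notify parameter were folded into its callback.
local function lua_visible(params)
  local visible = {}
  for _, p in ipairs(params) do
    if not p.closure_of and not p.destroy_of then
      visible[#visible + 1] = p.name
    end
  end
  return visible
end

local visible = lua_visible(c_params)
for i, name in ipairs(visible) do
  print(i, name)
end
```

Under this assumption only nine of the twelve C parameters would be visible from Lua, so a call written against the full C argument list would put its later values into the wrong slots.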

Edit: This one has fewer "optimized out" values:

Thread 2.1 "lua" hit Breakpoint 2, vte_terminal_spawn_async (terminal=0x555555758720, pty_flags=VTE_PTY_DEFAULT, 
    working_directory=0x0, argv=0x555555938c00, envv=0x0, spawn_flags=G_SPAWN_DEFAULT, child_setup=0x7ffff7fb4160, 
    child_setup_data=0x7ffff7fb41d0, child_setup_data_destroy=0x5555555892a0, timeout=1, cancellable=0x0, 
    callback=0x7ffff7fb4240, user_data=0x7ffff7fb42b0) at ../src/vtegtk.cc:3513

Edit: From the same run as the above:

Thread 2.1 "lua" received signal SIGSEGV, Segmentation fault.
0x00007ffff7fb4270 in ?? ()

This is a random pointer somewhere after callback.

Edit: Another run:

Thread 1 "lua" hit Breakpoint 1, vte_terminal_spawn_async (terminal=0x555555759720, pty_flags=VTE_PTY_DEFAULT, 
    working_directory=0x0, argv=0x5555558cffe0, envv=0x0, spawn_flags=G_SPAWN_DEFAULT, child_setup=0x7ffff7fb4160, 
    child_setup_data=0x7ffff7fb41d0, child_setup_data_destroy=0x5555555892a0, timeout=1000, cancellable=0x0, 
    callback=0x7ffff7fb4240, user_data=0x7ffff7fb42b0) at ../src/vtegtk.cc:3513
(gdb) break *0x7ffff7fb4160
Breakpoint 2 at 0x7ffff7fb4160
(gdb) break *0x7ffff7fb4240
Breakpoint 3 at 0x7ffff7fb4240
(gdb) c
Continuing.
[Detaching after fork from child process 6833]
[New Thread 0x7ffff20bb700 (LWP 6834)]
[New Thread 0x7ffff1607700 (LWP 6835)]

Thread 1 "lua" hit Breakpoint 3, 0x00007ffff7fb4240 in ?? ()
(gdb) disassemble/r 0x00007ffff7fb4240,+20
Dump of assembler code from 0x7ffff7fb4240 to 0x7ffff7fb4254:
=> 0x00007ffff7fb4240:  00 00   add    %al,(%rax)
   0x00007ffff7fb4242:  00 00   add    %al,(%rax)
   0x00007ffff7fb4244:  00 00   add    %al,(%rax)
   0x00007ffff7fb4246:  00 00   add    %al,(%rax)
   0x00007ffff7fb4248:  00 00   add    %al,(%rax)
   0x00007ffff7fb424a:  00 00   add    %al,(%rax)
   0x00007ffff7fb424c:  00 00   add    %al,(%rax)
   0x00007ffff7fb424e:  00 00   add    %al,(%rax)
   0x00007ffff7fb4250:  00 00   add    %al,(%rax)
   0x00007ffff7fb4252:  00 00   add    %al,(%rax)
End of assembler dump.

This is executing all-zero memory? The crash then happens when it runs into something that it should not run into, I guess.

psychon commented 3 years ago

I now think that this is a bug in Vte. The callback argument of vte_terminal_spawn_async() has no annotations. According to https://wiki.gnome.org/Projects/GObjectIntrospection/Annotations, scopes are:

Scope types:

  • call (default) - Only valid for the duration of the call. Can be called multiple times during the call.
  • async - Only valid for the duration of the first callback invocation. Can only be called once.
  • notified - valid until the GDestroyNotify argument is called. Can be called multiple times before the GDestroyNotify is called.

Since no scope is given, the scope is call. Thus, Vte may only call this callback before vte_terminal_spawn_async returns. This is clearly incorrect. I guess it should be async instead...?
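To make the lifetime argument concrete, here is a toy model in plain Lua. All names in it (make_trampoline, native_call_returned, destroy_notified, run_async) are hypothetical and are not part of the GObject-Introspection or lgi API. It only mimics the three scope rules quoted above, and it raises a Lua error where a real binding would jump into memory that has already been released.

```lua
-- Toy model of callback scopes; hypothetical names, not GI/lgi API.
local function make_trampoline(fn, scope)
  local t = { alive = true, scope = scope }

  -- What the library does when it invokes the callback pointer.
  function t.invoke(...)
    if not t.alive then
      error("callback invoked after its " .. scope .. " scope ended")
    end
    if scope == "async" then
      t.alive = false  -- async: valid for the first invocation only
    end
    return fn(...)
  end

  -- What the binding does once the native function has returned.
  function t.native_call_returned()
    if scope == "call" then
      t.alive = false  -- call: valid only for the duration of the call
    end
  end

  -- What the binding does when the GDestroyNotify fires.
  function t.destroy_notified()
    if scope == "notified" then
      t.alive = false  -- notified: valid until destroy is signalled
    end
  end

  return t
end

-- Simulate an asynchronous function: it stores the callback and returns,
-- and the main loop invokes the callback only afterwards.
local function run_async(scope)
  local t = make_trampoline(function() return "spawned" end, scope)
  t.native_call_returned()
  return pcall(t.invoke)
end

print(run_async("call"))   -- fails: the trampoline was already released
print(run_async("async"))  -- succeeds: still valid for one invocation
```

In this model a callback with scope call cannot survive until the main loop gets around to invoking it, which is the situation an asynchronous function like vte_terminal_spawn_async() would create if its callback really is treated as scope call.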

Could you open a bug report at https://gitlab.gnome.org/GNOME/vte/-/issues? Perhaps the people there will conclude that my reasoning here is wrong. We will see.

Edit: However, there is still something wrong going on. I do not understand what is going on with the child_setup argument, but it also seems to point to all-zeros:

Thread 1 "lua" hit Breakpoint 1, vte_terminal_spawn_async (terminal=0x555555757720, pty_flags=VTE_PTY_DEFAULT, 
    working_directory=0x0, argv=0x5555558e4090, envv=0x0, spawn_flags=G_SPAWN_DEFAULT, child_setup=0x7ffff7fb4160, 
    child_setup_data=0x7ffff7fb41d0, child_setup_data_destroy=0x5555555892a0, timeout=1000, cancellable=0x0, 
    callback=0x7ffff7fb4240, user_data=0x7ffff7fb42b0) at ../src/vtegtk.cc:3513
3513    ../src/vtegtk.cc: No such file or directory.
(gdb) break *0x7ffff7fb4160
Breakpoint 2 at 0x7ffff7fb4160
(gdb) break *0x7ffff7fb4240
Breakpoint 3 at 0x7ffff7fb4240
(gdb) set follow-fork-mode child
(gdb) c
Continuing.
[Attaching after Thread 0x7ffff7c1c2c0 (LWP 7320) fork to child process 7333]
[New inferior 2 (process 7333)]
[Detaching after fork from parent process 7320]
[Inferior 1 (process 7320) detached]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Switching to Thread 0x7ffff7c1c2c0 (LWP 7333)]

Thread 2.1 "lua" hit Breakpoint 2, 0x00007ffff7fb4160 in ?? ()
(gdb) disassemble/r 0x00007ffff7fb4160,+20
Dump of assembler code from 0x7ffff7fb4160 to 0x7ffff7fb4174:
=> 0x00007ffff7fb4160:  00 00   add    %al,(%rax)
   0x00007ffff7fb4162:  00 00   add    %al,(%rax)
   0x00007ffff7fb4164:  00 00   add    %al,(%rax)
   0x00007ffff7fb4166:  00 00   add    %al,(%rax)
   0x00007ffff7fb4168:  00 00   add    %al,(%rax)
   0x00007ffff7fb416a:  00 00   add    %al,(%rax)
   0x00007ffff7fb416c:  00 00   add    %al,(%rax)
   0x00007ffff7fb416e:  00 00   add    %al,(%rax)
   0x00007ffff7fb4170:  00 00   add    %al,(%rax)
   0x00007ffff7fb4172:  00 00   add    %al,(%rax)
End of assembler dump.

sodomon2 commented 3 years ago

You mean the problem is with vte and not LGI?

sodomon2 commented 3 years ago

I don't quite understand

Do you mean that the whole spawn_async is being called wrong?

psychon commented 3 years ago

> You mean the problem is with vte and not LGI?

Well... I'd be more careful: I think there is at least one problem with vte. :-)

According to the annotation on its arguments, vte_terminal_spawn_async() may only call its callback argument before returning. However, it is obviously meant to be called some time later. Thus, a scope async annotation is missing (I am not sure if scope async is correct, but I guess so).

> I don't quite understand

Neither do I.

sodomon2 commented 3 years ago

> According to the annotation on its arguments, vte_terminal_spawn_async() may only call its callback argument before returning. However, it is obviously meant to be called some time later. Thus, a scope async annotation is missing (I am not sure if scope async is correct, but I guess so).

The error may be due to Vte, then, since the scope async annotation may be null.

sodomon2 commented 3 years ago

Hi @psychon

I have been investigating, and the error is not only with Vte but with all GTK async methods.

see #241

psychon commented 2 years ago

Fixed via #285. I assume.

sodomon2 commented 7 months ago

> Fixed via #285. I assume.

Not at all; the error still occurs today.