garrigue / lablgtk

LablGTK 2 and 3: an interface to the GIMP Tool Kit
https://garrigue.github.io/lablgtk

Gobject Model Crash (invalid access to freed memory) #174

Open bvaugon opened 7 months ago

bvaugon commented 7 months ago

Since at least OCaml 5.1.1, lablgtk3 performs invalid accesses to freed memory blocks.

Example:

let column_list = new GTree.column_list in
let column = column_list#add Gobject.Data.string in
let model = GTree.list_store column_list in
for i = 1 to 200 do
  let row = model#append () in
  model#set ~row ~column ("Value " ^ string_of_int i);
done;
match model#get_iter_first with
| None -> ()
| Some row ->
  let rec loop i =
    let str = model#get ~row ~column in
    Printf.printf "model[%d] = %S\n%!" i str;
    if model#iter_next row then loop (i + 1) in
  loop 0

Compile with:

ocamlopt -I +../lablgtk3 -I +unix unix.cmxa lablgtk3.cmxa bug.ml -o bug

When you run it, random data periodically appears in the output (at each minor GC run), for example:

[...]
model[147] = "Value 148"
model[148] = "Value 149"
model[149] = "\161$\021\251\197U\000\0000"
model[150] = "Value 151"
model[151] = "Value 152"
[...]

Explanation:

This behavior can be observed in detail by setting the environment variable OCAMLRUNPARAM=v=0xFFF and adding traces to the finalizer (ml_final_gboxed) and to the g_value_get_mlvariant function. The traces show that the GValue is actually freed before copy_string copies it.

The fundamental problem is that, even though the OCaml variable v is still in the lexical scope of Gobject.Data.of_value when Gobject.Data.of_value calls Gobject.Value.get_conv, the compiler seems to remove it from the GC roots (the stack in this case), so it is freed too early.
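
The effect described above can be probed without GTK. The following is a minimal sketch, not lablgtk3 code: the names finalised, make_block and probe are hypothetical, and a Bytes value with a Gc.finalise finaliser stands in for the boxed GValue. Whether the finaliser fires early depends on the compiler and backend (it is more likely with ocamlopt than with ocamlc), so no particular output is promised.

```ocaml
(* Hypothetical stand-in for a boxed GValue: a heap block with a finaliser. *)
let finalised = ref false

let make_block () =
  let v = Bytes.make 16 'x' in
  Gc.finalise (fun _ -> finalised := true) v;
  v

(* [v] is in lexical scope until the end of [probe], but its last use is
   [Bytes.length]. Nothing refers to it afterwards, so the GC is free to
   reclaim it during [Gc.full_major]. *)
let probe () =
  let v = make_block () in
  let len = Bytes.length v in
  Gc.full_major ();
  (len, !finalised)

let () =
  let len, early = probe () in
  Printf.printf "length = %d, finalised while in scope = %b\n" len early
```

If the second field is reported as true, the block was finalised while its variable was still lexically in scope, which is the situation a C primitive runs into when it keeps reading an argument it has not registered as a root.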

I don't know whether this behavior of the OCaml compiler, i.e. allowing blocks that are still lexically in scope but no longer accessed to be freed, is an intended and documented property of OCaml. If the OCaml team does not consider it a bug, I suggest modifying lablgtk3 to register custom blocks as GC roots (using a CAMLlocalX macro) in all lablgtk3 primitives such as ml_g_value_get_mlvariant.
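
Rooting the argument inside the C primitives is the fix suggested here. As a complementary illustration only, the same lifetime guarantee can be sketched on the OCaml side by using the value once more after the call. This is an assumption-laden sketch, not a patch to lablgtk3: with_keepalive, make_block and read_then_collect are hypothetical names, and a Bytes value with a finaliser again stands in for the boxed GValue.

```ocaml
(* Hypothetical stand-in for a boxed GValue: a heap block with a finaliser. *)
let finalised = ref false

let make_block () =
  finalised := false;
  let v = Bytes.make 16 'x' in
  Gc.finalise (fun _ -> finalised := true) v;
  v

(* Hypothetical helper: run [f v], then use [v] once more so that it stays
   live (and therefore a GC root) for the whole duration of the call. *)
let with_keepalive v f =
  let result = f v in
  ignore (Sys.opaque_identity v);
  result

(* Stand-in for a primitive that reads its argument and then triggers a GC
   before it is done with that argument. *)
let read_then_collect b =
  let len = Bytes.length b in
  Gc.full_major ();
  (len, !finalised)

let () =
  let len, early = with_keepalive (make_block ()) read_then_collect in
  Printf.printf "length = %d, finalised during the call = %b\n" len early
```

Because the caller still uses the value after the call returns, the block should not be finalised while read_then_collect runs. Registering the argument as a root in the C stub has the advantage of protecting every caller at once, whereas this pattern has to be repeated at each call site.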

Best regards, Benoît.

bvaugon commented 7 months ago

In fact, this bug seems to be quite old and does not require OCaml 5.1.1. It is just harder to observe with older versions of OCaml, since reading freed blocks was less often dangerous and small freed blocks kept their contents longer.

To observe this bug with OCaml 4.14.1, for example, use larger strings. Simply replace the call to model#set with: model#set ~row ~column ("Value " ^ String.make 1_000_000 'x' ^ string_of_int i); and replace the print with: assert (str = ("Value " ^ String.make 1_000_000 'x' ^ string_of_int (i + 1))); and the program segfaults (or raises an assertion failure) just after the first GC run.

Moreover, traces in the finalizer (ml_final_gboxed) and in g_value_get_mlvariant show that the GValue is finalized before its copy completes (that is, after its length has been read but before its contents are copied), just as with OCaml 5.1.1.

A similar problem can be observed with OCaml 4.08.1. I didn't go further back in the version history.