kandu / ocaml-fswatch

4 stars 3 forks source link

ocaml-fswatch is not working with fswatch 1.17.1 #1

Closed jjh84 closed 2 years ago

jjh84 commented 2 years ago

Hello.

Recently, fswatch has been updated to v1.17.1 and it seems that it break ocaml-fswatch's c stub. When I compile and run the following OCaml code that you have provided via README.md, I received segmentation fault error.

open Fswatch

let callback events =
  Printf.printf "events:\n";
  Array.iter (fun event ->
    let time = Unix.localtime event.Event.time in
    Printf.printf "  %s %d:%d:%d\n"
      event.Event.path
      time.tm_hour time.tm_min time.tm_sec)
    events;
  flush stdout

let () =
  match init_library () with
  | Status.FSW_OK ->
    let handle = init_session Monitor.System_default callback in
    add_path handle "/tmp/";
    start_monitor handle
  | err -> Printf.eprintf "%s\n" (Status.t_to_string err)

The executiion result.

$ dune exec fs
Entering directory '/Users/user/fs2/fs'
[1]    20221 segmentation fault  dune exec fs

The lldb result

lldb ./main.exe
(lldb) target create "./main.exe"
Current executable set to '/Users/user/fs2/fs/_build/default/bin/main.exe' (x86_64).
(lldb) r
Process 20361 launched: '/Users/user/fs2/fs/_build/default/bin/main.exe' (x86_64)
main.exe was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 20361 stopped
* thread #2, queue = 'fswatch_event_queue', stop reason = EXC_BAD_ACCESS (code=1, address=0x18)
    frame #0: 0x0000000100048c99 main.exe`caml_thread_leave_blocking_section [inlined] caml_thread_restore_runtime_state at st_stubs.c:203:43 [opt]
Target 0: (main.exe) stopped.
(lldb) bt
warning: could not find Objective-C class data in the process. This may reduce the quality of type information available.
* thread #2, queue = 'fswatch_event_queue', stop reason = EXC_BAD_ACCESS (code=1, address=0x18)
  * frame #0: 0x0000000100048c99 main.exe`caml_thread_leave_blocking_section [inlined] caml_thread_restore_runtime_state at st_stubs.c:203:43 [opt]
    frame #1: 0x0000000100048c99 main.exe`caml_thread_leave_blocking_section at st_stubs.c:249 [opt]
    frame #2: 0x0000000100054493 main.exe`caml_leave_blocking_section at signals.c:175:3 [opt]
    frame #3: 0x0000000100047cba main.exe`cevent_callback(cevents=0x0000600000208000, cevent_num=1, cdata=0x0000600003000000) at stub.c:42:5 [opt]
    frame #4: 0x00000001003505ef libfswatch.13.dylib`libfsw_cpp_callback_proxy(std::__1::vector<fsw::event, std::__1::allocator<fsw::event> > const&, void*) + 429
    frame #5: 0x000000010035caca libfswatch.13.dylib`fsw::monitor::notify_events(std::__1::vector<fsw::event, std::__1::allocator<fsw::event> > const&) const + 1618
    frame #6: 0x0000000100366b60 libfswatch.13.dylib`fsw::fsevents_monitor::fsevents_callback(__FSEventStream const*, void*, unsigned long, void*, unsigned int const*, unsigned long long const*) + 752
    frame #7: 0x00007ff8154b6221 FSEvents`implementation_callback_rpc + 3700
    frame #8: 0x00007ff8154b532e FSEvents`_Xcallback_rpc + 233
    frame #9: 0x00007ff8154b5227 FSEvents`FSEventsD2F_server + 55
    frame #10: 0x00007ff8154b8d7b FSEvents`receive_and_dispatch_rcv_msg + 321
    frame #11: 0x00007ff80dbdd317 libdispatch.dylib`_dispatch_client_callout + 8
    frame #12: 0x00007ff80dbdfd7c libdispatch.dylib`_dispatch_continuation_pop + 453
    frame #13: 0x00007ff80dbf1208 libdispatch.dylib`_dispatch_source_invoke + 2179
    frame #14: 0x00007ff80dbe31cd libdispatch.dylib`_dispatch_lane_serial_drain + 342
    frame #15: 0x00007ff80dbe3dfd libdispatch.dylib`_dispatch_lane_invoke + 366
    frame #16: 0x00007ff80dbedeee libdispatch.dylib`_dispatch_workloop_worker_thread + 753
    frame #17: 0x00007ff80dd94fd0 libsystem_pthread.dylib`_pthread_wqthread + 326
    frame #18: 0x00007ff80dd93f57 libsystem_pthread.dylib`start_wqthread + 15
(lldb)

Test environment

kandu commented 2 years ago

Confirmed.

I just take a look at the git commits of fswatch. From version 1.11 to 1.16, the major api version of it is still -- 11.

But 1.17 changed the major api verstion to 12 and 1.17.1 changed it to 13. So they broke api backward compatibility twice in a very short period.

In general, that means we should release two new ocaml-fswatch version to match the major api versions respectivelly. But api version 12 looks like an abandoned mistake. To release ocaml-fswatch 13-0.1.0 only is a better idea.

kandu commented 2 years ago

But 1.17 changed the major api verstion to 12 and 1.17.1 changed it to 13. So they broke api back compatibility twice in a very short period.

I checked the source code of fswatch again and found that all the changes occurred in its c++ core. The exported c interface is not changed, so api compatibility wasn't broken. In theory, the 11-0.1.0 bindings are compatible with the new versions of fswatch.

I installed fswatch 1.17.1 and build the 11-0.1.0 bindings with it, and on my linux system, it still works and there is only one thread running.

I noticed that this segmentation fault happened in thread #2 on your setup. ocaml runtime doesn't work well with external c stub thread unless all the extra threads are registered to its runtime system.

It seems that there is thread creation, cevent_callback invocation in your setup which may violate the ocaml runtime rules according to the backtrace.

kandu commented 2 years ago

would you mind taking a try.

change the function

void cevent_callback(fsw_cevent const * const cevents, const unsigned int cevent_num, void *cdata)

in stub.c to

void cevent_callback(fsw_cevent const * const cevents, const unsigned int cevent_num, void *cdata) {
    caml_c_thread_register();
    CAMLparam0();
    CAMLlocal2(session, events);
    caml_acquire_runtime_system();
    session= caml_copy_nativeint((intnat)cdata);
    events= caml_alloc(cevent_num, 0);
    for (int i = 0; i < cevent_num; ++i) {
        Store_field(events, i, of_fsw_cevent(cevents + i));
    }
    caml_callback2(*caml_named_value("callback"), session, events);
    caml_release_runtime_system();
    caml_c_thread_unregister();
    CAMLreturn0;
}

at the root path of the project, press make install this will install ocaml-fswatch to your current local opam switch. Then build and run the sample program again, see if it'll work or if the backtrace changes?

jjh84 commented 2 years ago

Thank you. After changing the code, the problem was resolved. It works as expected.

$ fs
events:
  /private/tmp/e 8:4:32

You mentioned that Linux has only 1 thread but 2 on macOS. Is it okay If I apply this patch on Linux OS?

kandu commented 2 years ago

Within ocaml runtime system, it's safe to register a known thread(main thread, registered thread, threads created by Thread.create) and unregister an unknown thread(external thread, already unregistered thred). So it's okay to apply this patch to any system.

jjh84 commented 2 years ago

Do you have a plan to release the patch via opam?

kandu commented 2 years ago

I'll make a PR before tomorrow. So, wait for a while and the new release comes.

kandu commented 2 years ago

fixed in 1898d9c

kandu commented 2 years ago

opam repo progress https://github.com/ocaml/opam-repository/pull/21796