ocaml-multicore / eio

Effects-based direct-style IO for multicore OCaml
Other
556 stars 68 forks source link

Issue with `Logs`, Eio, and multicore #370

Closed d4hines closed 1 year ago

d4hines commented 1 year ago

Hi! When enabling Debug level logging in an application that uses parallelism and Eio, the program immediately crashes with this (somewhat garbled) output:

Saving digest db...heisenbug: [DEBUG] starting run
heisenbug: [DEBUG] starting run
heisenbug: [DEBUG] starting run
heisenbug: [DEBUGheisenbug: [DEBUG] ] starting main thread
starting main thread
heisenbug: [DEBUG] heisenbug: [DEBUG] Monitoring eventfd starting main thread
8
heisenbug: [DEBUG] Monitoring eventfd 10
heisenbug: [DEBUG] Monitoring eventfd 9
DEBUG heisenbug: [DEBUG inf:
                                    SQEs ready: 0
                                    Operations active: 1
                                    Sketch buffer: 0/64 (plus 0 old buffers)
heisenbug: [DEBUGm] mstarting runInvalid_argument("exit: 1 request(s) still active!")) while handling another error
DEBUGtarting runheisenbug: [heisenbug: [WARNING] Uring.exit failed (WARNING] Uring.exit failed (Invalid_argument("exit: 1 request(s) still active!")) while handling another error
heisenbug: [DEBUG] starting run
heisenbug: [DEBUG] starting run
heisenbug: [DEBUG] starting main thread
heisenbug: [DEBUG] Monitoring eventfd 13
heisenbug: [DEBUG] starting runheisenbug: [DEBUG] starting main thread

heisenbug: [DEBUG] Monitoring eventfd 14
heisenbug: [DEBUG] scheduler out of jobs, next timeout inf:
                     SQEs ready: 0
                     Operations active: 1
                     0/64 (plus heisenbug0 old buffers)

heisenbug: [heisenbug: [DEBUG] : [DEBUG] starting main thread

                     heisenbugDEBUG     nvalid_argument("exit: 1 request(s) still active!")) while handling another error
heisenbug: [DEBUG] Monitoring eventfd 16
heisenbug: [DEBUG] scheduler out of jobs, next timeout inf:
                     SQEs ready: 0
                     Operations active: 1
                     Sketch buffer: 0/64 (plus 0 old buffers)
heisenbug: [DEBUG] starting main thread
heisenbug: [DEBUG] starting run
: [DEBUG] Monitoring eventfd heisenbug: [32] 18starting run
heisenbug: [32m Monitoring eventfd heisenbug32mDEBUG] 

heisenbug: [DEBUG] scheduler out of jobs, next timeout inf:
                     heisenbug: [DEBUG] starting main thread
SQEs ready: 0
Operations active: 1
Sketch buffer: 0/64 (plus 0 old buffers)
heisenbug: [DEBUG] heisenbug: [DEBUG] starting main thread
Monitoring eventfd 22
heisenbug: [DEBUG] Monitoring eventfd 23
heisenbug: [DEBUG] heisenbug: [scheduler out of jobs, next timeout inf:
                                 DEBUG] 0
                                        1
                                        Sketch buffer: scheduler out of jobs, next timeout inf:
                                                          SQEs ready: 0
                                                          Operations active: 1
                                                          Sketch buffer: 0/64 (plus 0 old buffers)
SQEs ready: 0Operations active: 1heisenbug: [DEBUG] heisenbug: [WARNING] Uring.exit failed (starting runInvalid_argument("exit: 1 request(s) still active!")) while handling another error

heisenbug: [DEBUG] starting main thread
heisenbug: [DEBUG] Monitoring eventfd 25
heisenbug: [DEBUG] scheduler out of jobs, next timeout inf:
                     SQEs ready: 0
                     Operations active: heisenbug1
                     Sketch buffer: 0/64 (plus 0 old buffers)
32DEBUG] starting run
heisenbug: [DEBUG] starting run
heisenbug: [DEBUG] starting main thread
heisenbug: [DEBUG] Monitoring eventfd 29
heisenbug: [DEBUG] starting main thread
mDEBUG
heisenbugheisenbug: [DEBUGnf:
SQEs ready: 0
Operations active: 1
Sketch buffer: 0/64 (plus 0 old buffers)
heisenbug: [DEBUG] starting main thread

heisenbug: [DEBUG] Monitoring eventfd 28
heisenbug: [DEBUG] scheduler out of jobs, next timeout inf:
                     SQEs ready: 0
                     Operations active: 1
                     Sketch buffer: 0/64 (plus 0 old buffers)
heisenbug: [DEBUG] starting main thread
DEBUGheisenbug
heisenbug: [DEBUG] starting main thread
heisenbug: [DEBUG] heisenbug: [
eisenbug: [DEBUG] DEBUG] scheduler out of jobs, next timeout inf:
                              0
Operations active: 1
Sketch buffer: 0/64 (plus 0 old buffers)
] scheduler out of jobs, next timeout inf:scheduler out of jobs, next timeout SQEs ready: 0 (plus 0 old buffers)heisenbug: [WARNING] Uring.exit failed (Invalid_argument("exit: 1 request(s) still active!")) while handling another error
 starting run
heisenbug: [DEBUG0;32m] starting runFatal error: exception Stdlib.Queue.Empty
Raised at Stdlib__Queue.take in file "queue.ml", line 73, characters 11-22
Called from Stdlib__Format.advance_left in file "format.ml", line 436, characters 6-31
Called from Stdlib__Format.pp_flush_queue in file "format.ml", line 613, characters 2-20
Called from Stdlib__Format.pp_print_newline in file "format.ml" (inlined), line 671, characters 2-27
Called from Stdlib__Format.output_formatting_lit in file "format.ml", line 1340, characters 33-56
Called from Stdlib__Format.kfprintf.(fun) in file "format.ml", line 1421, characters 16-34
Called from Eio_linux.schedule in file "lib_eio_linux/eio_linux.ml", line 563, characters 8-238
Called from Eio_linux.run.(fun) in file "lib_eio_linux/eio_linux.ml", line 1512, characters 4-629
Called from Eio_linux.with_uring in file "lib_eio_linux/eio_linux.ml", line 1347, characters 10-18
Re-raised at Eio_linux.with_uring in file "lib_eio_linux/eio_linux.ml", line 1355, characters 6-41
Called from Dune__exe__Heisenbug in file "deku-p/src/core/bin/node/heisenbug.ml", line 1, characters 9-21

Calling Logs_threaded.enable () prevents this problem.

Perhaps it's expected that users will call Logs_threaded.enable () if they need it. Would calling Logs_threaded.enable when you use the Eio domain manager be a good default?

talex5 commented 1 year ago

Possibly Logs could use Domain.before_first_spawn to set this up itself (since it will affect all programs using Logs and domains, not just Eio).

I'm thinking of removing the Logs dependency from Eio (it's fine, but we're not using it much and fewer dependencies is better).

patricoferris commented 1 year ago

Just bumped into this :/ -- was it fixed in https://github.com/ocaml-multicore/eio/pull/403 ?

talex5 commented 1 year ago

No, but since Eio no longer produces much log output, it's less likely people will blame it on us. I think this should be fixed in Logs.