mirage / mirage-xen

Xen core platform libraries for MirageOS
ISC License

Retain exception info (continued from old repo) #5

Open cfcs opened 6 years ago

cfcs commented 6 years ago

Background here: https://github.com/mirage/mirage-platform/issues/193

The source code in question: https://github.com/mirage/mirage-xen/blob/master/lib/main.ml#L58-L60 (notice the err function at https://github.com/mirage/mirage-xen/blob/master/lib/main.ml#L50-L52 )

So, I tried again with lwt 3.3.0 on OCaml 4.04.2+fPIC:

2018-03-16 21:32:18 -00:00 INF [net-xen:frontend] connect 0
2018-03-16 21:32:18 -00:00: ERR [application] main: Xs_protocol.Enoent("read")
Raised at file "src/core/lwt.ml", line 3008, characters 20-29
Called from a file "lib/main.ml", line 60, characters 15-25

It appears to me that Lwt.poll somehow discards the original backtrace.

ping @hannesm @aantron

aantron commented 6 years ago

@gabelevi, could you comment on this?

@cfcs this was recently worked on by @gabelevi in https://github.com/ocsigen/lwt/pull/556. However, I believe the reraise will "work," in the sense of preserving nice backtraces, only in certain contexts (during execution of a handler?), and @gabelevi might be more immediately prepared to say what those are. Otherwise, I will look into it more deeply.

cfcs commented 6 years ago

@aantron Thanks for taking the time to look into this! Let me know if I can do something to help with debugging.

aantron commented 6 years ago

@cfcs For good stack traces, the program needs at least to:

The truncated stack trace here, ending in Lwt.poll, makes me suspect that the promise was rejected with Lwt.fail, and there is no earlier stack trace to report.

There may be additional conditions, but these are the ones I found so far.
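
To illustrate the Lwt.fail point, here is a minimal sketch using only the OCaml standard library. It is a toy model, not Lwt's implementation: the `outcome` type and the `reject`, `run`, and `describe` functions are hypothetical names made up for this example, and `Enoent` is a local stand-in for `Xs_protocol.Enoent`. The idea it tries to show is that the runtime records a backtrace only when an exception is actually raised, so an exception that is merely stored as a value has no earlier trace to report.

```ocaml
(* Toy model only: this is NOT how Lwt is implemented. *)

(* Hypothetical stand-in for Xs_protocol.Enoent. *)
exception Enoent of string

(* A settled "promise": a value, or an exception together with whatever
   backtrace was available when the rejection was created. *)
type 'a outcome =
  | Resolved of 'a
  | Rejected of exn * Printexc.raw_backtrace option

(* Similar in spirit to Lwt.fail: the exception is stored as plain data.
   Nothing is raised, so the runtime has no backtrace to record. *)
let reject exn = Rejected (exn, None)

(* Similar in spirit to raising inside a callback: the exception is really
   raised, and the backtrace is captured first thing in the handler. *)
let run f =
  try Resolved (f ())
  with exn ->
    let bt = Printexc.get_raw_backtrace () in
    Rejected (exn, Some bt)

let describe = function
  | Resolved _ -> "resolved"
  | Rejected (exn, None) ->
    Printexc.to_string exn ^ " (no backtrace recorded)"
  | Rejected (exn, Some bt) ->
    Printexc.to_string exn ^ "\n" ^ Printexc.raw_backtrace_to_string bt

let () =
  (* Same effect as running with OCAMLRUNPARAM=b. *)
  Printexc.record_backtrace true;
  print_endline (describe (reject (Enoent "read")));
  print_endline (describe (run (fun () -> raise (Enoent "read"))))
```

Whether the second case prints any useful frames also depends on the program being built with debug information (`-g`), so the exact output will vary between builds.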

cfcs commented 6 years ago

Ah, thank you! The Lwt.fail lead seems promising!

I think the original crash in this case originates from xenstore (based on the error message immediately prior to the crash): https://github.com/mirage/ocaml-xenstore/blob/master/core/xs_protocol.ml#L671. My current hypothesis is that this function, https://github.com/mirage/ocaml-xenstore/blob/master/client_lwt/xs_client_lwt.ml#L62-L70, could be what causes the backtrace to be forgotten.