janet-lang / janet

A dynamic language and bytecode vm
https://janet-lang.org
MIT License
3.38k stars 217 forks source link

Janet erases stacktrace for an error while it is dying.. #1380

Closed amano-kenji closed 5 months ago

amano-kenji commented 5 months ago

When a janet script exits, (file/read stdin :line) returns nil. Thus,

(json/decode (file/read stdin :line))

causes

error: bad slot #0, expected string, symbol, keyword or buffer, got nil

without any stacktrace because janet is busy killing itself. It doesn't have stacktrace anymore.

It took hours to figure this out. What's the solution?

bakpakin commented 5 months ago

The description of the problem doesn't make sense to me, I would need some sample code to debug. It sounds like there is some multithreading involved.

amano-kenji commented 5 months ago

Yes, there are a lot of threads that stall the shutdown process.... Expect at least 7 threads.

file/read is called in a separate thread, too.

amano-kenji commented 5 months ago

I think file/read didn't have time to return nil if the shutdown was fast.

If there are many threads that stall the shutdown, then file/read has enough time to return nil which causes json/decode to throw an error because json/decode doesn't like nil.

Because janet is shutting down, it doesn't have stacktrace for json/decode.

It is basically a race condition.

bakpakin commented 5 months ago

If you apply the link mentioned here : https://github.com/janet-lang/janet/issues/1378#issuecomment-1918257564, does that fix the issue?

Whatever the issue is, I don't think the stacktrace is being "erased" as is suspected.

amano-kenji commented 5 months ago

I suspect janet shut down before stacktrace was printed?

Should I mechanically replace all invocations of exit with _exit and compile janet?

amano-kenji commented 5 months ago

Is it possible for file/read to return nil during shut down?

bakpakin commented 5 months ago

I suspect janet shut down before stacktrace was printed?

Should I mechanically replace all invocations of exit with _exit and compile janet?

Yeah, that is what I'm suggesting. After _exit called, nothing should be returning, as that is the root of all of this strange cleanup behavior

amano-kenji commented 5 months ago

I was able to reproduce this issue partially.

test.janet

(import spork/json)

(ev/spawn-thread
  (with [proc (os/spawn ["sleep" "100000000000"] :p {:out :pipe :err :pipe})]
    (os/proc-wait proc)))

(ev/spawn-thread
  (forever
    (json/decode (file/read stdin :line))))

(forever
  (ev/sleep 1))

Execute

$ sleep 1000 | janet test.janet

If you kill sleep, you will get an error message like

$ sleep 10000 | janet test.janet
error: bad slot #0, expected string, symbol, keyword or buffer, got nil
  in json/decode
  in _spawn-thread [test.janet] on line 9, column 5

So, I should check whether (file/read stdin :line) returns nil anyway.

My suspicion is that if you kill sleep and then test.janet very fast in sequence, you can kill the script before stacktrace is printed. That's probably why I could sometimes see the stacktrace and could not most of the time.

What do you think?

amano-kenji commented 5 months ago

I think I'm just closing this issue.... for now...