Open jamshark70 opened 4 years ago
Update on the user impact: I'm rewriting the cll parser objects as true classes instead of Protos, which will reduce the stack size a lot (because a method call in a Proto goes through `doesNotUnderstand`, `use`, `protect` and the function -- rewriting as classes collapses all that into one stack frame). Confirmed that this does get the right behavior for that syntax error (error is posted immediately, not even the delay I had noted). That makes the issue less critical (but any sort of fatal hang should be taken seriously -- and I've only pushed the nasty territory back, unknown where the new boundary is).
@jamshark70 this is interesting (and also above my head). One thing that might be useful for testing - I think you can specify a commit SHA in quarks install: Quarks.install("ddwChucklib-livecode", "shaOfTheCommitOnTheDesiredBranch")
Environment
Steps to reproduce
For a stack overflow, it's hard to reduce the code below a certain point, but...
`Quarks.install("ddwChucklib-livecode")` (actually, you need to check out the `topic/dropSourceArg` branch by hand), recompile, and then:

And be ready to hit Language -> Quit interpreter immediately. On my system, I have about half a second to stop it before it goes into swap space and the system becomes completely unresponsive to mouse and keyboard. (I've had to hard-shutdown and reboot about five times this afternoon while troubleshooting this.)
Expected vs. actual behavior
In chucklib-livecode, `\xrand2..5("34567")` is a syntax error.

Expected behavior: The cll parser throws an Error and the error string gets posted. https://github.com/jamshark70/ddwChucklib-livecode/blob/topic/dropSourceArg/parsenodes.scd#L597
Actual behavior: `throw` causes the interpreter to go into an infinite loop and chew up swap space until the machine is dead.

User impact, incidentally, is that now I know certain syntax errors in my live-coding system may actually take down the entire machine, which would be catastrophic on stage. I can avoid this hang by not committing that syntax error -- but the system needs to be robust enough to handle errors without killing an entire show.
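For reference, the expected behavior described above can be sketched in plain sclang. This is only an illustration, not the cll parser's code: `parse` is a hypothetical stand-in for a parser step that rejects its input, and the error message is made up.

```supercollider
(
// Hypothetical stand-in for a parser step that rejects bad input
// (not the actual cll parser).
var parse = { |source|
    Error("syntax error in: " ++ source).throw;
};

try {
    parse.value("\\xrand2..5(\"34567\")");
} { |err|
    // The thrown Error arrives here; errorString is the text
    // that would normally be posted for it.
    err.errorString.postln;
};
)
```

At a shallow stack depth like this, the `throw` should unwind to the `try` handler right away and the error string gets posted.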
The problem does depend on stack depth. If I test by invoking the parser object directly (which bypasses several stack levels), then the `throw` does back all the way out. It takes longer than it should (about half a second), which is a sign of some sort of memory management problem, but it doesn't hang.

Also, if I put the same syntax error in a simpler statement, no hang.
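One way to probe the stack-depth dependence without the cll code might be to throw from the bottom of a plain recursion and time how long the `throw` takes to unwind. This is a rough sketch under assumptions: `throwAtDepth` and `measure` are hypothetical helpers, the depths are arbitrary, and I have not confirmed that it reproduces the hang. Keep the depths small at first and have `killall sclang` ready in another terminal.

```supercollider
(
var throwAtDepth, measure;

// Hypothetical helper: recurse `depth` levels, then throw an Error.
// The recursive call's result is stored first so the call is not in tail position.
throwAtDepth = { |depth|
    var result;
    if(depth > 0) {
        result = throwAtDepth.value(depth - 1);
    } {
        Error("thrown at the bottom of the recursion").throw;
    };
    result
};

// Hypothetical helper: time how long the throw takes to unwind to the try handler.
measure = { |depth|
    var start = Main.elapsedTime;
    try { throwAtDepth.value(depth) } { |err| nil };
    Main.elapsedTime - start
};

// Arbitrary, deliberately small depths; increase with care.
[10, 50, 100].do { |depth|
    "depth %: % seconds".format(depth, measure.value(depth)).postln;
};
)
```

If the unwind time grows much faster than the depth, that would be consistent with the cost being in the backtrace capture rather than in the `throw` itself.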
But, I tried changing PyrKernel.h `#define EVALSTACKDEPTH 4096` (was 512) -- it didn't make any difference.

I just tried it in gdb and reproduced the non-response. For safety, I was ready with a `killall sclang` in another terminal. That gave me a stack trace leading back to our old friend `DebugFrameConstructor`. I've seen hanging or crashing problems before in this area. (If I comment out `protectedBacktrace = this.getBackTrace.caller;` in Error.sc, then no hang -- so it's definitely `getBackTrace`.)

I'm willing to run whatever other checks are needed using gdb, but I would need some guidance about where to set breakpoints or such.