supercollider / supercollider

An audio server, programming language, and IDE for sound synthesis and algorithmic composition.
http://supercollider.github.io
GNU General Public License v3.0

[lang] throw Error with deep stack --> infloop --> swap --> reboot #4754

Open jamshark70 opened 4 years ago

jamshark70 commented 4 years ago

Environment

Steps to reproduce

For a stack overflow, it's hard to reduce the code below a certain point, but... Quarks.install("ddwChucklib-livecode") (actually, you need to check out the topic/dropSourceArg branch by hand), recompile, and then:

(
BP(\y).free;
PR(\abstractLiveCode).chuck(BP(\y), nil, (
    event: (eventKey: \default),
    defaultParm: \degree,
    parmMap: (degree: (isPitch: true))
));
)

/y = "\ins("*", 8, 0.5)::\xrand("\seq("123")\seq("7854")\seq("26,\xrand2..5("34567")::\xpose("1'")::\artic(".")")")";

And be ready to hit Language -> Quit interpreter immediately. On my system, I have about half a second to stop it before it goes into swap space and the system becomes completely unresponsive to mouse and keyboard. (I've had to hard-shutdown and reboot about five times this afternoon while troubleshooting this.)

Expected vs. actual behavior

In chucklib-livecode, \xrand2..5("34567") is a syntax error.

Expected behavior: The cll parser throws an Error and the error string gets posted. https://github.com/jamshark70/ddwChucklib-livecode/blob/topic/dropSourceArg/parsenodes.scd#L597

Actual behavior: throw causes the interpreter to go into an infinite loop and chew up swap space until the machine is dead.

User impact, incidentally, is that now I know certain syntax errors in my live-coding system may actually take down the entire machine, which would be catastrophic on stage. I can avoid this hang by not committing that syntax error -- but the system needs to be robust enough to handle errors without killing an entire show.

The problem does depend on stack depth. If I test by invoking the parser object directly (which bypasses several stack levels), then the throw does unwind all the way out. It takes longer than it should (about half a second), which suggests some sort of memory management problem, but it doesn't hang.

Also, if I put the same syntax error in a simpler statement, no hang.

I also tried raising EVALSTACKDEPTH in PyrKernel.h from 512 to 4096 (#define EVALSTACKDEPTH 4096), but it didn't make any difference.
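
For anyone who wants to probe the stack-depth dependence without installing the quark, a synthetic harness along these lines might be a starting point. This is only a sketch: ~depth and ~recurse are made-up names, and it is not confirmed to reproduce the hang, since plain recursion may not build the same kind of stack that the Proto-based parser does.

```supercollider
(
// Hypothetical harness: ~depth and ~recurse are illustrative names only.
// Start small and raise ~depth gradually; keep "killall sclang" ready in a terminal.
~depth = 100;

~recurse = { |n|
    if(n > 0) {
        // The "+ 1" keeps the call out of tail position, so each level
        // keeps its own frame on the stack.
        ~recurse.(n - 1) + 1
    } {
        Error("thrown at the bottom of the stack").throw
    }
};

// bench posts the elapsed time, so a slow unwind is visible before it becomes a hang.
{
    try { ~recurse.(~depth) } { |err| err.errorString.postln }
}.bench;
)
```

If the posted time grows much faster than ~depth does, that would at least be consistent with the slow unwind described above.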

I just tried it in gdb and reproduced the hang. For safety, I was ready with a killall sclang in another terminal. Killing the process gave me a stack trace leading back to our old friend DebugFrameConstructor. I've seen hanging or crashing problems before in this area. (If I comment out protectedBacktrace = this.getBackTrace.caller; in Error.sc, then no hang -- so it's definitely getBackTrace.)

I'm willing to run whatever other checks are needed using gdb, but I would need some guidance about where to set breakpoints or such.

Thread 1 "sclang" received signal SIGTERM, Terminated.
0x00005555555e5a1a in DebugFrameConstructor::fillDebugFrame (
    outSlot=<optimized out>, frame=0x55555d6c81e8, 
    g=0x555555b17aa0 <gVMGlobals>, this=0x7fffffffd2d0)
    at /home/dlm/share/supercollider/lang/LangPrimSource/PyrPrimitive.cpp:2207
2207                WorkQueueItem newWork = std::make_pair(slotRawFrame(&frame->caller), debugFrameObj->slots + 3);
(gdb) where
#0  0x00005555555e5a1a in DebugFrameConstructor::fillDebugFrame(VMGlobals*, PyrFrame*, pyrslot*) (outSlot=<optimized out>, frame=0x55555d6c81e8, g=0x555555b17aa0 <gVMGlobals>, this=0x7fffffffd2d0)
    at /home/dlm/share/supercollider/lang/LangPrimSource/PyrPrimitive.cpp:2207
#1  0x00005555555e5a1a in DebugFrameConstructor::run_queue(VMGlobals*) (g=0x555555b17aa0 <gVMGlobals>, this=0x7fffffffd2d0)
    at /home/dlm/share/supercollider/lang/LangPrimSource/PyrPrimitive.cpp:2170
#2  0x00005555555e5a1a in DebugFrameConstructor::makeDebugFrame(VMGlobals*, PyrFrame*, pyrslot*) (outSlot=<optimized out>, frame=<optimized out>, g=0x555555b17aa0 <gVMGlobals>, this=0x7fffffffd2d0)
    at /home/dlm/share/supercollider/lang/LangPrimSource/PyrPrimitive.cpp:2162
#3  0x00005555555e5a1a in MakeDebugFrame (outSlot=<optimized out>, frame=<optimized out>, g=0x555555b17aa0 <gVMGlobals>)
    at /home/dlm/share/supercollider/lang/LangPrimSource/PyrPrimitive.cpp:2228
#4  0x00005555555e5a1a in prGetBackTrace(VMGlobals*, int) (g=0x555555b17aa0 <gVMGlobals>, numArgsPushed=<optimized out>)
    at /home/dlm/share/supercollider/lang/LangPrimSource/PyrPrimitive.cpp:2236
#5  0x00005555555e6746 in doPrimitive(VMGlobals*, PyrMethod*, int) (g=0x555555b17aa0 <gVMGlobals>, meth=0x5555573a08c0, numArgsPushed=<optimized out>)
    at /home/dlm/share/supercollider/lang/LangPrimSource/PyrPrimitive.cpp:3871
#6  0x00005555555bd5a2 in Interpret(VMGlobals*) (g=0x555557a26390, 
    g@entry=0x555555b17aa0 <gVMGlobals>)
    at /home/dlm/share/supercollider/lang/LangSource/PyrInterpreter3.cpp:3034
#7  0x0000555555681058 in runInterpreter(VMGlobals*, PyrSymbol*, int) (g=g@entry=0x555555b17aa0 <gVMGlobals>, selector=selector@entry=0x555555ce5948, numArgsPushed=numArgsPushed@entry=1)
    at /home/dlm/share/supercollider/lang/LangSource/PyrInterpreter3.cpp:127
#8  0x00005555556857f7 in runLibrary(PyrSymbol*) (selector=selector@entry=0x555555ce5948) at /home/dlm/share/supercollider/lang/LangSource/PyrLexer.cpp:2269
#9  0x00005555556bb99e in SC_LanguageClient::runLibrary(PyrSymbol*) (this=<optimized out>, symbol=0x555555ce5948)
    at /home/dlm/share/supercollider/lang/LangSource/SC_LanguageClient.cpp:161
#10 0x00005555555c7377 in SC_TerminalClient::interpretCmdLine(char const*, unsigned long, bool) (this=0x555555bedb90, cmdLine=<optimized out>, size=<optimized out>, silent=<optimized out>)
    at /home/dlm/share/supercollider/lang/LangSource/SC_TerminalClient.cpp:310
#11 0x00005555555c9680 in SC_TerminalClient::interpretInput() (this=0x555555bedb90) at /home/dlm/share/supercollider/lang/LangSource/SC_TerminalClient.cpp:325
#12 0x00005555555cc74c in boost::_mfi::mf0<void, SC_TerminalClient>::operator()(SC_TerminalClient*) const (p=0x555555bedb90, this=<optimized out>)
    at /home/dlm/share/supercollider/external_libraries/boost/boost/bind/mem_fn_template.hpp:49
#13 0x00005555555cc74c in boost::_bi::list1<boost::_bi::value<SC_TerminalClient*> >::operator()<boost::_mfi::mf0<void, SC_TerminalClient>, boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf0<void, SC_TerminalClient>&, boost::_bi::list0&, int) (a=<synthetic pointer>..., f=<synthetic pointer>..., this=<synthetic

jamshark70 commented 4 years ago

Update on the user impact: I'm rewriting the cll parser objects as true classes instead of Protos, which will reduce the stack depth a lot (because a method call in a Proto goes through doesNotUnderstand, use, protect and the function -- rewriting as classes collapses all that into one stack frame). Confirmed that this does get the right behavior for that syntax error (the error is posted immediately, without even the delay I had noted). That makes the issue less critical (but any sort of fatal hang should be taken seriously -- and I've only pushed the nasty territory back; it's unknown where the new boundary is).
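
To see the kind of extra frames involved, here is a rough illustration using a plain Event as a stand-in for a Proto. This is an assumption for demonstration purposes: the Proto class has its own dispatch path (the use and protect layers mentioned above), which an Event does not reproduce, and whereAmI is a made-up key.

```supercollider
(
// Illustrative only: a plain Event used as a prototype object, standing in for a Proto.
e = (
    whereAmI: { |self|
        self.dumpBackTrace;   // posts the call stack at this point
        self
    }
);

// whereAmI is not a real method on Event, so the call is routed through
// doesNotUnderstand before it reaches the function stored under the key.
e.whereAmI;
)
```

The posted backtrace should show the dispatch frames sitting between the call site and the function body; a real class method would not have them.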

dyfer commented 4 years ago

@jamshark70 this is interesting (and also above my head). One thing that might be useful for testing - I think you can specify a commit SHA in quarks install: Quarks.install("ddwChucklib-livecode", "shaOfTheCommitOnTheDesiredBranch")