Open brucehoult opened 5 years ago
One of the design goals of metal is to have as thin of an interface as possible, which results in us not really having any trap handling set up by default -- essentially just an infinite loop. I wouldn't be opposed to doing something simple like defaulting to a trap handler that says something along the lines of "you didn't register a trap handler, but you've taken a trap anyway" to aid debugging.
Maybe the right thing to do here is to have a debug configuration option for metal, which will do things like printing prettier error messages at the cost of some code size (i.e., always linking in printf)?
Some code size, yes, but printf() isn't necessary to print a few string literals and hex numbers. Even "Trap mepc=0xnnnnnnnnnnnnnnnn mcause=0xnnnnnnnnnnnnnnnn mtval=0xnnnnnnnnnnnnnnnn\n" would be helpful. That's probably under 100 bytes.
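To make that concrete, here is a minimal sketch of printing such a line with no printf() at all. Every name in it (`trap_putc`, `trap_puts`, `trap_put_hex64`, `trap_report`, the `trap_out` buffer) is hypothetical and not part of metal. The sink writes into a buffer so the sketch can run on a host; on real hardware it would presumably write a byte to the UART, and the caller is assumed to have already read the three CSR values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output sink: on a real target this would write one byte to
 * the UART; here it appends to a buffer so the sketch can run on a host. */
static char trap_out[128];
static size_t trap_out_len;

static void trap_putc(char c) {
    if (trap_out_len + 1 < sizeof trap_out) {
        trap_out[trap_out_len++] = c;
        trap_out[trap_out_len] = '\0';   /* keep the buffer NUL-terminated */
    }
}

static void trap_puts(const char *s) {
    while (*s) trap_putc(*s++);
}

/* Print a 64-bit value as "0x" plus 16 hex digits, most significant first. */
static void trap_put_hex64(uint64_t v) {
    static const char digits[] = "0123456789abcdef";
    trap_puts("0x");
    for (int shift = 60; shift >= 0; shift -= 4)
        trap_putc(digits[(v >> shift) & 0xf]);
}

/* Assumption: the caller has already read mepc, mcause and mtval. */
static void trap_report(uint64_t mepc, uint64_t mcause, uint64_t mtval) {
    trap_puts("Trap mepc=");
    trap_put_hex64(mepc);
    trap_puts(" mcause=");
    trap_put_hex64(mcause);
    trap_puts(" mtval=");
    trap_put_hex64(mtval);
    trap_putc('\n');
}
```

Calling `trap_report(0x80000004, 2, 0)` leaves one line in `trap_out` in the format quoted above. The actual code size will depend on the compiler and flags.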
Bruce,
One can set up the trap vector before running one's own application. Metal already has APIs for doing this, if one needs to print out mcause and mtval.
On Mar 10, 2019, at 11:25 PM, Bruce Hoult notifications@github.com wrote:
Some code size, yes, but printf() isn't necessary to print a few string literals and hex numbers. Even "Trap mepc=0xnnnnnnnnnnnnnnnn mcause=0xnnnnnnnnnnnnnnnn mtval=0xnnnnnnnnnnnnnnnn\n" would be helpful.
I'm sure one can. The issue is that these programs were written to be compiled with default flags using the elf compiler and run using "spike pk", which gives sensible defaults and diagnostics. Then Palmer set up a build system that built them as metal apps, and they suddenly and mysteriously had no diagnostics. Also, saxpy crashed with an "unknown syscall" as soon as a non-trivial printf() was executed (e.g. to print "%f == %f\n"); it turned out that the metal stack size was not big enough.
These two things made debugging harder and rather mysterious for people not familiar with metal such as Megan and me.
Things should be set up so that casual users can write simple programs that Just Work. Having a very basic default trap handler automatically installed, and a default stack big enough to run printf() seem reasonable to me. If sophisticated users want to substitute their own trap handler or reduce the stack size that's up to them.
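As a rough sketch of what "default installed, but replaceable" could look like: a handler table where any cause without a registered handler falls through to a default that reports the problem. This is not metal's API. `register_trap_handler`, `dispatch_trap`, `NUM_CAUSES` and the other names are made up for illustration, and `last_message` stands in for console output so the sketch can run on a host.

```c
#include <stddef.h>

#define NUM_CAUSES 16   /* assumed table size for this sketch */

typedef void (*trap_handler_fn)(unsigned cause);

/* Stands in for console output so the behaviour can be checked on a host. */
static const char *last_message;

/* Installed implicitly: used for any cause with no registered handler. */
static void default_trap_handler(unsigned cause) {
    (void)cause;
    last_message =
        "you didn't register a trap handler, but you've taken a trap anyway";
}

/* Example of an application-supplied replacement (hypothetical). */
static void example_illegal_insn_handler(unsigned cause) {
    (void)cause;
    last_message = "illegal instruction handled by application";
}

static trap_handler_fn handlers[NUM_CAUSES];   /* NULL means "not registered" */

/* Returns 0 on success, -1 if the cause number is out of range. */
static int register_trap_handler(unsigned cause, trap_handler_fn fn) {
    if (cause >= NUM_CAUSES)
        return -1;
    handlers[cause] = fn;
    return 0;
}

/* Called from the low-level trap entry with the decoded cause number. */
static void dispatch_trap(unsigned cause) {
    trap_handler_fn fn = (cause < NUM_CAUSES) ? handlers[cause] : NULL;
    if (fn == NULL)
        fn = default_trap_handler;   /* fall back instead of looping silently */
    fn(cause);
}
```

With nothing registered, `dispatch_trap(2)` (cause 2 is illegal instruction on RISC-V) ends up in the default handler. After `register_trap_handler(2, example_illegal_insn_handler)`, the application's handler runs instead, so sophisticated users lose nothing.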
When I was debugging Krste's vector kernels it was nice that pk (I think) traps illegal instructions and prints the offending opcode and a dump of the x registers to stdout, and then nicely exits.
Once Palmer converted everything to use metal this no longer happened. Instead it simply loops.
This was a problem because it meant that when someone had not correctly updated their toolchain they just got a silent hang.
What happens with a standard elf build with newlib, run with spike pk:
And with metal:
or with -l, thousands and thousands of lines of output and then...
When the execution of spike is buried deep within scripts and makefiles it is neither obvious what has happened, nor obvious how to add the -l flag (which you don't otherwise want!) or even that you need it.
(this is a legal RVV instruction, but not yet implemented in spike, which therefore raises illegal instruction)