google / CFU-Playground

Want a faster ML processor? Do it yourself! -- A framework for playing with custom opcodes to accelerate TensorFlow Lite for Microcontrollers (TFLM). Online tutorial: https://google.github.io/CFU-Playground/ For reference docs, see the link below.
http://cfu-playground.rtfd.io/
Apache License 2.0

Certain VexRiscv configurations hanging #656

Closed: ShvetankPrakash closed this issue 2 years ago

ShvetankPrakash commented 2 years ago

Certain configurations of the CPU hang, both on the board and in simulation. For example, if you set all available CPU configurable parameters to False/Zero and then turn on only the bypass (i.e. bypass=True), attempting to run the program on the board hangs. The same happens with safe=True and all other params set to False/Zero, and with iCacheSize!=0 and all other params set to False/Zero.

tcal-x commented 2 years ago

This was due to a d-cache flush instruction in the BIOS. When there isn't a d-cache, this instruction causes a hang.

The d-cache flush instruction is VexRiscv-specific, 0x0000500f, see here: https://github.com/enjoy-digital/litex/blob/master/litex/soc/cores/cpu/vexriscv/system.h#L26-L31
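To see why this word is not a standard instruction, it can help to split it into RISC-V fields. The sketch below is only an illustration (the helper name `decode_misc_mem` is made up here, not part of LiteX or VexRiscv). It shows that 0x0000500f uses the MISC-MEM major opcode (0x0F) with funct3 = 5, whereas the base ISA's FENCE and FENCE.I use funct3 = 0 and 1.

```python
# Illustrative decoder for the fixed RISC-V I-type field layout.
# decode_misc_mem is a hypothetical helper name, not a LiteX/VexRiscv API.

MISC_MEM_OPCODE = 0x0F   # major opcode shared by FENCE and FENCE.I
FUNCT3_FENCE = 0         # standard FENCE
FUNCT3_FENCE_I = 1       # standard FENCE.I (Zifencei)


def decode_misc_mem(word: int) -> dict:
    """Split a 32-bit instruction word into its I-type fields."""
    return {
        "opcode": word & 0x7F,          # bits 6:0
        "rd": (word >> 7) & 0x1F,       # bits 11:7
        "funct3": (word >> 12) & 0x7,   # bits 14:12
        "rs1": (word >> 15) & 0x1F,     # bits 19:15
        "imm": (word >> 20) & 0xFFF,    # bits 31:20
    }


fields = decode_misc_mem(0x0000500F)
is_standard_fence = (
    fields["opcode"] == MISC_MEM_OPCODE
    and fields["funct3"] in (FUNCT3_FENCE, FUNCT3_FENCE_I)
)
print(fields, "standard fence:", is_standard_fence)
```

The decode only confirms that the encoding lies outside the standard fences; what the core does with it when no d-cache is present is not something this sketch models.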

I had looked in the wrong place at first: the CFU Playground binary (<projdir>/build/software.elf.dis). I should have looked in the BIOS ($CFUROOT/soc/build/..../software/bios/bios.elf).

Because I didn't look in the right place at first, I tried to use Verilator simulation to see where the VexRiscv PC was at the point of the hang. The hang occurred before the CFU Playground menu appeared, so I couldn't turn on trace capture using the menu option. Instead I had to hack the BIOS, which I'll describe here in case I need it again in the future.

I had seen the "Liftoff!" message from the LiteX BIOS, so I knew that execution got that far. I added this line to activate trace capture:

*((int *)0xf0000000L) = 1;

at this point in the BIOS code:

https://github.com/enjoy-digital/litex/blob/master/litex/soc/software/bios/boot.c#L54

which is located at $CFUROOT/third_party/python/litex/litex/soc/software/bios/boot.c in the CFU Playground installation.

The fix

When a VexRiscv configuration has no d-cache, we need to tell LiteX internals so that the 0x0000500f instruction is not generated. A similar case happens with the Fomu variant, so we can copy that Python code:

https://github.com/google/CFU-Playground/blob/main/soc/patch_cpu_variant.py#L87-L89
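The condition behind the fix can be sketched as a small predicate. This is not the code from patch_cpu_variant.py. It is a minimal illustration, assuming the CPU parameters are available as a dict and that the d-cache size lives under a key named `dCacheSize` (an assumed name, chosen by analogy with `iCacheSize` from the report). The function name and the 4096 values are also made up for the example.

```python
# Hypothetical sketch: decide whether the BIOS should contain the
# VexRiscv d-cache flush word (0x0000500f) for a given CPU configuration.
# "dCacheSize" is an assumed parameter name; bypass, safe and iCacheSize
# are the names used in the issue report.


def bios_needs_dcache_flush(cpu_params: dict) -> bool:
    """Return True only when the configured core actually has a d-cache."""
    return int(cpu_params.get("dCacheSize", 0)) > 0


# Everything set to False/Zero, as in the report.
base = {"bypass": False, "safe": False, "iCacheSize": 0, "dCacheSize": 0}

# The three reported hanging configurations (4096 is an arbitrary nonzero size).
hanging_configs = [
    {**base, "bypass": True},
    {**base, "safe": True},
    {**base, "iCacheSize": 4096},
]

flush_decisions = [bios_needs_dcache_flush(cfg) for cfg in hanging_configs]
print(flush_decisions)
```

All three reported configurations have no d-cache, so under this predicate none of them would get the flush instruction. The real change has to pass the equivalent information to LiteX, as the linked Fomu code does.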