analogdevicesinc / ai8x-synthesis

Quantization and Synthesis (Device Specific Code Generation) for ADI's MAX78000 and MAX78002 Edge AI Devices
Apache License 2.0

RISC-V GNU Toolchain #346

Open eshankn opened 1 week ago

eshankn commented 1 week ago

Hello!

After using the synthesis tool to generate C code for the RISC-V processor, the make build fails when I try to build and flash the code to the MAX78000EVKIT board from VS Code.

make[1]: riscv-none-elf-gcc: No such file or directory
make[1]: *** [/C/MaximSDK//Libraries/CMSIS/Device/Maxim/GCC/gcc_riscv.mk:382: /c/Users/software/vs-code-workspace/mnist-riscv-ex/buildrv/cnn.o] Error 127
make[1]: Leaving directory '/c/Users/software/vs-code-workspace/mnist-riscv-ex'
make: *** [Makefile:66: riscv] Error 2 

In the generated VS Code settings.json file, xPack_GCC_path is set to the now-deprecated riscv-none-embed-gcc, while the gcc_riscv.mk file uses riscv-none-elf-gcc as the default RISC-V toolchain. This mismatch is the likely cause of the error.
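Exit status 127 from make normally means the command could not be found, so checking which compiler drivers resolve on the PATH is a quick way to confirm the mismatch. The following is a minimal sketch that assumes only the two driver names mentioned in this thread; the helper name `find_toolchains` is hypothetical and not part of the MSDK or the synthesis tool.

```python
import shutil

# Driver names taken from this thread; the helper itself is illustrative only.
CANDIDATES = ("riscv-none-elf-gcc", "riscv-none-embed-gcc")


def find_toolchains(names=CANDIDATES, path=None):
    """Map each compiler driver name to its resolved location, or None if absent.

    `path` is an optional PATH-style string; when None, the process PATH is used.
    """
    return {name: shutil.which(name, path=path) for name in names}


if __name__ == "__main__":
    for name, location in find_toolchains().items():
        print(f"{name}: {location or 'not on PATH'}")
```

Running this from the same terminal that VS Code uses for its build task shows whether the driver that gcc_riscv.mk calls is reachable there.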

I can see that the --riscv argument defaults to riscv-none-embed-gcc. Changing this argument does not change the behaviour either, because the value is only used for the Eclipse settings and not for the VS Code settings.json file. The parameters for settings.json are taken from the template, in which xPack_GCC_path is set to riscv-none-embed-gcc and its corresponding version.
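To check whether a generated settings.json still points at the deprecated toolchain, something like the sketch below could be used. The key name xPack_GCC_path comes from this thread; the sample value, the `<version>` placeholder, and the helper name are assumptions for illustration, and a real settings.json containing comments would need them stripped before `json.loads` accepts it.

```python
import json

DEPRECATED = "riscv-none-embed-gcc"

# Hypothetical settings.json content; the real template values may differ.
SAMPLE = '{"xPack_GCC_path": "${config:MAXIM_PATH}/Tools/xPacks/riscv-none-embed-gcc/<version>"}'


def uses_deprecated_toolchain(settings_text, key="xPack_GCC_path"):
    """Return True if the given settings key mentions the deprecated toolchain."""
    settings = json.loads(settings_text)
    return DEPRECATED in str(settings.get(key, ""))


if __name__ == "__main__":
    print(uses_deprecated_toolchain(SAMPLE))
```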

The MSDK installation itself provides two RISC-V toolchains, of which riscv-none-embed-gcc is now deprecated. It would make sense to remove this toolchain and update the files consistently to use riscv-none-elf-gcc instead, to avoid errors. Alternatively, where more than one toolchain is available, it would be useful to be able to choose between them through an argument to the synthesis tool.

Steps to reproduce the error:

  1. Generate the C code using the synthesis tool from the repository
    python ai8xize.py --test-dir demos --prefix mnist-riscv-ex --checkpoint-file trained/ai85-mnist-qat8-q.pth.tar --config-file networks/mnist-chw-ai85.yaml --softmax --device MAX78000 --timer 0 --display-checkpoint --verbose --riscv --riscv-debug
  2. Build the corresponding C code in VS Code

Another quick question: the VS Code settings require each path to be set explicitly, which is quite clear. However, with the Eclipse settings, the behaviour does not change for any of the GCC_PREFIX values riscv-none-embed-gcc, riscv-none-elf-gcc, and arm-none-eabi-gcc. The code always compiles without errors. Why does this happen?

Jake-Carter commented 1 week ago

Thanks @eshankn, you're right. I have fallen a little behind in the maintenance of the --riscv option for the project generators.

I just need to update the repo with the latest VSCode-Maxim templates. I'll have a PR open asap.

Jake-Carter commented 1 week ago

@eshankn I've just opened a PR that should fix the issue.

To answer your question on the Eclipse environment - Eclipse sources the setenv script on startup. You can find it in the root directory of the MSDK install. The script sets up the PATH to add the latest riscv-none-elf.

The build system itself defaults to calling riscv-none-elf. However, there is an option for changing it. See RISCV_PREFIX under Build Variables for RISC-V Cores in the MSDK UG. If you'd like to use the deprecated toolchain I would recommend this option instead. Note that whatever other option you specify must be on the PATH, and the latest project files set up the environment for riscv-none-elf only. We've kept around the deprecated toolchain mainly to avoid breaking legacy installs and projects.
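As a rough illustration of the point above, the sketch below builds a make invocation that overrides RISCV_PREFIX and first checks that the matching compiler is on the PATH. It assumes the compiler driver is named `<prefix>-gcc` (consistent with the error message earlier in this thread) and that passing the variable on the make command line is acceptable for the project; the helper name `build_command` is hypothetical.

```python
import shutil


def build_command(prefix="riscv-none-elf", path=None):
    """Return a make invocation that overrides RISCV_PREFIX.

    Raises FileNotFoundError when <prefix>-gcc cannot be found, which is the
    condition that otherwise surfaces as "Error 127" during the build.
    `path` is an optional PATH-style string; None means the process PATH.
    """
    compiler = f"{prefix}-gcc"
    if shutil.which(compiler, path=path) is None:
        raise FileNotFoundError(f"{compiler} is not on PATH")
    return ["make", f"RISCV_PREFIX={prefix}"]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the project directory. Setting the variable in project.mk instead, as the MSDK UG describes for build variables, is the more persistent alternative.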

#347

eshankn commented 6 days ago

@Jake-Carter thank you for the prompt fix and explanation!

Please correct me if I am wrong in my understanding. The generated Eclipse environment settings have the GCC_PREFIX value set to riscv-none-embed-. The synthesis tool defaults the --riscv argument to use riscv-none-embed-. The mentioned RISCV_PREFIX value is set to riscv-none-elf in the makefile which overrides the GCC toolchain prefix. Is that correct and the defined behaviour?

When trying to build the generated C code above in Eclipse, I get an unresolved symbol error. [Screenshot (1)]

However, this error is not present for the similar statement in the provided mnist-riscv example. Rebuilding the index does not help either. In VS Code, the generated code builds successfully but cannot then be flashed onto the board.

Jake-Carter commented 5 days ago

@eshankn the --riscv option tells the synthesis tool to generate code for the RISC-V core, and generates a set of Makefiles for generating a combined .elf file for both cores. RISCV_PREFIX is left at its default, which is now riscv-none-elf, and overrides the toolchain prefix. So yes, that's correct.

In general I need to look into our Eclipse settings again. It looks like they've become a bit outdated too. That intellisense error is probably just a missing search path.

I can replicate the flashing issue you mentioned. I'm getting a hard-fault. Is it the same for you?

The target architecture is set to "armv7e-m".
Open On-Chip Debugger (Analog Devices 0.12.0-1.0.0-7)  OpenOCD 0.12.0 (2023-09-27-07:53)
Licensed under GNU GPL v2
Report bugs to <processor.tools.support@analog.com>
0x00002124 in ?? ()
Loading section .sec1, size 0x9fd8 lma 0x10000000
Loading section .sec2, size 0x6000 lma 0x1000a000
Loading section .sec3, size 0x10000 lma 0x10010000
Loading section .sec4, size 0x308 lma 0x10020000
Start address 0x1000ae20, load size 131808
Transfer rate: 33 KB/sec, 11982 bytes/write.
Section .sec1, range 0x10000000 -- 0x10009fd8: matched.
Section .sec2, range 0x1000a000 -- 0x10010000: matched.
Section .sec3, range 0x10010000 -- 0x10020000: matched.
Section .sec4, range 0x10020000 -- 0x10020308: matched.
[max32xxx.cpu] clearing lockup after double fault
[max32xxx.cpu] halted due to debug-request, current mode: Handler HardFault
xPSR: 0x80000003 pc: 0xe7fee7fe msp: 0x20003fd0
Polling target max32xxx.cpu failed, trying to reexamine
[max32xxx.cpu] Cortex-M4 r0p1 processor detected
[max32xxx.cpu] target has 6 breakpoints, 4 watchpoints
none
[Inferior 1 (Remote target) detached]
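One quick consistency check on the log above is to confirm that the section sizes add up to the reported load size. Together with the "matched" lines, that would suggest the image was transferred as built and that the fault arises after loading. This is a minimal sketch; the log excerpt is copied from above and the helper names are illustrative.

```python
import re

# Excerpt copied from the flash log above.
LOG = """\
Loading section .sec1, size 0x9fd8 lma 0x10000000
Loading section .sec2, size 0x6000 lma 0x1000a000
Loading section .sec3, size 0x10000 lma 0x10010000
Loading section .sec4, size 0x308 lma 0x10020000
Start address 0x1000ae20, load size 131808
"""

SECTION = re.compile(r"Loading section (\S+), size (0x[0-9a-f]+) lma (0x[0-9a-f]+)")


def parse_sections(log):
    """Return (name, size, load_address) tuples for each 'Loading section' line."""
    return [(m[1], int(m[2], 16), int(m[3], 16)) for m in SECTION.finditer(log)]


def reported_load_size(log):
    """Return the decimal 'load size' value reported in the log, in bytes."""
    return int(re.search(r"load size (\d+)", log)[1])


if __name__ == "__main__":
    total = sum(size for _, size, _ in parse_sections(LOG))
    print(total, reported_load_size(LOG))
```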

It seems the issue is caused by the Makefiles. They use an older method for linking the combined ARM+RISC-V binary that relies on some sed calls and hard-coded address values. I haven't found exactly where yet, but this method is causing the hard-fault.

I can resolve the issue by using the latest Makefiles and the latest RISC-V loader I developed (see RV_ARM_Loader). This new loader dynamically offsets the RISC-V code based on the size of the ARM code. Attached is the updated project. Note the "mnist-riscv-ex" project inside the top-level "RV_ARM_Loader" project, and the contents of project.mk:

# Enable the RISC-V loader
RISCV_LOAD = 1

# Set the RISCV application to load
RISCV_APP=./mnist-riscv-ex

RV_ARM_Loader_mnist.zip
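The idea of offsetting the RISC-V code by the size of the ARM code can be pictured with the sketch below. It is a conceptual illustration only, not the RV_ARM_Loader implementation: the function names, the 0x100 alignment, and the example image size are all assumptions, and only the flash base address 0x10000000 is taken from the load log above.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def riscv_load_address(flash_base, arm_image_size, alignment=0x100):
    """Place the RISC-V image at the first aligned address after the ARM image.

    The alignment default is hypothetical; a real loader would use whatever
    the linker script and hardware require.
    """
    return align_up(flash_base + arm_image_size, alignment)


if __name__ == "__main__":
    # Hypothetical ARM image size, used purely as an example input.
    print(hex(riscv_load_address(0x10000000, 0x9FD8)))
```

The contrast with the older approach is that nothing here is hard-coded apart from the flash base: if the ARM image grows, the computed RISC-V address moves with it.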

I've started work on another PR to get the Makefiles updated.

eshankn commented 4 days ago

@Jake-Carter thank you for the update and files!

The flash issue was related to the flash task itself failing. However, I am now unable to reproduce it. I now receive the HardFault but only during the flash and run task and not during the flash task.