alexmr09 / Mixed-precision-Neural-Networks-on-RISC-V-Cores


How to run baseline model? #1

Closed hypertseng closed 1 month ago

hypertseng commented 2 months ago

I have successfully replicated the results of the demo_system, but I also want to run the mixed-precision inference of the lenet5_mnist example using the unmodified Ibex and compare the acceleration effects. How should I proceed?

alexmr09 commented 2 months ago

You need to re-run the simulator, but this time, make sure to load the executable .elf file located in the lenet5_mnist/original subfolder.

The following command should work:

```
./extended_ibex/ibex_demo_system/build/lowrisc_ibex_demo_system_0/sim-verilator/Vibex_demo_system \
    --meminit=ram,./inference_codes/lenet5_mnist/original/lenet5_mnist.elf
```

hypertseng commented 2 months ago

> You need to re-run the simulator, but this time, make sure to load the executable .elf file located in the lenet5_mnist/original subfolder.
>
> The following command should work: `./extended_ibex/ibex_demo_system/build/lowrisc_ibex_demo_system_0/sim-verilator/Vibex_demo_system --meminit=ram,./inference_codes/lenet5_mnist/original/lenet5_mnist.elf`

Thank you for your response. I successfully ran the original test of the lenet5_mnist model. However, I'm wondering how I can determine the quantization mode and the exact configuration of the DNN model; it seems that the computation graph is built manually in C code. Do the modes of the layers in the network correspond to one of the Pareto points that you selected? If so, is this configuration adjustable, and could you explain how to set it up?

alexmr09 commented 2 months ago

Yes, the mode of each layer corresponds to the configuration that was selected based on the maximum allowable accuracy drop. You can also view the selected configuration/mode for each layer in the generated .c file, located in the optimized subfolder of the respective DNN model.

Once the "optimal" solution has been found, you can no longer adjust this configuration: the weights are compressed into a specific packed format so that they can be mapped onto the new hardware component (the sketch below illustrates why a packed layout is tied to a single bit-width choice).
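To make the "compressed" point concrete, here is a deliberately generic bit-packing sketch. This is an illustrative assumption, not the repository's actual packing scheme (which is dictated by the extended Ibex component): once four 2-bit weights share one byte, changing a layer's bit width changes the entire memory layout of its weights, so the configuration cannot be swapped after packing.

```python
# Illustrative only -- NOT the repo's real layout. Shows why a packed
# weight buffer is locked to one bit-width choice per layer.
import numpy as np

def pack_2bit(weights: np.ndarray) -> np.ndarray:
    """Pack four 2-bit values (each in [0, 3]) into one byte."""
    assert weights.size % 4 == 0, "pad to a multiple of 4 first"
    w = weights.astype(np.uint8).reshape(-1, 4)
    # Four 2-bit fields per byte, lowest-index weight in the low bits.
    return (w[:, 0]
            | (w[:, 1] << 2)
            | (w[:, 2] << 4)
            | (w[:, 3] << 6)).astype(np.uint8)
```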

If you have a specific configuration of weight bit widths in mind that you want to test, you would need to manually create the quantized model (using the Brevitas library), specify the quantization parameters yourself, and then follow the execution flow of mpq/common.py, starting from the point where the DSE ends.
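If it helps, below is a minimal sketch of such a manually quantized LeNet-5 in Brevitas. The per-layer bit widths (4/2/8) are placeholder assumptions, not a configuration selected by the DSE, and the topology is the standard LeNet-5 for 28x28 MNIST inputs; the model would still need to be trained or calibrated before being fed into the rest of the mpq/common.py flow.

```python
# Minimal sketch: a LeNet-5 with hand-picked per-layer weight bit widths.
# The widths below are placeholders -- substitute the configuration you
# actually want to test.
import torch.nn as nn
from brevitas.nn import QuantConv2d, QuantLinear, QuantReLU

class LeNet5MixedPrecision(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = QuantConv2d(1, 6, 5, weight_bit_width=4)   # placeholder
        self.relu1 = QuantReLU(bit_width=8)
        self.pool1 = nn.MaxPool2d(2)
        self.conv2 = QuantConv2d(6, 16, 5, weight_bit_width=2)  # placeholder
        self.relu2 = QuantReLU(bit_width=8)
        self.pool2 = nn.MaxPool2d(2)
        # 28x28 -> 24x24 -> 12x12 -> 8x8 -> 4x4, hence 16 * 4 * 4 inputs.
        self.fc1 = QuantLinear(16 * 4 * 4, 120, bias=True, weight_bit_width=4)
        self.relu3 = QuantReLU(bit_width=8)
        self.fc2 = QuantLinear(120, 84, bias=True, weight_bit_width=4)
        self.relu4 = QuantReLU(bit_width=8)
        self.fc3 = QuantLinear(84, 10, bias=True, weight_bit_width=8)

    def forward(self, x):
        x = self.pool1(self.relu1(self.conv1(x)))
        x = self.pool2(self.relu2(self.conv2(x)))
        x = x.flatten(1)
        x = self.relu3(self.fc1(x))
        x = self.relu4(self.fc2(x))
        return self.fc3(x)
```

From there you would pick up the script after its DSE stage, so that the weights are compressed and the C inference code is generated for your chosen configuration.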