Severson-Group / AMDC-Firmware

Embedded system code (C and Verilog) which runs the AMDC Hardware
http://docs.amdc.dev/firmware
BSD 3-Clause "New" or "Revised" License

Timing analysis b/w Verilog and C adders #308

Closed by annikaolson 8 months ago

annikaolson commented 1 year ago

This PR compares the timing of an adder implemented as a Verilog IP core against the same adder implemented in C code. It updates the Rev E block design by adding an IP core, and adds a user app that tests the adder in the FPGA versus the CPU.

C Code and Verilog Changes

Added an adder app/command that computes the average time per operation for the C implementation of an addition function and for the Verilog implementation.

A close-up of the addition operation being performed in this test:

// C code variant
out = 8*in1 + in2/4 - 10203;
// Verilog variant
out <= (in1 << 3) + (in2 >> 2) - 32'd10203;

The arithmetic is the same in both variants, but the memory operations and sequential logic on the FPGA path give it different timing results from the C operation.
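As a quick check that the two expressions above compute the same value, here is a minimal host-side sketch. It assumes both inputs and the output are unsigned 32-bit values; the PR does not state the C types, so `uint32_t` and the function names `adder_c`, `adder_shift`, and `adder_forms_agree` are illustrative choices, not the PR's code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* C variant from the PR; the uint32_t types are an assumption. */
static uint32_t adder_c(uint32_t in1, uint32_t in2)
{
    return 8 * in1 + in2 / 4 - 10203;
}

/* Shift form mirroring the Verilog expression (hypothetical helper). */
static uint32_t adder_shift(uint32_t in1, uint32_t in2)
{
    return (in1 << 3) + (in2 >> 2) - UINT32_C(10203);
}

/* Compare both forms over a few sample inputs, including edge values. */
static bool adder_forms_agree(void)
{
    const uint32_t samples[] = {0u, 1u, 3u, 4u, 10203u, 0x7FFFFFFFu, 0xFFFFFFFFu};
    const size_t n = sizeof samples / sizeof samples[0];

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (adder_c(samples[i], samples[j]) != adder_shift(samples[i], samples[j])) {
                return false;
            }
        }
    }
    return true;
}
```

With unsigned operands, both forms wrap modulo 2^32, so they agree for every input. If the inputs were signed, `in2 / 4` would truncate toward zero in C while a right shift would not, so the two forms could differ for negative `in2`.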

Command Added

The command takes three arguments: N (the number of operations) and two inputs. Previously, there were no commands to test timing in the FPGA versus the C code.

Results

Quantified Results

N was varied between runs, and both inputs were held at 0 for every test. Each run of the command reports the average time per operation to compute the result:

| N (Operations) | CPU Time (ns) | FPGA Time (ns) |
|---|---|---|
| 1 | 15.00 | 7.500 |
| 2 | 15.00 | 7.500 |
| 4 | 13.50 | 193.8 |
| 10 | 12.90 | 299.4 |
| 50 | 12.21 | 344.1 |
| 100 | 12.15 | 394.4 |
| 200 | 12.04 | 396.3 |
| 500 | 12.02 | N/A* |

*With N = 500, the console reported an error and the debugging session was suspended, so no FPGA data could be collected for this case.
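An "average time per operation" figure like those above comes from timing N back-to-back operations and dividing by N. The sketch below shows that shape on a host machine using the standard C `clock()` function. It is not the PR's command: the firmware presumably uses its own timer, and `avg_ns_per_op` and the `uint32_t` types are assumptions made for illustration.

```c
#include <stdint.h>
#include <time.h>

/* C variant of the operation; the uint32_t types are an assumption. */
static uint32_t adder_c(uint32_t in1, uint32_t in2)
{
    return 8 * in1 + in2 / 4 - 10203;
}

/*
 * Hypothetical helper: average nanoseconds per operation over n calls.
 * Returns a negative value if n is 0 or the clock is unavailable.
 */
static double avg_ns_per_op(uint32_t n, uint32_t in1, uint32_t in2)
{
    /* volatile keeps the compiler from hoisting or deleting the work. */
    volatile uint32_t a = in1;
    volatile uint32_t b = in2;
    volatile uint32_t sink = 0;

    if (n == 0) {
        return -1.0;
    }

    clock_t start = clock();
    if (start == (clock_t)-1) {
        return -1.0;
    }

    for (uint32_t i = 0; i < n; i++) {
        sink = adder_c(a, b);
    }

    clock_t stop = clock();
    if (stop == (clock_t)-1) {
        return -1.0;
    }
    (void)sink;

    double elapsed_ns = (double)(stop - start) * 1e9 / (double)CLOCKS_PER_SEC;
    return elapsed_ns / (double)n;
}
```

`clock()` is coarse on most hosts, so small values of N can read as zero here. More generally, any fixed cost of starting and stopping the timer is spread over N operations, which is one reason an average taken at very small N can differ from the value it settles to at large N.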

Code Analysis

The Verilog testing code had three operations in it:

// Compute result using FPGA
base_addr[0] = in1;
base_addr[1] = in2;
out = base_addr[2];

The FPGA path would therefore act as an accelerator only if the equivalent C code took roughly 400 ns or longer per operation, as it might for a more complex operation. For anything faster than that, the Verilog path is slower.
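The break-even reasoning can be written as a small comparison. This sketch uses the N = 200 row of the results table as constants and assumes the FPGA round-trip cost stays roughly fixed no matter how complex the operation inside the IP core is. That assumption is not measured in this PR, and the helper names are hypothetical.

```c
#include <stdbool.h>

/* Measured averages from the N = 200 row of the results table. */
#define CPU_NS_PER_OP_N200  12.04
#define FPGA_NS_PER_OP_N200 396.3

/* Offloading helps only when the CPU version is slower than the FPGA round trip. */
static bool offload_pays_off(double cpu_ns_per_op, double fpga_ns_per_op)
{
    return cpu_ns_per_op > fpga_ns_per_op;
}

/* How many times slower the FPGA path is than the CPU path. */
static double fpga_slowdown(double cpu_ns_per_op, double fpga_ns_per_op)
{
    return fpga_ns_per_op / cpu_ns_per_op;
}
```

For the adder tested here, `offload_pays_off(CPU_NS_PER_OP_N200, FPGA_NS_PER_OP_N200)` is false, and the FPGA path is about 33 times slower per operation. A C routine costing more than about 400 ns per call would flip the comparison.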

npetersen2 commented 12 months ago

Note: this PR has a merge conflict with hw/amdc_reve.bd, the FPGA block design file. This file must be edited by hand in Vivado, so git cannot auto-merge.

I created this issue for @annikaolson since I merged #307 on top of her PR, which created the issue. Last week, we had this same issue, so I helped @annikaolson go through and fix this merge conflict with the bd file. It is very painful since you have to do it by hand. All that said, since she has already gone through it, let's accept the merge conflict on this PR, since this PR is a test PR anyway (will not be merged).

npetersen2 commented 8 months ago

Closing this PR since it is a practice exercise for new contributor onboarding.