google / CFU-Playground

Want a faster ML processor? Do it yourself! -- A framework for playing with custom opcodes to accelerate TensorFlow Lite for Microcontrollers (TFLM). Online tutorial: https://google.github.io/CFU-Playground/ For reference docs, see the link below.
http://cfu-playground.rtfd.io/
Apache License 2.0

"make load" step got stuck #775

Open Siris-Li opened 1 year ago

Siris-Li commented 1 year ago

I designed my own CFU to accelerate FFT. However, after the make prog step finished successfully, my terminal got stuck somewhere in make load.

make[3]: Leaving directory '/home/limx/CFU-Playground/proj/fft/build'
Running interactively on FPGA Board
make -C /home/limx/CFU-Playground/soc -f /home/limx/CFU-Playground/soc/common_soc.mk load_hook
make[3]: Entering directory '/home/limx/CFU-Playground/soc'
MAKEFLAGS=-j8 /home/limx/CFU-Playground/scripts/pyrun ./common_soc.py --output-dir build/digilent_arty.fft --csr-json build/digilent_arty.fft/csr.json --cpu-cfu  /home/limx/CFU-Playground/proj/fft/cfu.v --uart-baudrate 1843200 --target digilent_arty "--cpu-variant=perf+cfu" --toolchain symbiflow --software-load --software-path /home/limx/CFU-Playground/proj/fft/build/software.bin
INFO:Workflow:Setting sys_clk_freq to 75MHz.
make[3]: Leaving directory '/home/limx/CFU-Playground/soc'
/home/limx/CFU-Playground/soc/bin/litex_term --speed 1843200  --kernel /home/limx/CFU-Playground/proj/fft/build/software.bin /dev/ttyUSB2

It won't continue to load everything onto my FPGA. What I know is that the problem may arise from cfu.v, because if I replace this file with a simpler one that is known to be correct, the problem disappears. But I cannot figure out what's wrong with my cfu.v.

module Cfu (
  input               cmd_valid,
  output              cmd_ready,
  input      [9:0]    cmd_payload_function_id,
  input      [31:0]   cmd_payload_inputs_0,
  input      [31:0]   cmd_payload_inputs_1,
  output reg          rsp_valid,
  input               rsp_ready,
  output reg [31:0]   rsp_payload_outputs_0,
  input               reset,
  input               clk
);

  wire [31:0] cfu0;
  wire signed [15:0] degree0, degree1, degree2;
  wire neg;

  assign degree0 = ($signed(cmd_payload_inputs_0[15:0]) < 0)
                 ? (16'd180 - $signed(cmd_payload_inputs_0[15:0]))
                 : $signed(cmd_payload_inputs_0[15:0]);
  assign degree1 = degree0 % 16'd180;
  assign degree2 = (degree1 > 16'd90) ? (16'd180 - degree1) : degree1;
  assign neg     = cmd_payload_inputs_1[0];
  assign cfu0    = (neg << 16) + degree2;

  // Only not ready for a command when we have a response.
  assign cmd_ready = ~rsp_valid;

  always @(posedge clk) begin
    if (reset) begin
      rsp_payload_outputs_0 <= 32'b0;
      rsp_valid <= 1'b0;
    end else if (rsp_valid) begin
      // Waiting to hand off response to CPU.
      rsp_valid <= ~rsp_ready;
    end else if (cmd_valid) begin
      rsp_valid <= 1'b1;

      rsp_payload_outputs_0 <= (cmd_payload_function_id == 10'b0)
          ? cfu0
          : 32'b0;
    end
  end
endmodule

Thanks in advance! ( ̄︶ ̄)

Siris-Li commented 1 year ago

I found out that if I comment out the line assign cfu0 = (neg << 16) + degree2; or change it to assign cfu0 = 32'b0;, it works. Weird! Why?
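
For anyone reproducing this, the value on cfu0 can be checked in software before going back to the board. The sketch below is a Python model of what the combinational logic in cfu.v appears to compute for function ID 0. The name cfu0_model is made up for illustration, and the handling of negative inputs follows one reading of Verilog's mixed signed/unsigned rules; it has not been checked against a simulation of the posted module.

```python
def cfu0_model(inputs_0: int, inputs_1: int) -> int:
    """Hypothetical software model of the value cfu.v assigns to cfu0."""
    # Interpret the low 16 bits of inputs_0 as a signed two's-complement angle.
    raw = inputs_0 & 0xFFFF
    angle = raw - 0x10000 if raw & 0x8000 else raw

    # degree0 / degree1: for a negative angle the Verilog computes 180 - angle,
    # which has the same remainder modulo 180 as abs(angle).
    # Assumption: the % in cfu.v is evaluated as an unsigned operation.
    degree1 = abs(angle) % 180

    # degree2: values above 90 are reflected to 180 - value.
    degree2 = 180 - degree1 if degree1 > 90 else degree1

    # cfu0: bit 0 of inputs_1 lands in bit 16 of the result.
    neg = inputs_1 & 1
    return (neg << 16) + degree2
```

For example, cfu0_model(100, 0) folds 100 degrees to 80, and cfu0_model(0xFFE2, 1) (that is, -30 with the neg bit set) yields 30 with bit 16 set. Comparing these values against what the CFU returns on hardware can help separate a logic bug from a timing problem.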

Siris-Li commented 1 year ago

Well, I've changed my cfu.v and got this in my terminal.

make[3]: Leaving directory '/home/limx/CFU-Playground/soc'
/home/limx/CFU-Playground/soc/bin/litex_term --speed 1843200  --kernel /home/limx/CFU-Playground/proj/fft/build/software.bin /dev/ttyUSB1

        __   _ __      _  __
       / /  (_) /____ | |/_/
      / /__/ / __/ -_)>  <
     /____/_/\__/\__/_/|_|
   Build your hardware, easily!

 (c) Copyright 2012-2022 Enjoy-Digital
 (c) Copyright 2007-2015 M-Labs

 BIOS built on Feb  5 2023 22:07:56
 BIOS CRC failed (expected %x, got 0x4e20f513)
 The system will continue, but expect problems.

Then it prints garbled output.

tcal-x commented 1 year ago

Hi @limingxuan-pku ; hmm, these symptoms point to some part of the circuit not meeting timing and therefore corrupting data. It looks like you are using an Arty A7 board and the SymbiFlow / F4PGA toolchain, am I correct?

Siris-Li commented 1 year ago

Yes, I use an Arty A7 and SymbiFlow, and my working environment is an Ubuntu 20.04 virtual machine. Could you please teach me how to recognize and avoid such timing issues?

tcal-x commented 1 year ago

Hi @limingxuan-pku ---

The files generated by SymbiFlow/F4PGA will be in $CFU_PLAYGROUND/soc/build/digilent_arty.<projname>/gateware/. In there, you can look at report* files, for example report_timing.setup.rpt.
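
As a convenience, the report files mentioned above can be listed with a few lines of Python. This is only a sketch: the helper name find_reports is hypothetical, and the digilent_arty target prefix is assumed from the build log earlier in this thread.

```python
import glob
import os
from typing import List


def find_reports(cfu_root: str, projname: str) -> List[str]:
    """Return the report* files in the gateware directory for one project.

    cfu_root is the CFU-Playground checkout ($CFU_PLAYGROUND); the
    'digilent_arty.' prefix is an assumption taken from the log above.
    """
    gateware = os.path.join(
        cfu_root, "soc", "build", f"digilent_arty.{projname}", "gateware"
    )
    # glob returns an empty list if the directory or the files do not exist.
    return sorted(glob.glob(os.path.join(gateware, "report*")))
```

Calling find_reports(os.environ["CFU_PLAYGROUND"], "fft") would list the reports for the fft project from this issue, provided the environment variable is set and the build has completed.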

But now that I'm looking at the files, I can't figure out where the requested 75MHz clock (from LiteX) is being passed to the F4PGA tools. Specifically, the .sdc file is empty, and I don't see anything like a create_clock command in the .xdc file.
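
The check described here (an empty .sdc and no create_clock in the .xdc) can be repeated with a short script. The sketch below scans a gateware directory for create_clock in any .sdc or .xdc file; the helper name find_create_clock is hypothetical, and a plain substring match is only a rough heuristic, not a constraint parser. For reference, a 75 MHz clock corresponds to a period of about 13.33 ns.

```python
import os
from typing import List, Tuple


def find_create_clock(gateware_dir: str) -> List[Tuple[str, int, str]]:
    """Return (file name, line number, line) for each create_clock found."""
    hits = []
    for name in sorted(os.listdir(gateware_dir)):
        # Only constraint files are of interest here.
        if not name.endswith((".sdc", ".xdc")):
            continue
        path = os.path.join(gateware_dir, name)
        with open(path, errors="replace") as f:
            for lineno, line in enumerate(f, start=1):
                stripped = line.strip()
                # Skip comment lines; match the command name as a substring.
                if stripped.startswith("#"):
                    continue
                if "create_clock" in stripped:
                    hits.append((name, lineno, stripped))
    return hits
```

An empty result would match the observation above that no clock constraint is reaching the tools, in which case the numbers in the timing reports may not reflect the requested 75 MHz.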

@kgugala who can I ask for more info here?