pulp-platform / ara

The PULP Ara is a 64-bit Vector Unit, compatible with the RISC-V Vector Extension Version 1.0, working as a coprocessor to CORE-V's CVA6 core.

Are vfmv.f.s and vmv.x.s supported? #145

Closed Gitefu closed 2 months ago

Gitefu commented 2 years ago

A simple test of vfmv.f.s and vmv.x.s causes the simulation to fail. Does Ara support vfmv.f.s and vmv.x.s?

Thanks.

mp-17 commented 2 years ago

Hello @Gitefu ,

Yes, they should be supported. Can you please share the program that causes this error?

Best, Matteo

Gitefu commented 2 years ago

Hello @mp-17

The following program does not work.

#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <math.h>

#include "printf.h"
#include "runtime.h"

#define N 32
void init_array(float *array, int64_t n){
  for(int i=0; i<n; i++)
    array[i] = i+0.1;
}
int main(){
  /* ---init variable numbers--- */
  float array[N];
  init_array(array,N);
  int64_t n=N;
  float f=0.8;

  printf("before moving:array[0]=%f,f=%f\n",array[0],f);

  /* --- f = array[0] --- */
  asm volatile("vsetvli zero, %0, e32, m1, ta, ma" ::"r"(n));
  printf("\tvsetvli is done.\n");

  asm volatile("vle32.v v1, (%0)" ::"r"(array));
  printf("\tvle32.v is done.\n");

  asm volatile("vfmv.f.s f1, v1"); // error
  printf("\tvfmv.f.s is done.\n");

  /* "memory" clobber: the asm writes f, so the compiler must reload it */
  asm volatile("fsw f1, (%0)" ::"r"(&f) : "memory");
  printf("\tfsw is done.\n");

  /* ----- */
  printf("after moving:array[0]=%f,f=%f\n",array[0],f);
  return 0;
}

The result is as follows.

build/verilator/Vara_tb_verilator  -l ram,/home/gitefu/Documents/gitRepositories/ara/apps/bin/original1,elf
Program header number 0 in `/home/gitefu/Documents/gitRepositories/ara/apps/bin/original1' low is 80000000
Program header number 0 in `/home/gitefu/Documents/gitRepositories/ara/apps/bin/original1' high is 800015d1
Program header number 1 in `/home/gitefu/Documents/gitRepositories/ara/apps/bin/original1' high is 80001ab7
Program header number 2 in `/home/gitefu/Documents/gitRepositories/ara/apps/bin/original1' high is 80001ae7
Program header number 3 in `/home/gitefu/Documents/gitRepositories/ara/apps/bin/original1' is not of type PT_LOAD; ignoring.
Set `ram TOP.ara_tb_verilator.dut.i_ara_soc.i_dram 10 0x80000000 0x80000 write with offset: 0x0 write with size: 0x1ae8
Simulation of Ara
=================

Simulation running, end by pressing CTRL-c.
before moving:array[0]=0.100000,f=0.800000
        vsetvli is done.
        vle32.v is done.
[9632] %Warning: ara_tb_verilator.sv:45: Assertion failed in TOP.ara_tb_verilator: Core Test *** FAILED *** (tohost = 2)
- /home/gitefu/Documents/gitRepositories/ara/hardware/tb/ara_tb_verilator.sv:50: Verilog $finish
Received $finish() from Verilog, shutting down simulation.

Simulation statistics
=====================
Executed cycles:  12d0
Wallclock time:   1.058 s
Simulation speed: 4551.98 cycles/s (4.55198 kHz)
make: *** [Makefile:186: simv] Error 2

Gitefu

mp-17 commented 2 years ago

Hello @Gitefu,

I tried the program and it works with both QuestaSim and Verilator (I compiled Ara with 4 Lanes). I use Verilator 4.214 2021-10-17 rev v4.214.

To see your version, execute ./install/verilator/bin/verilator --version

Is it the same version that I am using? Are you on the main branch, and is it up to date?

Best, Matteo

giuseppe-chiari commented 1 year ago

I am experiencing a similar problem with vmv.x.s.

I compiled dotproduct for the Ideal Dispatcher with the 8 lanes configuration, specifically: make config=8_lanes bin/dotproduct.ideal

Then, I set string binary to ~/path/bin/dotproduct.ideal in ara_tb.sv and string vtrace to ~/path/ideal_dispatcher/dotproduct.vtrace in accel_ideal_dispatcher.sv.

Simulation (on Vivado 2020.2) seems to work until a VMVXS instruction is dispatched. At that point the simulation stalls indefinitely: simulated time advances, but the instruction never completes.

The following is the complete instruction (as per ara_req_o in ara_sequencer.sv) that causes Ara to stall:

1,VMVXS,1,EW8,VFU_None,0,00,1,OpQueueConversionNone,EW64,10,1,OpQueueConversionNone,EW64,0,EW64,0000002000000000,0,0,0000002000000000,0a,0,LMUL_1,RNE,0,CVT_SAME,0001,0000,'{0,0,0,EW64,LMUL_8},00,00,00,00

Best regards, Giuseppe

LeleJun97 commented 1 year ago

I encountered the same problem: the vmv.x.s instruction could not complete and got stuck as soon as it was executed.

mp-17 commented 1 year ago

Hello @LeleJun97,

Can you please post a small portion of the code to reproduce the problem?

Thanks, Matteo

LeleJun97 commented 1 year ago

I have raised my issue here: https://github.com/pulp-platform/ara/issues/241#issue-1865247746

978716899 commented 1 year ago

Yes, I also encountered this problem. I wrote a LeNet-5 program in C, and with gcc -O3 auto-vectorization enabled, the vmv.x.s instruction gets stuck.

https://github.com/pulp-platform/ara/issues/241#issuecomment-1696689844

mp-17 commented 2 months ago

Let me know if you are still experiencing this after this fix.