sld-columbia / esp

Embedded Scalable Platforms: Heterogeneous SoC architecture and IP integration made easy

The result exist a quite strange error when using caches. (ibex core) #92

Closed AltiumHanChou closed 3 years ago

AltiumHanChou commented 3 years ago

Describe the bug

In "single-core mode", the simulation output shows a strange error when caches are enabled.

To Reproduce

  1. Set up the environment variables.
  2. cd <ESP root>/socs/xilinx-vcu118-xcvu9p/
  3. make grlib-xconfig (save and close it).
  4. make esp-xconfig (I only changed the core to "ibex", set the cache implementation to "SystemC + HLS", and checked the "Use Caches" box).
  5. make llc-hls && make l2-hls
  6. make sim
  7. run -a (in ModelSim)

Expected behavior

To check whether the output is correct, I modified systest.c like this:

// Copyright (c) 2011-2021 Columbia University, System Level Design Group
// SPDX-License-Identifier: Apache-2.0

#include <stdio.h>

int main(int argc, char **argv)
{
  printf("Hello from ESP!\n");
  printf("Hello from ESP!\n");
  printf("Hello from ESP!\n");

  return 0;
}

So the correct simulation output should look like this:

  ...
 # Hello from ESP!
 # Hello from ESP!
 # Hello from ESP!
  ...

But I got this:

  ...
 # lleHrf oE mo
 # !PSlleHrf oE mo
 # !PSlleHrf oE mo

  ...

It seems the characters within each word are rearranged in this case. I also tried creating an integer such as int a = 123; and printing it, but nothing was printed when I ran the simulation.

!! BUT !!

When I uncheck the "Use Caches" box, everything works correctly.

Screenshots

Using caches: BUG1 (screenshot)

Not using caches (including the integer test mentioned above): BUG2, B2 (screenshots)

Additional context

I hope this describes my issue clearly. (Sorry for my poor English QAQ)

davide-giri commented 3 years ago

Hi @AltiumHanChou, thank you for the very detailed description of the issue. The problem is that at the moment ESP doesn't fully support enabling the cache hierarchy when selecting the Ibex core. We are going to look into it and let you know more soon.

Do you specifically need Ibex + caches support or were you just experimenting with ESP? The cache hierarchy does work with the Ariane and Leon3 processors in case you want to try that.

AltiumHanChou commented 3 years ago

Hi @davide-giri ~

Thank you for your response. I appreciate that.

Because I need to run in a RISC-V environment, I chose Ibex (I had also chosen Ariane before). I ran into an issue with the Ariane core earlier, so I switched to Ibex. Anyway, I will test with the Ariane core again. If I hit any issue while testing, I will let you know~

Thanks, Altium Chou

paulmnt commented 3 years ago

Hi @AltiumHanChou. We are looking into the ibex issue. Ibex has been added as an option to generate small and low-power systems without cache hierarchy. However, I believe the issue you reported is a simple endianness problem, which we should be able to fix quickly. We will get back to you shortly on that.

Ariane, on the other hand, should work properly with and without caches. Linux SMP support for Ariane is in the works and will be part of the next release.

Could you please share with us the problem you encountered using Ariane with the cache hierarchy, including whether you see that issue only with the HLS-based implementation of the caches or with the RTL version as well?

thank you!

AltiumHanChou commented 3 years ago

Hi @paulmnt ~

Sure ~ Should I open another issue for the Ariane core?

paulmnt commented 3 years ago

Yes, please, that way we can link different patches to each issue. Thank you so much!

paulmnt commented 3 years ago

@AltiumHanChou

We just fixed the issue when selecting Ibex and SystemVerilog caches on the branch ibex-caches: 5470f93

We will push another patch for the SystemC version very soon.

AltiumHanChou commented 3 years ago

Hi @paulmnt ~

Sorry, me again.

I modified the file you mentioned in this branch:

fixed the issue when selecting Ibex and SystemVerilog caches on the branch ibex-caches: 5470f93

But I hit an error while following the tutorial "How to: design an accelerator in SystemC (Cadence Stratus HLS)", when running the RTL simulation with the Ibex core (using the SystemVerilog caches).

I got this log when I ran the command "run -a" (in ModelSim).

Full log : ibex_acc_cache.log

Excerpt :

#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/leon3_cpu_tile_services_gen/ahbslv2noc_3/payload_address @ sub-iteration 0 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/sockets/proxy/ahbslv2noc.vhd:81)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/leon3_cpu_tile_services_gen/leon3_with_cache_coherence/ahbslv2noc_1/payload_address @ sub-iteration 0 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/sockets/proxy/ahbslv2noc.vhd:81)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/cache_ahbsi @ sub-iteration 0 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/tiles/tile_cpu.vhd:309)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/cache_ahbsi @ sub-iteration 0 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/tiles/tile_cpu.vhd:309)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/with_cache_coherence/l2_wrapper_1/ahbm_reg_next @ sub-iteration 0 at Value Z (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/caches/l2_wrapper.vhd:254)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/ahbmo @ sub-iteration 0 at Value X (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/tiles/tile_cpu.vhd:293)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/ahbmo @ sub-iteration 0 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/tiles/tile_cpu.vhd:293)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/ahbmo @ sub-iteration 0 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/tiles/tile_cpu.vhd:293)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/ahbmo @ sub-iteration 0 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/tiles/tile_cpu.vhd:293)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/ahbmo @ sub-iteration 0 at Value X (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/tiles/tile_cpu.vhd:293)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/ibex_cpu_gen/ibex_ahb_wrap_1/rin @ sub-iteration 0 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/cores/ibex/ibex_ahb_wrap.vhd:95)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/ibex_cpu_gen/ibex_ahb_wrap_1/rin @ sub-iteration 0 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/cores/ibex/ibex_ahb_wrap.vhd:95)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/ibex_cpu_gen/ibex_ahb_wrap_1/rin @ sub-iteration 0 at Value X (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/cores/ibex/ibex_ahb_wrap.vhd:95)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/ibex_cpu_gen/ibex_ahb_wrap_1/rin @ sub-iteration 0 at Value 1 (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/cores/ibex/ibex_ahb_wrap.vhd:95)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/ahbmo @ sub-iteration 0 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/tiles/tile_cpu.vhd:293)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/ahbmo @ sub-iteration 0 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/tiles/tile_cpu.vhd:293)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/ahbmo @ sub-iteration 0 at Value X (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/tiles/tile_cpu.vhd:293)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/ibex_cpu_gen/ibex_ahb_wrap_1/instr_rvalid_i @ sub-iteration 0 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/cores/ibex/ibex_ahb_wrap.vhd:71)

I also found this :

#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/cache_ahbso @ sub-iteration 2 at Value X (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/tiles/tile_cpu.vhd:310)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/leon3_bus_gen/ahb0/rin @ sub-iteration 2 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/sockets/bus/ahbctrl.vhd:352)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/leon3_bus_gen/ahb0/rin @ sub-iteration 2 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/sockets/bus/ahbctrl.vhd:352)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/leon3_bus_gen/ahb0/rin @ sub-iteration 2 at Value Z (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/sockets/bus/ahbctrl.vhd:352)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/leon3_bus_gen/ahb0/rin @ sub-iteration 2 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/sockets/bus/ahbctrl.vhd:352)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/leon3_bus_gen/ahb0/lmsti @ sub-iteration 2 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/sockets/bus/ahbctrl.vhd:356)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/leon3_bus_gen/ahb0/lslvi @ sub-iteration 2 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/sockets/bus/ahbctrl.vhd:357)
#       Signal: /testbench/cpu/esp_1/tiles_gen(1)/cpu_tile/tile_cpu_i/leon3_bus_gen/ahb0/lslvi @ sub-iteration 2 at Value unknown (/users/student/mr108/thchou19/ESP_Platform/esp/rtl/sockets/bus/ahbctrl.vhd:357)

Why does it use the "leon3 bus" with this core? (Did something go wrong when I modified the file?)

BTW, can you teach me how to debug (or fix) this kind of bug? Then the next time I hit a similar error, I can try to solve it myself first.

Thanks

Altium Han

paulmnt commented 3 years ago

@AltiumHanChou: this error message is something we've never seen. Given that the design passes logic synthesis and implementation, I am surprised that the simulator detects a loop, unless this is a consequence of the patch I just applied. I will investigate and get back to you.

As for the leon3 bus instance, that is correct: Ibex does not come with a bus interface, so we have implemented an AHB adapter that converts instruction and data accesses to AHB requests. Then, we reused the leon3 AHB bus controller in the CPU tile. So, your configuration is correct.

-- edit -- I don't think this would cause the error you report, but please note that the default size of the memory model in simulation is 2 MB, of which 1 MB can be used to allocate accelerator data. If you know your application needs more, consider expanding the size of the memory model here. In general, since the bitstream generation is fully automated, we tend to run large test cases on FPGA, where you can leverage 1 GB from the DDR on any of the supported boards.

paulmnt commented 3 years ago

Hi @AltiumHanChou ,

I could not dig into the issue until today. I found the problem and pushed a patch to the ibex-caches branch.

The ESP private L2 cache sends invalidation messages to the L1 cache in the embedded processors. When the AHB bus is present in the CPU tile, the invalidation messages are issued as AHB write transactions. Ibex, however, does not have L1 caches and its AHB adapter assumes there are no other masters on the bus. Disabling the invalidate interface on AHB solves the timing loop.

Thank you so much for reporting this bug. The patch will be part of the next release, but the branch should work for you in the meantime. Please let us know if you encounter additional problems.

paulmnt commented 3 years ago

This issue should be fixed on the ESP devel branch