CommonEvaluationPlatform / CEP

The Common Evaluation Platform (CEP), based on UCB's Chipyard Framework, is an SoC design that contains only license-unencumbered, freely available components.
BSD 3-Clause "New" or "Revised" License

CEP Co-Simulation using QuestaSim #26

Closed Omar-M-Yehia closed 2 months ago

Omar-M-Yehia commented 6 months ago

I need to know how to fix the following error, which occurs while simulating some instructions:

cep_tb.cpuId[0].driver C_ERROR: Time out while waiting for Pass/Fail

I am using QuestaSim 2023.4. I followed all the steps in CEP/sims/cep_cosim/README.md, and I got this error at the step of running make runAll with CEP/sims/cep_cosim/testSuites/isaTests/Makefile.

bchetwynd commented 6 months ago

Can you verify that the steps for building the isaTests executed correctly? I have not explicitly tested on QuestaSim 2023.4 and will need to see if I can replicate the issue.

Can you verify that the PassFail.hex for the particular test is present?

Do any of the other isaTests pass?

Omar-M-Yehia commented 6 months ago

Yes, the tests pass when I run them locally on my machine; however, I hit this issue when I run them on the grid. I am trying to debug the error, so I need to know when tests fail with this error and how to overcome it. In the file CEP/sims/cep_cosimdrivers/cep_tests/cep_apis.cc, the message "Time out while waiting for Pass/Fail" is printed when there is no pass or fail status, as shown in the attached screenshot.

Omar-M-Yehia commented 6 months ago

Hi Brendon,

This is a kind reminder. Could you explain when the message "Time out while waiting for Pass/Fail" is raised and how to overcome the issue?

Thanks, Omar

bchetwynd commented 6 months ago

Omar,

While I am actively working on the v4.5 release, I will note that some of the tests don't work and thus time out.

For example, the -v- tests are known to fail, due to a lack of proper virtual mode support in our codebase.

Do some tests pass on the grid while others do not?

As to your note that they pass locally, but not on the grid... what is the configuration difference between the environments?

Omar-M-Yehia commented 5 months ago

Hi Brendon,

Unfortunately, some of the -p- tests are failing with a timeout too. I am trying to investigate the issue. For example, for the test rv64mi-p-breakpoint, I can find PassFail.hex, as shown:

@3 0000000080000040 // 0000000080000040 :
@1 00000000800002b4 // 00000000800002b4 :
@0 00000000800002d0 // 00000000800002d0 :
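When comparing PassFail.hex between a passing and a failing environment, it can help to load the records programmatically. The following is a minimal sketch that assumes each record has the shape `@<hex index> <hex address>` followed by an optional `//` comment, as in the excerpt above. The function name `parse_passfail` is hypothetical and is not part of the CEP code base; the thread does not state which index corresponds to pass, fail, or write_tohost, so the sketch only reports the raw index-to-address mapping.

```python
import re

# Assumed record shape: "@<hex index> <hex address>", optionally followed by
# a "// ..." comment. This is inferred from the excerpt, not from CEP docs.
RECORD = re.compile(r"@([0-9a-fA-F]+)\s+([0-9a-fA-F]+)")


def parse_passfail(text):
    """Return {index: address} for every '@index address' record in text."""
    return {int(idx, 16): int(addr, 16) for idx, addr in RECORD.findall(text)}


# The excerpt quoted in this comment, used here as sample input.
sample = (
    "@3 0000000080000040 // 0000000080000040 : "
    "@1 00000000800002b4 // 00000000800002b4 : "
    "@0 00000000800002d0 // 00000000800002d0 :"
)

entries = parse_passfail(sample)
for index in sorted(entries):
    print(f"@{index}: 0x{entries[index]:x}")
```

Running the same function on the PassFail.hex produced locally and on the grid makes it easy to see whether the two builds agree on the addresses.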

In the transcript, all CPUs pass; only CPU[0] fails after the timeout. I tried increasing the timeout to 1000, as shown in the attached screenshot.

My question is why CPU[0] never gets a pass or fail status and instead fails with a timeout in simulation.

bchetwynd commented 5 months ago

Omar, I appreciate the detailed debug info. I am curious, have you changed the chipyard configuration?

When you build the chip, did you run make -f Makefile.chipyard? Did you leave the default subproject of cep_cosim?

Omar-M-Yehia commented 5 months ago

Yes, I followed all the steps mentioned, and I didn't change anything in the chipyard configuration. I built the chip on 15 Nov 2023.

bchetwynd commented 5 months ago

Omar, unfortunately, I don't have an immediate answer for you.

Using Questa2023.4, I just ran the rv64mi-p-breakpoint test on CEP v4.41:

# ** Warning: (vsim-7032) The 64-bit glibc RPM does not appear to be installed on this machine.  Calls to gcc may fail.
# 
# Compiling /tmp/br24169@xUbuntu-2004-br24169_dpi_610478/linux_x86_64_gcc-11/exportwrapper.c
# Loading /tmp/br24169@xUbuntu-2004-br24169_dpi_610478/linux_x86_64_gcc-11/vsim_auto_compile.so
# Loading /home/br24169/projects/CEP/CEP_v4.41/sims/cep_cosim/lib/libvpp.so
# 
# do /home/br24169/projects/CEP/CEP_v4.41/sims/cep_cosim/testSuites/isaTests/rv64mi-p-breakpoint/vsim.do
# 1
# 1
# INFO:        0 cep_tb.cpuId[0].driver.force_tile_reset Forcing Tile #0 in reset...
# INFO:        0 cep_tb.cpuId[1].driver.force_tile_reset Forcing Tile #1 in reset...
# INFO:        0 cep_tb.cpuId[2].driver.force_tile_reset Forcing Tile #2 in reset...
# INFO:        0 cep_tb.cpuId[3].driver.force_tile_reset Forcing Tile #3 in reset...
# INFO:        0 cep_tb.system_driver ==== ISA RISCV_TESTS is active ===
# INFO:        0 cep_tb.system_driver Reading from PassFail.hex: pass = 0x800002a0, fail = 0x80000284, finish = 0x0, write_tohost = 0x80000040, hangme = 0x0
# INFO:        0 cep_tb.system_driver GetTheKey mKey = 0x00006003, mActiveMask = 0x000000000000000f from C2VKeyFile.KEY
# INFO:        0 cep_tb.system_driver shMemInit: Client: shMem->InitMe Done
# INFO:        1 cep_tb.system_driver BARE_MODE: Forcing scratch_word0[3:0], thus Disabling UART and SD Boot in the BootROM...
# INFO:        1 cep_tb.cpuId[0].driver GetTheKey mKey = 0x00006003, mActiveMask = 0x000000000000000f from C2VKeyFile.KEY
# INFO:        1 cep_tb.cpuId[0].driver shMemInit: Client: shMem->InitMe Done
# INFO:        1 cep_tb.cpuId[1].driver GetTheKey mKey = 0x00006003, mActiveMask = 0x000000000000000f from C2VKeyFile.KEY
# INFO:        1 cep_tb.cpuId[1].driver shMemInit: Client: shMem->InitMe Done
# INFO:        1 cep_tb.cpuId[2].driver GetTheKey mKey = 0x00006003, mActiveMask = 0x000000000000000f from C2VKeyFile.KEY
# INFO:        1 cep_tb.cpuId[2].driver shMemInit: Client: shMem->InitMe Done
# INFO:        1 cep_tb.cpuId[3].driver GetTheKey mKey = 0x00006003, mActiveMask = 0x000000000000000f from C2VKeyFile.KEY
# INFO:        1 cep_tb.cpuId[3].driver shMemInit: Client: shMem->InitMe Done
# INFO:       10 cep_tb.system_driver Turning ON __shIpc_EnableMode simTime = 1
# INFO:       11 cep_tb.cpuId[0].driver Calling $vpp_WaitTilStart slot = 0 cpu = 0
# INFO:       11 cep_tb.cpuId[0].driver Turning ON __shIpc_EnableMode simTime = 1
# INFO:       11 cep_tb.cpuId[1].driver Calling $vpp_WaitTilStart slot = 0 cpu = 1
# INFO:       11 cep_tb.cpuId[1].driver Turning ON __shIpc_EnableMode simTime = 1
# INFO:       11 cep_tb.cpuId[2].driver Calling $vpp_WaitTilStart slot = 0 cpu = 2
# INFO:       11 cep_tb.cpuId[2].driver Turning ON __shIpc_EnableMode simTime = 1
# INFO:       11 cep_tb.cpuId[3].driver Calling $vpp_WaitTilStart slot = 0 cpu = 3
# INFO:       11 cep_tb.cpuId[3].driver Turning ON __shIpc_EnableMode simTime = 1
# INFO:       15 cep_tb.cpuId[3].driver Entering shIpc_EnableMode Loop simTime = 2
# INFO:       15 cep_tb.cpuId[2].driver Entering shIpc_EnableMode Loop simTime = 2
# INFO:       15 cep_tb.cpuId[1].driver Entering shIpc_EnableMode Loop simTime = 2
# INFO:       15 cep_tb.cpuId[0].driver Entering shIpc_EnableMode Loop simTime = 2
# INFO:       15 cep_tb.system_driver Entering shIpc_EnableMode Loop simTime = 2
# INFO:       15 cep_tb.cpuId[1].driver SingleCoreOnly = 1
# INFO:       15 cep_tb.cpuId[2].driver SingleCoreOnly = 1
# INFO:       15 cep_tb.cpuId[3].driver SingleCoreOnly = 1
#        3 cep_tb.cpuId[0].driver C_LOG: access::RunClk mSlotId=0 mLocalId=0 numClk=1000
#        5 cep_tb.cpuId[3].driver C_LOG: access::RunClk mSlotId=0 mLocalId=3 numClk=1000
#        5 cep_tb.cpuId[2].driver C_LOG: access::RunClk mSlotId=0 mLocalId=2 numClk=1000
#        5 cep_tb.cpuId[1].driver C_LOG: access::RunClk mSlotId=0 mLocalId=1 numClk=1000
#        5 cep_tb.system_driver C_LOG: loadMemory: Loading file ./riscv_wrapper.img to SCRATCHPAD RAM with maxByteCntof 655360B , fileOffset = 0B, destOffset = 0B
#     1007 cep_tb.cpuId[0].driver C_LOG: access::RunClk mSlotId=0 mLocalId=0 numClk=1000
#     1009 cep_tb.cpuId[3].driver C_LOG: access::RunClk mSlotId=0 mLocalId=3 numClk=1000
#     1009 cep_tb.cpuId[2].driver C_LOG: access::RunClk mSlotId=0 mLocalId=2 numClk=1000
#     1009 cep_tb.cpuId[1].driver C_LOG: access::RunClk mSlotId=0 mLocalId=1 numClk=1000
#     2011 cep_tb.cpuId[0].driver C_LOG: access::RunClk mSlotId=0 mLocalId=0 numClk=1000
#     2013 cep_tb.cpuId[3].driver C_LOG: access::RunClk mSlotId=0 mLocalId=3 numClk=1000
#     2013 cep_tb.cpuId[2].driver C_LOG: access::RunClk mSlotId=0 mLocalId=2 numClk=1000
#     2013 cep_tb.cpuId[1].driver C_LOG: access::RunClk mSlotId=0 mLocalId=1 numClk=1000
#     2068 cep_tb.system_driver C_LOG: loadMemory: flushing cache line
#     2080 cep_tb.system_driver C_LOG: loadMemory: Setting program loaded flag
# INFO:    20835 cep_tb.system_driver Program is now loaded
#     3015 cep_tb.cpuId[0].driver C_LOG: access::RunClk mSlotId=0 mLocalId=0 numClk=1000
#     3017 cep_tb.cpuId[3].driver C_LOG: access::RunClk mSlotId=0 mLocalId=3 numClk=1000
#     3017 cep_tb.cpuId[2].driver C_LOG: access::RunClk mSlotId=0 mLocalId=2 numClk=1000
#     3017 cep_tb.cpuId[1].driver C_LOG: access::RunClk mSlotId=0 mLocalId=1 numClk=1000
#     4017 cep_tb.cpuId[0].driver C_LOG: OK: Program loading is completed
#     4019 cep_tb.cpuId[3].driver C_LOG: OK: Program loading is completed
#     4019 cep_tb.cpuId[2].driver C_LOG: OK: Program loading is completed
#     4019 cep_tb.cpuId[1].driver C_LOG: OK: Program loading is completed
# INFO:    40205 cep_tb.cpuId[0].driver.release_tile_reset Releasing Tile #0 reset...
# INFO:    40225 cep_tb.cpuId[1].driver.release_tile_reset Releasing Tile #1 reset...
# INFO:    40225 cep_tb.cpuId[2].driver.release_tile_reset Releasing Tile #2 reset...
# INFO:    40225 cep_tb.cpuId[3].driver.release_tile_reset Releasing Tile #3 reset...
#     4030 cep_tb.cpuId[0].driver C_LOG: Current Pass = 0x0 Fail = 0x0 maxTimeOut = 100
#     4031 cep_tb.cpuId[0].driver C_LOG: access::RunClk mSlotId=0 mLocalId=0 numClk=1000
#     4032 cep_tb.cpuId[3].driver C_LOG: Current Pass = 0x0 Fail = 0x0 maxTimeOut = 100
#     4032 cep_tb.cpuId[2].driver C_LOG: Current Pass = 0x0 Fail = 0x0 maxTimeOut = 100
#     4032 cep_tb.cpuId[1].driver C_LOG: Current Pass = 0x0 Fail = 0x0 maxTimeOut = 100
#     4033 cep_tb.cpuId[3].driver C_LOG: access::RunClk mSlotId=0 mLocalId=3 numClk=1000
#     4033 cep_tb.cpuId[2].driver C_LOG: access::RunClk mSlotId=0 mLocalId=2 numClk=1000
#     4033 cep_tb.cpuId[1].driver C_LOG: access::RunClk mSlotId=0 mLocalId=1 numClk=1000
# INFO:    40475 cep_tb.cpuId[3].driver C3 Pass/Fail Detected!!!... Put it to sleep
# INFO:    40555 cep_tb.cpuId[2].driver C2 Pass/Fail Detected!!!... Put it to sleep
# INFO:    40635 cep_tb.cpuId[1].driver C1 Pass/Fail Detected!!!... Put it to sleep
#     5037 cep_tb.cpuId[0].driver C_LOG: Current Pass = 0x0 Fail = 0x0 maxTimeOut = 99
#     5038 cep_tb.cpuId[0].driver C_LOG: access::RunClk mSlotId=0 mLocalId=0 numClk=1000
#     5039 cep_tb.cpuId[3].driver C_LOG: Current Pass = 0x1 Fail = 0x0 maxTimeOut = 99
#     5039 cep_tb.cpuId[2].driver C_LOG: Current Pass = 0x1 Fail = 0x0 maxTimeOut = 99
#     5039 cep_tb.cpuId[1].driver C_LOG: Current Pass = 0x1 Fail = 0x0 maxTimeOut = 99
#     5040 cep_tb.cpuId[3].driver C_LOG: access::RunClk mSlotId=0 mLocalId=3 numClk=100
#     5040 cep_tb.cpuId[2].driver C_LOG: access::RunClk mSlotId=0 mLocalId=2 numClk=100
#     5040 cep_tb.cpuId[1].driver C_LOG: access::RunClk mSlotId=0 mLocalId=1 numClk=100
#     5142 cep_tb.cpuId[3].driver C_LOG: ======== TEST PASS ========== 
#     5142 cep_tb.cpuId[2].driver C_LOG: ======== TEST PASS ========== 
#     5142 cep_tb.cpuId[1].driver C_LOG: ======== TEST PASS ========== 
# cep_tb.cpuId[3].driver InactiveStatus detected..Shutting down!!!
# cep_tb.cpuId[2].driver InactiveStatus detected..Shutting down!!!
# cep_tb.cpuId[1].driver InactiveStatus detected..Shutting down!!!
# INFO:    55445 cep_tb.cpuId[0].driver C0 Pass/Fail Detected!!!... Put it to sleep
#     6044 cep_tb.cpuId[0].driver C_LOG: Current Pass = 0x1 Fail = 0x0 maxTimeOut = 98
#     6045 cep_tb.cpuId[0].driver C_LOG: access::RunClk mSlotId=0 mLocalId=0 numClk=100
#     6147 cep_tb.cpuId[0].driver C_LOG: ======== TEST PASS ========== 
# cep_tb.cpuId[0].driver InactiveStatus detected..Shutting down!!!
#     6148 cep_tb.system_driver C_LOG: main ======== TEST PASS ========== 
# INFO:    61495 cep_tb.system_driver Running for 1 more NS before terminate the simv process
# ** Note: $finish    : /home/br24169/projects/CEP/CEP_v4.41/sims/cep_cosim/dvt/dpi_common.incl(393)
#    Time: 61505 ns  Iteration: 0  Instance: /cep_tb/system_driver
# End time: 07:27:06 on Jan 29,2024, Elapsed time: 0:00:10
# Errors: 0, Warnings: 2

The only other thing I can think of is the gcc version you used to build the riscv-toolchain.
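For readers following the transcript: the C_LOG lines show each driver printing "Current Pass = ... Fail = ... maxTimeOut = N", running 1000 clocks, and repeating with N counting down from 100. A minimal sketch of a polling loop with that shape is given below. It is not the code from cep_apis.cc; `wait_for_pass_fail`, `FakeCore`, `read_status`, and `run_clk` are hypothetical names used only to illustrate how a core that never reports a status ends in a timeout.

```python
def wait_for_pass_fail(read_status, run_clk, max_timeout=100, clocks_per_poll=1000):
    """Poll until pass or fail is reported, or the polling budget runs out.

    read_status() -> (passed, failed); run_clk(n) advances simulation by n clocks.
    Both callables are hypothetical stand-ins for the testbench interface.
    """
    while max_timeout > 0:
        passed, failed = read_status()
        if passed or failed:
            return "pass" if passed and not failed else "fail"
        run_clk(clocks_per_poll)
        max_timeout -= 1
    return "timeout"


class FakeCore:
    """Toy model of a core that reports pass once a given clock count is reached."""

    def __init__(self, pass_at_clock):
        self.clock = 0
        self.pass_at_clock = pass_at_clock  # None means the core never reports

    def run_clk(self, n):
        self.clock += n

    def read_status(self):
        passed = self.pass_at_clock is not None and self.clock >= self.pass_at_clock
        return passed, False


ok = FakeCore(pass_at_clock=1500)
hung = FakeCore(pass_at_clock=None)
ok_result = wait_for_pass_fail(ok.read_status, ok.run_clk)
hung_result = wait_for_pass_fail(hung.read_status, hung.run_clk)
print(ok_result, hung_result)
```

In this model, raising the timeout only helps a core that would eventually report a status; a core that never reaches its exit point times out regardless of the budget.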

Omar-M-Yehia commented 3 months ago

Hi Brendon, I found the cause of the issue: when I run multiple tests in parallel, some of the tests fail with the following error.

Have you faced this issue before?

bchetwynd commented 3 months ago

Hello, Omar. Yes, I have seen that before. Not all of the RISC-V ISA tests terminate as expected. I have yet to debug the issue.

Our ISA test methodology is to extract the Pass / Fail state from the .dump file. Then the simulation looks for the PC to hit either the pass or fail exit points.
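The methodology described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the CEP implementation: it assumes the .dump file uses the usual objdump label shape `<address> <symbol>:` and that the exit points are labelled `pass`, `fail`, and `write_tohost` (names taken from the "Reading from PassFail.hex" line in the transcript). The functions `exit_points` and `classify_pc` are hypothetical, and the sample dump text is made up around the addresses printed in that transcript.

```python
import re

# Assumed objdump label shape: "<hex address> <symbol>:" at the start of a line.
LABEL = re.compile(r"^([0-9a-fA-F]+)\s+<(\w+)>:", re.MULTILINE)


def exit_points(dump_text, wanted=("pass", "fail", "write_tohost")):
    """Collect the addresses of the wanted labels from a disassembly dump."""
    symbols = {name: int(addr, 16) for addr, name in LABEL.findall(dump_text)}
    return {name: symbols[name] for name in wanted if name in symbols}


def classify_pc(pc, points):
    """Return 'pass' or 'fail' if pc hits an exit point, otherwise None."""
    if pc == points.get("pass"):
        return "pass"
    if pc == points.get("fail"):
        return "fail"
    return None


# Illustrative dump fragment; the addresses match the transcript's
# "pass = 0x800002a0, fail = 0x80000284, write_tohost = 0x80000040".
sample_dump = """\
0000000080000040 <write_tohost>:
0000000080000284 <fail>:
00000000800002a0 <pass>:
"""

points = exit_points(sample_dump)
print({name: hex(addr) for name, addr in points.items()})
print(classify_pc(0x800002A0, points), classify_pc(0x80000000, points))
```

If a test's program counter never reaches either address, `classify_pc` keeps returning None, which corresponds to the case where the simulation ends with the time-out message.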

Omar-M-Yehia commented 3 months ago

Hello Brendon, thanks for your reply. Please let me know once you have looked into the issue and found how to get past it.

bchetwynd commented 3 months ago

Hello Omar. Happy so to do, but I would not expect it anytime soon. As noted, some of the tests currently pass, some do not. My current efforts are focused elsewhere.