ucb-bar / chipyard

An Agile RISC-V SoC Design Framework with in-order cores, out-of-order cores, accelerators, and more
https://chipyard.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

WithNBoomCores not working with WithJtagDTM in release 1.3.0 #656

Closed avonancken closed 4 years ago

avonancken commented 4 years ago

Impact: rtl

Tell us about your environment:
Chipyard Version: 1.3.0
OS: Linux ubuntu19portable 5.3.0-64-generic 58-Ubuntu SMP Fri Jul 10 19:33:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Other: Configurations including WithNBoomCores and WithJtagDTM work correctly using OpenOCD and gdb in both Verilator simulation and on an FPGA prototype with Chipyard release 1.2.0. The same configurations fail when starting gdb using Chipyard release 1.3.0. The same configuration replacing WithNBoomCores with WithNBigCores(RocketTile) works successfully in both releases.

Working RocketTile based Config:

class jtagRocketConfig extends Config(
  new chipyard.iobinders.WithUARTAdapter ++
  new chipyard.iobinders.WithTieOffInterrupts ++
  new chipyard.iobinders.WithBlackBoxSimMem ++
  new chipyard.iobinders.WithSimDebug ++                     // add SimJtag and SimSerial, use both to drive sim
  new chipyard.iobinders.WithSimSerial ++
  new testchipip.WithTSI ++
  new chipyard.config.WithBootROM ++
  new chipyard.config.WithUART ++
  new chipyard.config.WithL2TLBs(1024) ++
  new freechips.rocketchip.subsystem.WithJtagDTM ++          // sets DTM communication interface to JTAG
  new freechips.rocketchip.subsystem.WithNoMMIOPort ++
  new freechips.rocketchip.subsystem.WithNoSlavePort ++
  new freechips.rocketchip.subsystem.WithInclusiveCache ++
  new freechips.rocketchip.subsystem.WithNExtTopInterrupts(0) ++
  new freechips.rocketchip.subsystem.WithNBigCores(1) ++
  new freechips.rocketchip.subsystem.WithCoherentBusTopology ++
  new freechips.rocketchip.system.BaseConfig)

Failing BoomTile Config (fails with Chipyard 1.3.0 but works with Chipyard 1.2.0):

class jtagBoomConfig extends Config(
  new chipyard.iobinders.WithUARTAdapter ++
  new chipyard.iobinders.WithTieOffInterrupts ++
  new chipyard.iobinders.WithBlackBoxSimMem ++
  new chipyard.iobinders.WithSimDebug ++                     // add SimJtag and SimSerial, use both to drive sim
  new chipyard.iobinders.WithSimSerial ++
  new testchipip.WithTSI ++
  new chipyard.config.WithBootROM ++
  new chipyard.config.WithUART ++
  new chipyard.config.WithL2TLBs(1024) ++
  new freechips.rocketchip.subsystem.WithJtagDTM ++          // sets DTM communication interface to JTAG
  new freechips.rocketchip.subsystem.WithNoMMIOPort ++
  new freechips.rocketchip.subsystem.WithNoSlavePort ++
  new freechips.rocketchip.subsystem.WithInclusiveCache ++
  new freechips.rocketchip.subsystem.WithNExtTopInterrupts(0) ++
  new boom.common.WithNBoomCores(1) ++
  new freechips.rocketchip.subsystem.WithCoherentBusTopology ++
  new freechips.rocketchip.system.BaseConfig)

What is the current behavior? GDB fails to connect to the OpenOCD target using the 'target remote localhost:3333' command, with the following error message:

bfd requires flen 8, but target has flen 0

This happens because it reads 0x0 from the misa CSR, which is also evident in the OpenOCD status:

Info : hart 0: XLEN=64, misa=0x0

When I instead start gdb with an executable built with march=rv64imac rather than march=rv64imafdc, I can establish the remote connection without error, but when I then issue the gdb command 'info all-registers', it returns 0x0 for all of the CSRs and GPRs.

What is the expected behavior? The misa register should return 0x800000000094112d, and the other registers should return their expected non-zero values.
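For reference, the expected misa value can be decoded by hand using the field layout from the RISC-V privileged specification: bits 0 to 25 flag extensions 'A' to 'Z', and the top two bits (MXL) encode the base width. The following is a minimal Scala sketch of that decoding; the object and function names (`MisaDecode`, `extensions`, `xlen`) are made up for illustration and are not part of Chipyard, OpenOCD, or gdb.

```scala
object MisaDecode {
  // Letters for misa bits 0..25 (bit 0 = 'A', bit 25 = 'Z').
  def extensions(misa: BigInt): String =
    (0 until 26).filter(i => misa.testBit(i)).map(i => ('A' + i).toChar).mkString

  // MXL is the top two bits of the register: 1 -> 32, 2 -> 64, 3 -> 128.
  // A value of 0 is returned here as 0, meaning "not reported".
  def xlen(misa: BigInt, width: Int = 64): Int = {
    val mxl = ((misa >> (width - 2)) & 3).toInt
    if (mxl == 0) 0 else 16 << mxl
  }

  def main(args: Array[String]): Unit = {
    val expected = BigInt("800000000094112d", 16) // value quoted in this report
    println(s"expected: RV${xlen(expected)} ${extensions(expected)}")
    println(s"observed: misa=0x0 has extensions '${extensions(BigInt(0))}'")
  }
}
```

The expected value decodes to RV64 with extensions ACDFIMSUX, so both F and D are present. A misa of 0x0 advertises no extensions at all, which appears consistent with gdb reporting "target has flen 0" for a binary built with march=rv64imafdc.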

Other information: I did find that when I rolled the 1.3.0 release back to commit 7c7b336c3fef8aaf16a7266d95b3fbd9e7d9418b from May 14th, I get the correct behavior.

jerryz123 commented 4 years ago

This is odd. Does the same issue manifest with the rocket core?

Would it be possible for you to collect a waveform of the error case? It would significantly expedite the debugging process.

avonancken commented 4 years ago

No, the problem does not manifest with the rocket core. As you can see from my original description, the only difference between the configurations is 'freechips.rocketchip.subsystem.WithNBigCores(1)' (RocketTile) in the working case versus 'boom.common.WithNBoomCores(1)' (BoomTile) in the failing case.

I would be happy to share a waveform database of the problem case. I have already collected waveforms and tracked the error down to the transactions from the BoomTile to the DTM, where it is writing zeros where it should be writing non-zero values, but that is as far as I could debug with my limited familiarity with the design.

Please let me know the best way to share the waveform database with you. Thank you.

jerryz123 commented 4 years ago

Thanks to @avonancken's help, I believe the issue is due to the transition from BOOMv2 to BOOMv3, in which a necessary hack to avoid caching the debug RAM was omitted. The fix is here: https://github.com/riscv-boom/riscv-boom/pull/486
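To illustrate why caching memory that a debugger rewrites can go wrong, here is a toy Scala model. It is not BOOM's frontend and not the code in the linked PR: the `Fetcher` class, the `bypassDebug` flag, and the debug-region bounds are all assumptions made up for this sketch. It only shows the general failure mode, where a fetch path that caches a rewritten location keeps returning the old contents.

```scala
import scala.collection.mutable

object StaleFetchSketch {
  // Hypothetical debug-region bounds, chosen only for illustration.
  val DebugBase = 0x0L
  val DebugSize = 0x1000L
  def inDebugRegion(addr: Long): Boolean =
    addr >= DebugBase && addr < DebugBase + DebugSize

  // Toy fetch path: caches every word unless told to bypass the debug region.
  class Fetcher(mem: mutable.Map[Long, Int], bypassDebug: Boolean) {
    private val cache = mutable.Map.empty[Long, Int]
    def fetch(addr: Long): Int =
      if (bypassDebug && inDebugRegion(addr)) mem(addr) // always read memory
      else cache.getOrElseUpdate(addr, mem(addr))       // may return stale data
  }

  // Fetch a word, let the "debugger" rewrite it, then fetch it again.
  def run(bypassDebug: Boolean): Seq[Int] = {
    val mem = mutable.Map(0x800L -> 0x1111)
    val fetcher = new Fetcher(mem, bypassDebug)
    val first = fetcher.fetch(0x800L)
    mem(0x800L) = 0x2222 // debugger writes new contents
    val second = fetcher.fetch(0x800L)
    Seq(first, second)
  }

  def main(args: Array[String]): Unit = {
    println(run(bypassDebug = false).map(_.toHexString)) // second fetch is stale
    println(run(bypassDebug = true).map(_.toHexString))  // second fetch is fresh
  }
}
```

With caching enabled for the region, the second fetch still sees the old word; with the bypass, it sees the rewritten one.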

Let's leave this bug report open until the fix is merged in.

avonancken commented 4 years ago

Thank you @jerryz123 for the quick and excellent sleuthing. I can confirm that applying your fix to riscv-boom does resolve the issue.

avonancken commented 4 years ago

Unfortunately, I ran into additional problems when I added 'new boom.common.WithLargeBooms ++' to my configuration to change from the single-issue SmallBooms tile configuration to the multi-issue (3) LargeBooms configuration.

The same test case now fails to start OpenOCD properly, failing with the following error messages:

Open On-Chip Debugger 0.10.0+dev-00861-g3c6592cf6 (2020-06-17-18:17)
Licensed under GNU GPL v2
For bug reports, read
    http://openocd.org/doc/doxygen/bugs.html
Info : only one transport option; autoselect 'jtag'
Info : Initializing remote_bitbang driver
Info : Connecting to localhost:9823
Info : remote_bitbang driver initialized
Info : This adapter doesn't support configurable speed
Info : JTAG tap: riscv.cpu tap/device found: 0x00000001 (mfg: 0x000 (<invalid>), part: 0x0000, ver: 0x0)
Info : datacount=2 progbufsize=16
Error: fflush: Broken pipe
Error: read: count=-1, error=Broken pipe
Error: dmi_scan failed jtag scan
Error: Failed read (NOP) at 0x16; value=0xffffffff, status=2
Error: fflush: Broken pipe
Error: fflush: Broken pipe
Error: dmi_scan failed jtag scan
Error: failed write at 0x16, status=2
Error: fflush: Broken pipe
Error: fflush: Broken pipe
Error: dmi_scan failed jtag scan
Error: failed write at 0x17, status=2
Error: fflush: Broken pipe
Error: fflush: Broken pipe
Error: dmi_scan failed jtag scan
Error: failed write at 0x17, status=2
Error: Fatal: Failed to read MISA from hart 0.
Info : Listening on port 3333 for gdb connections
Error: Target not examined yet

Error: fflush: Broken pipe
Error: failed: -4

The Verilator simulation process also stops on a failed assertion:

This emulator compiled with JTAG Remote Bitbang client. To enable, use +jtag_rbb_enable=1.
Listening on port 9823
[UART] UART0 is here (stdin/stdout).
Attempting to accept client socket
Accepted successfully.
[41] %Error: chipyard.TestHarness.jtagRocketConfig.top.v:668806: Assertion failed in TOP.TestHarness.dut.system.boom_tile.core
%Error: /home/goom/chipyard/sims/verilator/generated-src/chipyard.TestHarness.jtagRocketConfig/chipyard.TestHarness.jtagRocketConfig.top.v:668806: Verilog $stop
Aborting...

It seems that there may be some problem with the workaround when the Boom core is configured for multi-issue.

avonancken commented 4 years ago

Thank you again @jerryz123. Your follow-on fix, [ifu] Fix spurious ICache flush by invalid instructions riscv-boom/riscv-boom#490, resolves this problem with the LargeBooms configuration. I can now run the full Verilator OpenOCD/GDB test successfully. I also created an FPGA image with this fix and verified that the appropriate RISC-V ISA tests function correctly and that the Dhrystone and Coremark benchmarks also work as expected when running through OpenOCD/gdb.