Many thanks for the wonderful projects (both SpinalHDL and VexRiscv). I think I have found a nasty bug in the implementation of Tightly Coupled Memory.
Summary
When a conditional branch instruction is directly followed by a store instruction writing to TCM, and the operand of the branch lies within the TCM address range, the store is written to the TCM unconditionally, even when the branch is taken and the store should be skipped. Other types of instructions following such branches might also be affected (not investigated).
Minimal Working Example
tcmbug.scala
import spinal.core._
import vexriscv._
import vexriscv.ip._
import vexriscv.plugin._

class TCMBug extends Component {
  val cpu = new VexRiscv(VexRiscvConfig(Seq(
    new IBusCachedPlugin(
      resetVector = 0x10000000L,
      config = InstructionCacheConfig(
        cacheSize = 4096,
        bytePerLine = 8,
        wayCount = 1,
        addressWidth = 32,
        cpuDataWidth = 32,
        memDataWidth = 32,
        catchIllegalAccess = true,
        catchAccessFault = true,
        asyncTagMemory = false,
      ),
    ),
    new DBusCachedPlugin(
      config = DataCacheConfig(
        cacheSize = 4096,
        bytePerLine = 8,
        wayCount = 1,
        addressWidth = 32,
        cpuDataWidth = 32,
        memDataWidth = 32,
        catchAccessError = true,
        catchIllegal = true,
        catchUnaligned = true,
      ),
    ),
    // Tightly coupled RAM: 1 KiB at 0x10000000, initialized from tcmbug.hex
    new IBusDBusCachedTightlyCoupledRam((0x10000000L, 1 KiB),
      ramAsBlackbox = false,
      hexInit = "tcmbug.hex",
      ramOffset = 0x10000000L),
    new StaticMemoryTranslatorPlugin(_ => False),
    new DecoderSimplePlugin,
    new RegFilePlugin(plugin.ASYNC),
    new SrcPlugin,
    new IntAluPlugin,
    new HazardSimplePlugin(),
    new BranchPlugin(earlyBranch = false),
    new CsrPlugin(CsrPluginConfig.smallest(0)),
  )))

  // Tie off the external buses: every access outside the TCM
  // completes immediately with an error response.
  for (plugin <- cpu.plugins) plugin match {
    case plugin : IBusCachedPlugin =>
      plugin.iBus.cmd.ready := True
      plugin.iBus.rsp.valid := True
      plugin.iBus.rsp.data := 0
      plugin.iBus.rsp.error := True
    case plugin : DBusCachedPlugin =>
      plugin.dBus.cmd.ready := True
      plugin.dBus.rsp.valid := True
      plugin.dBus.rsp.data := 0
      plugin.dBus.rsp.error := True
    case plugin : CsrPlugin =>
      // No interrupts are needed for this test.
      plugin.timerInterrupt := False
      plugin.externalInterrupt := False
    case _ =>
  }
}

object TCMBug extends App {
  import spinal.core.sim._

  // Clock period of 10 time units; run for 10000 time units and dump an FST waveform.
  SimConfig.withFstWave.compile(new TCMBug).doSim{ dut =>
    dut.clockDomain.forkStimulus(10)
    sleep(10000)
  }
}
In the simulation result, the first store (instruction at 1000000c) is properly halted and not executed. The second store (instruction at 10000018), however, is written to the TCM, even though that instruction should also be skipped.
From what I understand reading through the code, the following happens:
SrcPlugin.scala, line 77: SRC_ADD is set to the operand of the branch instruction (x1 in our case) or the destination address of a store.
DBusCachedPlugin.scala, line 439: MEMORY_TIGHTLY is set in the execute stage whenever SRC_ADD is in the range of the TCM, regardless of the actual instruction being executed.
DBusCachedPlugin.scala, line 478: HAS_SIDE_EFFECT is overridden to low in the memory stage whenever MEMORY_TIGHTLY is set. Why is this done regardless of the instruction being executed?
DBusCachedPlugin.scala, line 442: Because HAS_SIDE_EFFECT is low in the memory stage (processing the branch), the execute stage (processing the store) is not halted.
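To make the chain above easier to follow, here is a small behavioral sketch in plain Scala. It is not VexRiscv source code: the names `Instr`, `memoryTightly`, `sideEffectInMemoryStage` and `storeHalted` are hypothetical, the branch operand value is made up, and the logic only mirrors my reading of the four steps. The `gateOnMemoryOp` flag illustrates one possible direction for a fix (only treating real memory instructions as TCM accesses); I have not tested that against the actual RTL.

```scala
// Behavioral sketch of the suspected hazard, NOT actual VexRiscv code.
object TcmHazardSketch {
  // TCM mapping taken from the example configuration: 1 KiB at 0x10000000.
  val TcmBase = 0x10000000L
  val TcmSize = 1024L

  sealed trait Kind
  case object Branch extends Kind
  case object Store extends Kind

  // srcAdd models SRC_ADD: the branch operand, or the store's target address.
  // hasSideEffect models the decoded HAS_SIDE_EFFECT flag (assumed).
  final case class Instr(kind: Kind, srcAdd: Long, hasSideEffect: Boolean)

  def inTcm(addr: Long): Boolean =
    addr >= TcmBase && addr < TcmBase + TcmSize

  // Step 2: MEMORY_TIGHTLY depends only on the address unless gating is on.
  // gateOnMemoryOp = true is a hypothetical fix, not the current behaviour.
  def memoryTightly(i: Instr, gateOnMemoryOp: Boolean): Boolean =
    inTcm(i.srcAdd) && (!gateOnMemoryOp || i.kind == Store)

  // Step 3: HAS_SIDE_EFFECT is forced low whenever MEMORY_TIGHTLY is set.
  def sideEffectInMemoryStage(i: Instr, gateOnMemoryOp: Boolean): Boolean =
    i.hasSideEffect && !memoryTightly(i, gateOnMemoryOp)

  // Step 4: a store in the execute stage is halted only while the
  // instruction in the memory stage still reports a side effect.
  def storeHalted(inMemoryStage: Instr, gateOnMemoryOp: Boolean): Boolean =
    sideEffectInMemoryStage(inMemoryStage, gateOnMemoryOp)

  def main(args: Array[String]): Unit = {
    // Hypothetical operand values, chosen only to land inside/outside the TCM.
    val branchOutsideTcm = Instr(Branch, srcAdd = 0x0L, hasSideEffect = true)
    val branchInsideTcm  = Instr(Branch, srcAdd = 0x10000020L, hasSideEffect = true)

    println(s"operand outside TCM, store halted: ${storeHalted(branchOutsideTcm, gateOnMemoryOp = false)}")
    println(s"operand inside TCM,  store halted: ${storeHalted(branchInsideTcm, gateOnMemoryOp = false)}")
    println(s"operand inside TCM,  gated:        ${storeHalted(branchInsideTcm, gateOnMemoryOp = true)}")
  }
}
```

In this model, a branch whose operand falls inside the TCM range clears its own side-effect flag, so the following store is not halted. That matches the second store in the simulation, while the first store behaves like the case where the operand is outside the range.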
tcmbug.hex
Compiled from the following two files with:
tcmbug.S
tcmbug.ld
Simulation Result
Full waveform