lowRISC / qemu

Fork of QEMU for development of lowRISC platforms (including OpenTitan)
http://www.qemu.org

Cannot single step in rom0 #63

Closed stefanberger closed 4 months ago

stefanberger commented 5 months ago

I have started your version of QEMU with the following command line and I see the first instructions in gdb-multiarch but cannot do a stepi to start single stepping. It simply hangs at address 0x8080. I am running this on an x86_64 host:

./qemu-opentitan/build/qemu-system-riscv32   \
  -object ot-rom-img,id=rom0,file=./sbe/ot-rom.bin,digest=fake,addr=0x8080  \
  -gdb tcp::9000 \
  -nographic \
  -nodefaults \
  -monitor stdio \
  -m 128k  \
  -machine ot-darjeeling \
  -S

I have been using the upstream Qemu with the following command line and single stepping is not an issue (also on x86_64 host), though the start address here is 0x20000400.

./qemu/build/qemu-system-riscv32 \
   -bios ./bios.elf \
   -gdb tcp::9000 \
   -nographic \
   -nodefaults \
   -serial stdio \
   -m 128k  \
   -machine opentitan \
  -S

I am wondering what I may be doing wrong when trying to run the code out of rom0.

loiclefort commented 5 months ago

You may need -global ot-ibex_wrapper-dj.lc-ignore=on (see docs/opentitan/darjeeling.md section "Useful execution options") or provide an OTP image in the correct LifeCycle state to have the Ibex core start fetching.

One remark since you are using ot-darjeeling: the upstream opentitan machine is closer to ot-earlgrey (OpenTitan standalone) than ot-darjeeling (OpenTitan integrated). With ot-earlgrey you would not need the lc-ignore option (LifeCycle connection to Ibex is not yet emulated for EarlGrey).

stefanberger commented 5 months ago

@loiclefort Thanks. This -global ot-ibex_wrapper-dj.lc-ignore=on option enabled the single stepping. I am planning on using darjeeling because of larger ROM space.

Another question: How can I find the base address of RAM. Upstream opentitan shows this in info mtree:

(qemu) info mtree
address-space: I/O
  0000000000000000-000000000000ffff (prio 0, i/o): io

address-space: cpu-memory-0
address-space: memory
  0000000000000000-ffffffffffffffff (prio 0, i/o): system
    0000000000008000-000000000000ffff (prio 0, rom): riscv.lowrisc.ibex.rom
    0000000010000000-000000001001ffff (prio 0, ram): riscv.lowrisc.ibex.ram
    0000000020000000-00000000200fffff (prio 0, rom): riscv.lowrisc.ibex.flash
[...]

RAM there starts at 0x10000000. Where is RAM for ot-darjeeling? It doesn't seem to be this here:

    0000000010000000-000000001000ffff (prio 0, i/o): alias ot-sram-ctrl-mem @ot-sram-ctrl-mem-init 0000000000000000-000000000000ffff

Btw, darjeeling should maybe reject rom and require rom0 or rom1.

stefanberger commented 5 months ago

It looks like for ot-darjeeling one has to program the sram controller to get access to the sram, is that right? Since I didn't want to do this I added -global ot-sram-ctrl.noinit=on to the command line and now at least I can access the sram from gdb, but once the code writes data to the stack with sw s0,8(sp) it again stops...

stefanberger commented 5 months ago

After now initializing the SRAM controller and getting access to sram via x command on QEMU and gdb I still cannot read or write to the SRAM from code:

    # init sram
    lui  a5,0x411c0
    li   a4,1
    sw   a4,0x10(a5)  # write REGWEN to CTRL_REGWEN

    li   a4,2
    sw   a4,0x14(a5)  # write INIT to CTRL

    sw   a4,0x8(a5)   # write EN to EXEC_REGWEN

    # wait for INIT flag to be set
    li   a3,0x20
1:
    lw   a4,4(a5)
    bne  a4,a3,1b
    # init sram done

    # test memory access
    lui  a5,0x10000
    li   a4,0
    sw   a4,0(a5)

The sw fails and I get a store exception (mcause=0x7, mtval=0x10000000). Any idea what's going on with this memory?

loiclefort commented 5 months ago

The default PMP configuration only gives access to the first 2KB of ROM0, the Debug ROM and the MMIO region. You need to configure the PMP to allow SRAM access or disable the default PMP configuration (-M ot-darjeeling,no_epmp_cfg=true).

(edit) Also, the first instruction in your code should be lui a5,0x211c0 (controller for the main SRAM is at 0x211c_0000)
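
For the first option (configuring the PMP), one common approach is a NAPOT entry that covers the SRAM. The sketch below only computes the values that would go into a pmpaddr register and the matching pmpcfg byte, following the encoding in the RISC-V privileged specification. The helper name pmp_napot_addr and the macro PMP_CFG_NAPOT_RW are hypothetical, and the 64 KiB region at 0x10000000 is an assumption taken from the info mtree line quoted earlier. Ibex's ePMP settings may impose further constraints that are not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* pmpcfg permission and address-matching bits (RISC-V privileged spec). */
#define PMP_R       0x01u
#define PMP_W       0x02u
#define PMP_X       0x04u
#define PMP_A_NAPOT 0x18u /* A field (bits 4:3) set to 3 */

/* Hypothetical convenience value: NAPOT region, read and write allowed. */
#define PMP_CFG_NAPOT_RW (PMP_A_NAPOT | PMP_R | PMP_W)

/*
 * Encode a naturally aligned power-of-two region for a pmpaddr register.
 * size must be a power of two of at least 8 bytes, and base must be
 * aligned to size.
 */
static inline uint32_t pmp_napot_addr(uint32_t base, uint32_t size)
{
    assert(size >= 8u && (size & (size - 1u)) == 0u);
    assert((base & (size - 1u)) == 0u);
    /* The trailing one bits below the base encode the region size. */
    return (base | (size / 2u - 1u)) >> 2;
}
```

On the target, the two values would be written with csrw to a free pmpaddrN register and to the corresponding byte of pmpcfg0. Which entry is free depends on the default configuration, which this sketch does not cover.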

rivos-eblot commented 5 months ago

Note that the -m option is not used by the Darjeeling machine. There are 3 memory controllers whose size cannot be changed from the command line. You need to edit hw/riscv/ot_darjeeling.c if you want to tweak the SoC definitions.

stefanberger commented 5 months ago

Thanks. Also I found this register description here: https://opentitan.org/book/hw/ip/sram_ctrl/index.html

I am programming the pmpcfg0 and pmpaddr0 to allow all addresses to 'work'. While playing around with earlgrey and darjeeling machines I noticed that they seem to behave differently while running first code in rom/rom0. On earlgrey I had gone into the loop waiting for the memory to be initialized, i.e. waiting for INIT_DONE in STATUS register. On darjeeling the INIT_DONE flag seems to be set right away and I end up having to write zeros to all addresses to initialize the memory space. Is this intended or did I 'step' onto something?
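
The INIT / INIT_DONE handshake described above could be sketched in C roughly as follows. The register offsets and bit values are the ones used in the assembly earlier in this thread (STATUS at 0x4 with INIT_DONE = 0x20, CTRL at 0x14 with INIT = 0x2); the function name sram_ctrl_init and the bounded poll count are hypothetical additions. It masks the INIT_DONE bit instead of comparing the whole STATUS word, so other status bits do not stall the loop.

```c
#include <stdbool.h>
#include <stdint.h>

/* Word indexes derived from the byte offsets used in the thread's assembly. */
#define SRAM_CTRL_STATUS (0x04u / 4u)
#define SRAM_CTRL_CTRL   (0x14u / 4u)
#define STATUS_INIT_DONE 0x20u
#define CTRL_INIT        0x02u

/*
 * Request SRAM initialization and poll for completion.
 * regs points at the controller's register block; max_polls bounds the
 * wait so that a missing INIT_DONE cannot hang the caller.
 */
static inline bool sram_ctrl_init(volatile uint32_t *regs, unsigned max_polls)
{
    regs[SRAM_CTRL_CTRL] = CTRL_INIT;
    for (unsigned i = 0; i < max_polls; i++) {
        /* Test only the INIT_DONE bit, not the whole register. */
        if (regs[SRAM_CTRL_STATUS] & STATUS_INIT_DONE) {
            return true;
        }
    }
    return false;
}
```

On the Darjeeling machine the pointer would be the main SRAM controller base mentioned above, e.g. (volatile uint32_t *)0x211c0000; on a host it can be exercised against a plain array.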

Also this here didn't seem right:

diff --git a/hw/opentitan/ot_sram_ctrl.c b/hw/opentitan/ot_sram_ctrl.c
index 25457460da..490d37112f 100644
@@ -409,7 +410,7 @@ static void ot_sram_ctrl_regs_write(void *opaque, hwaddr addr, uint64_t val64,
     case R_CTRL:
         if (s->regs[R_CTRL_REGWEN]) { /* WO */
             val32 &= R_CTRL_INIT_MASK | R_CTRL_RENEW_SCR_KEY_MASK;
-            uint32_t trig = (val32 ^ s->regs[val32]) & val32;
+            uint32_t trig = (val32 ^ s->regs[reg]) & val32;
             /* storing value prevents from trigerring again before completion */
             s->regs[reg] = val32;
             if (trig & R_CTRL_RENEW_SCR_KEY_MASK) {
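
The intent of the trig expression in the patch above appears to be rising-edge detection: only bits that were 0 in the stored register and are 1 in the written value should trigger an action. Indexing s->regs with the written value instead of the register index compares against an unrelated register. A minimal standalone sketch of the idiom (rising_bits is a hypothetical name, not a QEMU function):

```c
#include <stdint.h>

/*
 * Return the bits that are 0 in prev and 1 in next, i.e. the bits whose
 * write should start an action. Bits already set in prev do not retrigger.
 */
static inline uint32_t rising_bits(uint32_t prev, uint32_t next)
{
    return (next ^ prev) & next;
}
```

With prev = 0 and next = 2 the result is 2 (INIT triggers); writing 2 again while the stored value is still 2 yields 0, which is what the "storing value prevents from triggering again" comment in the patch relies on.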

Another question that I have is whether on darjeeling there are other ways to make a file with code or data memory mapped (besides rom0 and rom1)? I have been looking at the OTP tool you provide but I am not sure what to give it as input. I would like to give it an elf or binary as input... is that possible?

rivos-eblot commented 5 months ago

It seems indeed wrong, will have a look at it tomorrow. Thanks. Earlgrey port is not really maintained (lack of resources) at the moment. Best effort is not breaking what does work, but it is far less complete than Darjeeling is.

rivos-eblot commented 5 months ago

> Another question that I have is whether on darjeeling there are other ways to make a file with code or data memory mapped (besides rom0 and rom1)? I have been looking at the OTP tool you provide but I am not sure what to give it as input. I would like to give it an elf or binary as input... is that possible?

The general rule for Darjeeling is that QEMU tries to closely follow the HW. This means it is supposed to boot from a hard-coded address with a predefined PMP configuration, execute the ROM code which should validate the next stage, i.e. load from SPI flash some verified code into SRAM or external RAM and execute from there.

However, there are special option switches to disable these default behaviors and let QEMU be more versatile than the hard-coded HW behavior. For example, it is possible to load a simple ELF file, and QEMU should execute from its entry point without requiring a ROM. What is needed is this use case:

OTP tool takes .vmem files as its input data stream, which are generated by the opentitan build system. The content of the OTP area is rather strict, and you cannot use it to execute code - at least it is definitely not designed to do so.

rivos-eblot commented 5 months ago

> Also this here didn't seem right:

Good catch. There were actually a couple of other similar issues (copied n' pasted...): https://github.com/lowRISC/qemu/pull/64

stefanberger commented 5 months ago

@rivos-eblot The datasheet for darjeeling only mentions 64kb. So no larger RAM configuration is possible and it's only 64kb forever? Ideally one would be able to adjust this with -m 128k for example, but is that then outside the possible configuration?

Since I need more RAM I extended the RAM now locally. While initializing the memory with zeros (with an exception handler to intercept the write to unavailable memory) I determine the largest address available in main RAM. I suppose there's no other way to detect where it ends.

rivos-eblot commented 5 months ago

I'm not sure the specs for Darjeeling are frozen yet. There are several SRAMs in Darjeeling: main SRAM, always-on/retention SRAM, mailbox SRAM. Moreover, in a single machine there may be several Darjeeling instances, so supporting the -m option switch would target a somewhat arbitrary RAM. The Darjeeling SoC can also be instantiated in a machine with more RAM or even shared RAM...

I do not think the QEMU -m option should be used for complex SoCs; it is better suited to generic machines with a single (D)RAM area. Changing the default RAM size from the command line is not straightforward in other cases (machines with multiple RAMs), as there is no generic way to assign properties to a specific device in QEMU: the only available option is -global, which targets -all- instances of a device at once, i.e. it would resize all the SRAMs in the machine. The alternative is either to use QMP or to provide some configuration objects, as we've done with ROM configuration (see TYPE_OT_ROM_IMG).

Note that detecting the last mapped address through a bus error might also be fragile, as some device may be mapped directly after the last slot of an SRAM device, in which case you would end up writing to another device without noticing. Nevertheless, it is actually OK for now for the main SRAM device.

I'm not aware of any auto-discovery capabilities in OpenTitan HW. As the HW configuration may be tweaked at build time, even the relative position of registers in many OT device instances is impacted, not to mention IRQ and ALERT signal mapping.

TL;DR: the guest SW needs to be compiled for a given Darjeeling configuration, unfortunately. This is also true for Earlgrey: we chose to implement the CW310 variant as it enables running the same binaries on both the FPGA and QEMU, but another variant may differ and require SW code tweaks.

rivos-eblot commented 5 months ago

> Btw, darjeeling should maybe reject rom and require rom0 or rom1.

I'm not sure I follow this one. Darjeeling expects rom0 and rom1 arguments; were you referring to what is reported through info mtree?

loiclefort commented 5 months ago

> The datasheet for darjeeling only mentions 64kb. So no larger RAM configuration is possible and it's only 64kb forever?

Darjeeling reference SoC includes a 1MB CTN Shared SRAM, see top_darjeeling_pkg.sv. At least this is present in the open source reference design, a specific vendor may have different devices connected to CTN bus.

stefanberger commented 5 months ago

I have been playing around a bit with the PLIC and the timer. I noticed QEMU crashes when playing around with the cfg0 register and the step part of the register.

This value here works for me still:

    volatile unsigned int *cfg0 = (unsigned int *)(TIMER_BASE + 0x10c);

    *cfg0 = 0x100fff;

This one here causes a crash (added a 0):

    *cfg0 = 0x1000fff;

Also this one:

    *cfg0 = 0x0000f;

Here's the traceback for both:

Thread 3 "qemu-system-ris" received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0x7fffd15ff640 (LWP 921109)]
0x0000555555f24be0 in core::num::{impl#9}::checked_div (self=<optimized out>, rhs=<optimized out>) at /build/rustc-kAv1jW/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/core/src/num/mod.rs:1169
1169    /build/rustc-kAv1jW/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/core/src/num/mod.rs: No such file or directory.

I found the timer to generate irq #68 on the PLIC. I suppose also this number can be different by implementation or is it constant for Darjeeling? It seems to be 124 on Earlgrey.

stefanberger commented 5 months ago

>> Btw, darjeeling should maybe reject rom and require rom0 or rom1.

> I'm not sure I follow this one. Darjeeling expects rom0 and rom1 arguments; were you referring to what is reported through info mtree?

What I meant with it was that if one specifies rom for Darjeeling it should (maybe) reject the option typo since rom0 or rom1 are required.

rivos-eblot commented 5 months ago

>     volatile unsigned int *cfg0 = (unsigned int *)(TIMER_BASE + 0x10c);

Which value are you using for TIMER_BASE? It should be 0x30100000

> Thread 3 "qemu-system-ris" received signal SIGFPE, Arithmetic exception. [Switching to Thread 0x7fffd15ff640 (LWP 921109)] 0x0000555555f24be0 in core::num::{impl#9}::checked_div (self=<optimized out>, rhs=<optimized out>) at

I'm not sure how it ended up in some Rust code. As far as I can tell the only device that is implemented in Rust is the OTBN... You should try to enable the traces (-trace ...) to check which device is actually used, or run QEMU from GDB/LLDB.

> I found the timer to generate irq #68 on the PLIC. I suppose also this number can be different by implementation or is it constant for Darjeeling? It seems to be 124 on Earlgrey.

The memory map and the IRQs are defined in ot_darjeeling.c so yes, for now it is #68.

stefanberger commented 5 months ago

>>     volatile unsigned int *cfg0 = (unsigned int *)(TIMER_BASE + 0x10c);

> Which value are you using for TIMER_BASE? It should be 0x30100000

Yes, that's what I am using:

# define TIMER_BASE             0x30100000

>> Thread 3 "qemu-system-ris" received signal SIGFPE, Arithmetic exception. [Switching to Thread 0x7fffd15ff640 (LWP 921109)] 0x0000555555f24be0 in core::num::{impl#9}::checked_div (self=<optimized out>, rhs=<optimized out>) at

> I'm not sure how it ended up in some Rust code. As far as I can tell the only device that is implemented in Rust is the OTBN... You should try to enable the traces (-trace ...) to check which device is actually used, or run QEMU from GDB/LLDB.

It's due to this here -- strangely gdb just dumped the error without backtrace:

static int64_t ot_timer_ticks_to_ns(OtTimerState *s, uint64_t ticks)
{
    uint32_t prescaler = FIELD_EX32(s->regs[R_CFG0], CFG0, PRESCALE);
    uint32_t step = FIELD_EX32(s->regs[R_CFG0], CFG0, STEP);
    fprintf(stderr, "%s @ %u   ticks=%ld step=%d prescaler=%d\n", __func__, __LINE__, ticks, step, prescaler);
    uint64_t ns = muldiv64(ticks, (prescaler + 1u), step);
    ns = muldiv64(ns, NANOSECONDS_PER_SECOND, s->pclk);
    if (ns > INT64_MAX) {
        return INT64_MAX;
    }

    return (int64_t)ns;
}

ot_timer_ticks_to_ns @ 112   ticks=10000 step=0 prescaler=15

step=0 is bad. If I set it to '1' when it's 0 it doesn't crash.
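
This also explains which of the earlier test values crash. Assuming the CFG0 layout from the OpenTitan timer register description (PRESCALE in bits 11:0, STEP in bits 23:16), which matches the step=0 prescaler=15 debug output above for the value 0x0000f, the fields can be decoded as in this sketch. The helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed CFG0 layout: PRESCALE in bits 11:0, STEP in bits 23:16. */
static inline uint32_t cfg0_prescale(uint32_t cfg0)
{
    return cfg0 & 0xfffu;
}

static inline uint32_t cfg0_step(uint32_t cfg0)
{
    return (cfg0 >> 16) & 0xffu;
}
```

Under that assumption 0x100fff decodes to step 0x10, while both 0x1000fff (the extra digit lands above the STEP field) and 0x0000f decode to step 0, which is the divisor passed to muldiv64.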

rivos-eblot commented 4 months ago

Good catch. Step == 0 is not supported and will be fixed.
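
One possible shape for such a guard, as a standalone sketch rather than the actual fix: treat a zero step (or a zero clock) as "the counter never advances" and return the furthest representable deadline instead of dividing. The function name ticks_to_ns, its parameters, and the use of a 128-bit intermediate (a GCC/Clang extension) in place of QEMU's muldiv64 are all assumptions made for this illustration.

```c
#include <stdint.h>

#define NS_PER_SECOND 1000000000ull

/*
 * Convert timer ticks to nanoseconds. A zero step or zero clock means the
 * counter never advances, so report the furthest representable deadline
 * instead of dividing by zero.
 */
static inline int64_t ticks_to_ns(uint64_t ticks, uint32_t prescaler,
                                  uint32_t step, uint32_t pclk_hz)
{
    if (step == 0u || pclk_hz == 0u) {
        return INT64_MAX;
    }
    /* 128-bit intermediate keeps the multiplications from overflowing. */
    unsigned __int128 ns = (unsigned __int128)ticks * ((uint64_t)prescaler + 1u);
    ns /= step;
    ns = ns * NS_PER_SECOND / pclk_hz;
    return ns > (unsigned __int128)INT64_MAX ? INT64_MAX : (int64_t)ns;
}
```

Whether step == 0 should mean "never expires" or be rejected at register-write time is a design choice this sketch does not settle.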

stefanberger commented 4 months ago

Another question: You seem to be working with different watchdog registers than what the spec here uses. The spec has some 64bit registers for the wake-up timer where you only use 32bit registers, so at WDOG_REGWEN they are off-by-8. Which one is right, woof?

rivos-eblot commented 4 months ago

We started the OT emulation based on last year's version and have not updated the Earlgrey version beyond https://github.com/lowRISC/opentitan/releases/tag/Earlgrey-M2.5.2-RC0. Any HW IPs that have been updated since this release are not reflected in the QEMU OpenTitan implementation (lack of resources). The AON timer was updated in early March 2024 (commit: 86a48f1c).

We are now only working on OpenTitan Darjeeling, which has no official milestone so far. Moreover, our current HW branch for Darjeeling is not fully aligned with the latest changes. I hope it can converge back to OpenTitan Darjeeling at some point.

rivos-eblot commented 4 months ago

> Good catch. Step == 0 is not supported and will be fixed.

https://github.com/lowRISC/qemu/pull/65

stefanberger commented 4 months ago

I noticed that there was no bite after the bark. This fixes it for me:

diff --git a/hw/opentitan/ot_aon_timer.c b/hw/opentitan/ot_aon_timer.c
index c4ba28e82c..0f3b8d4e8c 100644
--- a/hw/opentitan/ot_aon_timer.c
+++ b/hw/opentitan/ot_aon_timer.c
@@ -258,18 +258,18 @@ static void ot_aon_timer_rearm_wdog(OtAonTimerState *s, bool reset_origin)
     uint32_t bite_threshold = s->regs[R_WDOG_BITE_THOLD];
     uint32_t threshold = 0;

-    if (count >= bark_threshold) {
-        s->regs[R_INTR_STATE] |= INTR_WDOG_TIMER_BARK_MASK;
-    } else {
-        threshold = bark_threshold;
-    }
-
     if (count >= bite_threshold) {
         s->wdog_bite = true;
-    } else if (bite_threshold < threshold) {
+    } else if (count < bite_threshold) {
         threshold = bite_threshold;
     }

+    if (count >= bark_threshold && !s->wdog_bite) {
+        s->regs[R_INTR_STATE] |= INTR_WDOG_TIMER_BARK_MASK;
+    } else if (bark_threshold < bite_threshold) {
+        threshold = bark_threshold;
+    }
+
     if (count >= threshold) {
         timer_del(s->wdog_timer);
     } else {

rivos-eblot commented 4 months ago

> I noticed that there was no bite after the bark. This fixes it for me:

I'm surprised, since when boot takes too much time the ROM gets rebooted by the watchdog. I'll let @loiclefort check since I do not really know this part. Could you open a new ticket for this issue and keep one topic per issue to help track them down? Do not forget to add the command line you use to start up QEMU, as different machines/options trigger different behavior. Thanks.