zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

[RISCV] running zephyr scheduler in user-/supervisor mode (instead of machine mode)? #68133

Open tswaehn opened 9 months ago

tswaehn commented 9 months ago

Is your feature request related to a problem? Please describe. It appears to me that on RISC-V the expectation is that Zephyr runs in machine mode (guessing from here). I am wondering why this is needed. Let's say we have a supervisor running multiple applications, where Zephyr is just one of them. Then Zephyr should be able to run in USER mode.

Describe the solution you'd like I think it could be accomplished by using a generic hook for the "yield" and a generic hook for the timer interrupt. Additionally, the switching routine could use configurable CSRs (currently fixed machine-mode CSRs are expected). Speaking of which, there are RISC-V CPUs where not all CSRs are implemented, e.g. mscratch might not be available, so one might want to redefine it and use another CSR instead.
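One way to picture the "generic hook" idea is a small table of function pointers that the privilege-mode-specific code fills in. This is only a sketch of the concept: `struct riscv_mode_hooks`, `riscv_set_mode_hooks()` and the demo hooks are hypothetical names and do not exist in Zephyr.

```c
#include <stddef.h>

/* Hypothetical hook table: one implementation per privilege mode. */
struct riscv_mode_hooks {
	void (*yield)(void);     /* how a thread requests a context switch */
	void (*timer_irq)(void); /* called on each timer interrupt */
};

static const struct riscv_mode_hooks *active_hooks;

/* Install the hooks; returns 0 on success, -1 if a hook is missing. */
static int riscv_set_mode_hooks(const struct riscv_mode_hooks *hooks)
{
	if (hooks == NULL || hooks->yield == NULL || hooks->timer_irq == NULL) {
		return -1;
	}
	active_hooks = hooks;
	return 0;
}

/* Mode-independent code calls through the table. */
static void arch_yield_via_hook(void)
{
	if (active_hooks != NULL) {
		active_hooks->yield();
	}
}

/* Demo hooks that only count calls, standing in for real ECALL/timer code. */
static int demo_yield_count;
static int demo_timer_count;

static void demo_yield(void)
{
	demo_yield_count++;
}

static void demo_timer_irq(void)
{
	demo_timer_count++;
}
```

A u-mode port would then install hooks that use whatever the host supervisor offers, while the m-mode port keeps its current ECALL and timer handling behind the same two entry points.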

Describe alternatives you've considered Copy and paste the RISCV architecture, name it RISCV-IN-USER, and replace the interrupt/ECALL handling. That might be a bit messy, because actually only the interrupt handling needs to change a little.

Did anybody already think about that, or is it complete nonsense?

carlescufi commented 9 months ago

@fkokosinski @carlocaione @npitre

npitre commented 9 months ago

Random notes:

I think moving Zephyr from m-mode to s-mode when available should be rather easy to do. More than that is probably doable given many restrictions, but whether it is worth it pretty much depends on your motivation.

tswaehn commented 9 months ago

Thank you for your notes. I mostly agree with them.

Having Zephyr running entirely in u-mode would be more tricky. It wouldn't be able to support its own "user mode" threads of course, and some special abstractions around IRQ manipulation and handling would be needed.

Regarding Zephyr running fully in user mode: I am thinking of a case where user interrupts can be used in exactly the same way as in machine mode, i.e. MTVEC => UTVEC; MSTATUS => USTATUS; MRET => URET, ... and so on.

Additionally, I was wondering: if I proposed changes along those lines and created a PR, would that be acceptable, or would it be rejected for strategic / theoretical / conceptual reasons?

npitre commented 9 months ago

MTVEC => UTVEC; MSTATUS => USTATUS; MRET => URET, ... and so on.

Right. If UTVEC and friends are implemented then MSTATUS/SSTATUS/USTATUS etc. could be abstracted behind some macros. Linux does this:

#ifdef CONFIG_RISCV_M_MODE
# define CSR_STATUS     CSR_MSTATUS
# define CSR_IE         CSR_MIE
# define CSR_TVEC       CSR_MTVEC
# define CSR_SCRATCH    CSR_MSCRATCH
# define CSR_EPC        CSR_MEPC
# define CSR_CAUSE      CSR_MCAUSE
# define CSR_TVAL       CSR_MTVAL
# define CSR_IP         CSR_MIP
[...]
#else /* CONFIG_RISCV_M_MODE */
# define CSR_STATUS     CSR_SSTATUS
# define CSR_IE         CSR_SIE
# define CSR_TVEC       CSR_STVEC
# define CSR_SCRATCH    CSR_SSCRATCH
# define CSR_EPC        CSR_SEPC
# define CSR_CAUSE      CSR_SCAUSE
# define CSR_TVAL       CSR_STVAL
# define CSR_IP         CSR_SIP
[...]
#endif

And then only a few corner cases are conditionally selected with CONFIG_RISCV_M_MODE in the actual code. Applying this approach to Zephyr and extending it to u-mode as well (for available U regs) would be perfectly fine to me.
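Extending that scheme to three modes might look roughly like the sketch below. The CSR numbers are the ones assigned in the RISC-V privileged specification; the U-mode trap CSRs come from the draft "N" (user-level interrupts) extension, which as far as I know was dropped before ratification, so they are only usable on cores that actually implement them. `RISCV_KERNEL_MODE` and the `RISCV_MODE_*` values are hypothetical selectors for this example, not existing Zephyr Kconfig symbols.

```c
/* CSR numbers from the RISC-V privileged specification. */
#define CSR_USTATUS   0x000 /* draft N extension */
#define CSR_UTVEC     0x005 /* draft N extension */
#define CSR_USCRATCH  0x040 /* draft N extension */
#define CSR_UEPC      0x041 /* draft N extension */
#define CSR_UCAUSE    0x042 /* draft N extension */
#define CSR_SSTATUS   0x100
#define CSR_STVEC     0x105
#define CSR_SSCRATCH  0x140
#define CSR_SEPC      0x141
#define CSR_SCAUSE    0x142
#define CSR_MSTATUS   0x300
#define CSR_MTVEC     0x305
#define CSR_MSCRATCH  0x340
#define CSR_MEPC      0x341
#define CSR_MCAUSE    0x342

/* Hypothetical build-time selector; defaults to today's m-mode behaviour. */
#define RISCV_MODE_U 0
#define RISCV_MODE_S 1
#define RISCV_MODE_M 3

#ifndef RISCV_KERNEL_MODE
#define RISCV_KERNEL_MODE RISCV_MODE_M
#endif

#if RISCV_KERNEL_MODE == RISCV_MODE_M
# define CSR_STATUS   CSR_MSTATUS
# define CSR_TVEC     CSR_MTVEC
# define CSR_SCRATCH  CSR_MSCRATCH
# define CSR_EPC      CSR_MEPC
# define CSR_CAUSE    CSR_MCAUSE
#elif RISCV_KERNEL_MODE == RISCV_MODE_S
# define CSR_STATUS   CSR_SSTATUS
# define CSR_TVEC     CSR_STVEC
# define CSR_SCRATCH  CSR_SSCRATCH
# define CSR_EPC      CSR_SEPC
# define CSR_CAUSE    CSR_SCAUSE
#elif RISCV_KERNEL_MODE == RISCV_MODE_U
# define CSR_STATUS   CSR_USTATUS
# define CSR_TVEC     CSR_UTVEC
# define CSR_SCRATCH  CSR_USCRATCH
# define CSR_EPC      CSR_UEPC
# define CSR_CAUSE    CSR_UCAUSE
#else
# error "unknown RISCV_KERNEL_MODE"
#endif

/* Mode-independent code only ever refers to the generic names. */
static inline unsigned int riscv_trap_vector_csr(void)
{
	return CSR_TVEC;
}
```

Building with `-DRISCV_KERNEL_MODE=RISCV_MODE_S` would switch every generic name to its s-mode counterpart without touching the code that uses them.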

However, given this is deep architecture stuff, we'd need a way for CI to make sure the u-mode build won't go broken i.e. this needs to work using QEMU. I don't know if QEMU has e.g. UTVEC implemented but a quick grep seems to indicate it is not.

tswaehn commented 9 months ago

However, given this is deep architecture stuff, we'd need a way for CI to make sure the u-mode build won't go broken i.e. this needs to work using QEMU. I don't know if QEMU has e.g. UTVEC implemented but a quick grep seems to indicate it is not.

It may also need some m-mode/s-mode startup code to prepare the forwarding/delegation of interrupts from machine to user mode, and to take care of the actual timer setup (user mode may not have direct access to it). Additionally, I am wondering if a syscall interface like OpenSBI's could be used to communicate between user mode and the machine-mode supervisor.
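For reference, an SBI call is an `ecall` with the extension ID in a7, the function ID in a6, and arguments in a0 onwards; the firmware returns an error code in a0 and a value in a1. The sketch below wraps the timer extension's `set_timer` call. Note the assumptions: the SBI specification defines this interface for S-mode callers, so using it from u-mode as proposed here would be a customization, and the non-RISC-V branch is only a stub so the example also compiles on a host machine.

```c
#include <stdint.h>

/* Extension and function IDs from the SBI specification. */
#define SBI_EXT_TIME            0x54494D45 /* "TIME" */
#define SBI_EXT_TIME_SET_TIMER  0
#define SBI_ERR_NOT_SUPPORTED   (-2)

struct sbiret {
	long error;
	long value;
};

/* Generic two-argument SBI call: a7 = extension ID, a6 = function ID. */
static struct sbiret sbi_ecall(unsigned long eid, unsigned long fid,
			       unsigned long arg0, unsigned long arg1)
{
	struct sbiret ret;
#if defined(__riscv)
	register unsigned long a0 __asm__("a0") = arg0;
	register unsigned long a1 __asm__("a1") = arg1;
	register unsigned long a6 __asm__("a6") = fid;
	register unsigned long a7 __asm__("a7") = eid;

	__asm__ volatile("ecall"
			 : "+r"(a0), "+r"(a1)
			 : "r"(a6), "r"(a7)
			 : "memory");
	ret.error = (long)a0;
	ret.value = (long)a1;
#else
	/* Host build, for illustration only: there is no SBI to call. */
	(void)eid;
	(void)fid;
	(void)arg0;
	(void)arg1;
	ret.error = SBI_ERR_NOT_SUPPORTED;
	ret.value = 0;
#endif
	return ret;
}

/* Ask the firmware to raise a timer interrupt at absolute time stime_value. */
static struct sbiret sbi_set_timer(uint64_t stime_value)
{
#if defined(__riscv) && (__riscv_xlen == 32)
	/* On RV32 the 64-bit value is split across a0 (low) and a1 (high). */
	return sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER,
			 (unsigned long)stime_value,
			 (unsigned long)(stime_value >> 32));
#else
	return sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER,
			 (unsigned long)stime_value, 0);
#endif
}
```

A timer driver built on this would call `sbi_set_timer()` instead of writing the machine timer compare register directly, which is the part a less privileged mode cannot do itself.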

achech commented 9 months ago

If you decide to use https://github.com/riscv-software-src/opensbi, it will take over/handle M-mode, leaving you with S and U mode for the OS.

tswaehn commented 9 months ago

If you decide to use https://github.com/riscv-software-src/opensbi, it will take over/handle M-mode, leaving you with S and U mode for the OS.

Yes, then the Zephyr OS will run in S or U mode, e.g. running a safe device with secure/certified drivers and a user-mode Zephyr OS for the application.

Additionally, the user-mode Zephyr OS will be compatible with any host providing the OpenSBI interface.

con-pax commented 8 months ago

Just to note, I don't think there is an equivalent to mhartid in less privileged modes (though it could be passed in a0). There would be a chunk of work needed for SMP and the PLIC driver, for instance. Also to note, I think Zephyr running in s-mode would be amazing! 😃

achech commented 8 months ago

The mhartid is transferred in https://github.com/riscv-software-src/opensbi/blob/master/lib/sbi/sbi_hsm.c#L157 for both init_coldboot and init_warm_startup.

tswaehn commented 8 months ago

checklist for changes

... in progress

con-pax commented 8 months ago

checklist for changes

  • [ ] custom version of z_soc_irq_lock()
  • [ ] replace __initialize - reading of HART from mhartid => uhartid
  • [ ] replace __soc_handle_irq()
  • [ ] just to be on safe side, there should be a RISCV toolchain option where the M-extension is not included

... in progress

Really cool! Let me know if there is anything you want tested from my side. I just so happen to have a board with a heterogeneous core complex sitting on my desk 😃

tswaehn commented 8 months ago

Zephyr has the option to be compiled with CONFIG_USERSPACE. This looks pretty good, except that syscall IDs are auto-generated, whereas OpenSBI's are predefined following a fixed scheme. Assuming that the user space application and the kernel are compiled in separate steps, I would prefer a fixed syscall ID numbering scheme. Am I missing something here, and can we configure Zephyr to use a fixed syscall ID scheme?

achech commented 8 months ago

Enabling CONFIG_USERSPACE just enables user (non-kernel) threads to run alongside kernel ones. AFAIK Zephyr is compiled in one step.

npitre commented 8 months ago

Zephyr does not compile user space application and kernel separately. The "user space" in Zephyr is simply a way to isolate some threads by granting them a minimum amount of system privileges at run time. System call numbers are therefore only an internal implementation mechanism and not meant to be a public interface.
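For contrast, SBI extension IDs are a public, fixed interface: several of them are simply the ASCII bytes of a short name packed into an integer (for example "TIME" and "HSM"). A small sketch of that encoding, where `sbi_eid_from_name()` is a hypothetical helper written for this example and not part of OpenSBI or Zephyr:

```c
#include <stdint.h>

/* Pack up to four ASCII characters into an SBI-style extension ID. */
static uint32_t sbi_eid_from_name(const char *name)
{
	uint32_t eid = 0;

	for (int i = 0; i < 4 && name[i] != '\0'; i++) {
		eid = (eid << 8) | (uint8_t)name[i];
	}
	return eid;
}
```

With this, `sbi_eid_from_name("TIME")` yields 0x54494D45 and `sbi_eid_from_name("HSM")` yields 0x48534D, matching the IDs in the SBI specification. Zephyr's generated syscall numbers carry no such stability guarantee between builds.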

tswaehn commented 5 months ago

Yes, that's correct. The idea is as follows:

communication from USER to MACHINE via openSBI

Currently Zephyr is not meant to make use of USER mode at all, but it would be great to have the option to do that.

update

to get the ball rolling, we could start to prepare zephyr to be able to run in USER mode, ...

achech commented 5 months ago

In my opinion, what would be great is to introduce (make use of) S-mode, as defined in Privileged Architecture Version 1.10, if that is still considered valid. There, the S-mode is said to be "entry point defined by the system binary interface (SBI)".

tswaehn commented 5 months ago

Agree.

There are architectures with Machine / Supervisor / User mode or a subset of them (Machine / User). So as a first step, Zephyr should be able to run in any mode: Machine, Supervisor, or User.

tswaehn commented 3 months ago

Before that fix, zephyr was able to start only in machine mode. With that fix, zephyr can start (as guest) in user mode.

the idea is:

PR: https://github.com/zephyrproject-rtos/zephyr/pull/76682

notes:

I tested both starting Zephyr in machine mode and starting it in user mode. Works for me.

tswaehn commented 3 months ago

with this fix, we could run something like this:

image

The box labeled "Operation System Kernel" could actually be Zephyr.

source

side note

I tested this on my hardware, including the (customized) OpenSBI, and it works: PR #76682

tswaehn commented 2 months ago

Just added drivers for OpenSBI running in QEMU (qemu-system-riscv32) to https://github.com/zephyrproject-rtos/zephyr/pull/76682. Please find instructions to test below:

make PLATFORM=generic CROSS_COMPILE=riscv32-unknown-elf- PLATFORM_RISCV_XLEN=32 FW_TEXT_START=0x80000000 FW_JUMP_ADDR=0x80400000 FW_JUMP_FDT_ADDR=0x80800000

finally run in qemu:

qemu-system-riscv32 -machine virt -nographic -m 128M -bios openSBI/build/platform/generic/firmware/fw_jump.bin -kernel build/zephyr/zephyr_app.bin
OpenSBI v1.4
   ____                    _____ ____ _____
  / __ \                  / ____|  _ \_   _|
 | |  | |_ __   ___ _ __ | (___ | |_) || |
 | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
 | |__| | |_) |  __/ | | |____) | |_) || |_
  \____/| .__/ \___|_| |_|_____/|____/_____|
        | |
        |_|

Platform Name             : riscv-virtio,qemu
Platform Features         : medeleg
Platform HART Count       : 1
Platform IPI Device       : aclint-mswi
Platform Timer Device     : aclint-mtimer @ 10000000Hz
Platform Console Device   : uart8250
Platform HSM Device       : ---
Platform PMU Device       : ---
Platform Reboot Device    : syscon-reboot
Platform Shutdown Device  : syscon-poweroff
Platform Suspend Device   : ---
Platform CPPC Device      : ---
Firmware Base             : 0x80000000
Firmware Size             : 187 KB
Firmware RW Offset        : 0x20000
Firmware RW Size          : 59 KB
Firmware Heap Offset      : 0x26000
Firmware Heap Size        : 35 KB (total), 2 KB (reserved), 9 KB (used), 24 KB (free)
Firmware Scratch Size     : 4096 B (total), 184 B (used), 3912 B (free)
Runtime SBI Version       : 2.0

Domain0 Name              : root
Domain0 Boot HART         : 0
Domain0 HARTs             : 0*
Domain0 Region00          : 0x00100000-0x00100fff M: (I,R,W) S/U: (R,W)
Domain0 Region01          : 0x10000000-0x10000fff M: (I,R,W) S/U: (R,W)
Domain0 Region02          : 0x02000000-0x0200ffff M: (I,R,W) S/U: ()
Domain0 Region03          : 0x80020000-0x8002ffff M: (R,W) S/U: ()
Domain0 Region04          : 0x80000000-0x8001ffff M: (R,X) S/U: ()
Domain0 Region05          : 0x0c400000-0x0c5fffff M: (I,R,W) S/U: (R,W)
Domain0 Region06          : 0x0c000000-0x0c3fffff M: (I,R,W) S/U: (R,W)
Domain0 Region07          : 0x00000000-0xffffffff M: () S/U: (R,W,X)
Domain0 Next Address      : 0x80400000
Domain0 Next Arg1         : 0x80800000
Domain0 Next Mode         : S-mode
Domain0 SysReset          : yes
Domain0 SysSuspend        : yes

Boot HART ID              : 0
Boot HART Domain          : root
Boot HART Priv Version    : v1.12
Boot HART Base ISA        : rv32imafdch
Boot HART ISA Extensions  : sstc,zicntr,zihpm
Boot HART PMP Count       : 16
Boot HART PMP Granularity : 2 bits
Boot HART PMP Address Bits: 32
Boot HART MHPM Info       : 16 (0x0007fff8)
Boot HART MIDELEG         : 0x00001666
Boot HART MEDELEG         : 0x00f0b509

*** Booting Zephyr OS build v3.6.0-9172-g3f3978d9d9ff ***
[00:00:00.000,000] <dbg> main: main: started user app() 487399
[00:00:00.000,000] <inf> main: loop 0

con-pax commented 2 months ago

Hey @tswaehn a quick update. I thought I would update here instead of spamming the PR. I have managed to get SMP boot in Supervisor mode (with external, software and timer interrupts) on hardware (again, the PolarFire SoC Icicle Kit). I have rebased and pushed the commits to the previous branch. I implemented the start hart SBI call and it works a charm! So there actually is no real need to read m/s/uhartid, as the SBI start hart puts the booting hart's id into a0 on return.
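Under the SBI HSM convention a started hart begins execution with its hart ID in a0 and an opaque parameter in a1. Since the RISC-V C calling convention passes the first two integer arguments in a0 and a1, an assembly entry stub can hand them straight to a C function. The following is a minimal sketch of that idea only; `smode_hart_entry()`, `hart_online[]` and `MAX_HARTS` are hypothetical names, not taken from the linked branch.

```c
#include <stdint.h>

#define MAX_HARTS 8 /* assumption for this sketch */

static volatile uint8_t hart_online[MAX_HARTS];

/*
 * Hypothetical C entry point, reached from the assembly entry stub with
 * a0/a1 untouched, so no hart ID CSR has to be read at all.
 */
static int smode_hart_entry(unsigned long hartid, unsigned long opaque)
{
	(void)opaque; /* a1: value supplied by whoever started this hart */

	if (hartid >= MAX_HARTS) {
		return -1;
	}
	hart_online[hartid] = 1;
	return 0;
}
```

The boot hart gets the same treatment from the firmware at cold boot, so one entry path can serve both primary and secondary harts.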

I am working on the QEMU side of things now. Zephyr's toolchain cannot compile OpenSBI, as it requires position-independent executables (-fPIE, -pie). However, OpenSBI provides binaries as assets attached to their GitHub repository. These could possibly be pulled in as a module or blob (or the repo could be forked into ZephyrProject, whatever is the correct thing to do). Anyway, as I see it, that aspect will have to be a separate PR. In fact, I would imagine getting this support in will end up being 3 or 4 PRs:

  • PR 1: testing (OpenSBI as module, west manifest change)
  • PR 2: including the SBI interface
  • PR 3: implementing
  • PR 4: board support, etc.

Here is the output of SMP boot in S mode:

OpenSBI v1.2
   ____                    _____ ____ _____
  / __ \                  / ____|  _ \_   _|
 | |  | |_ __   ___ _ __ | (___ | |_) || |
 | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
 | |__| | |_) |  __/ | | |____) | |_) || |_
  \____/| .__/ \___|_| |_|_____/|____/_____|
        | |
        |_|

Platform Name             : Microchip PolarFire(R) SoC
Platform Features         : medeleg
Platform HART Count       : 5
Platform IPI Device       : aclint-mswi
Platform Timer Device     : aclint-mtimer @ 1000000Hz
Platform Console Device   : mmuart
Platform HSM Device       : mpfs_hsm
Platform PMU Device       : ---
Platform Reboot Device    : mpfs_reset
Platform Shutdown Device  : mpfs_reset
Firmware Base             : 0xa000000
Firmware Size             : 144 KB
Runtime SBI Version       : 1.0

Domain0 Name              : root
Domain0 Boot HART         : 1
Domain0 HARTs             : 1,2,3,4
Domain0 Region00          : 0x0000000002008000-0x000000000200bfff (I)
Domain0 Region01          : 0x0000000002000000-0x0000000002007fff (I)
Domain0 Region02          : 0x000000000a000000-0x000000000a03ffff ()
Domain0 Region03          : 0x0000000000000000-0xffffffffffffffff (R,W,X)
Domain0 Next Address      : 0x0000000080000000
Domain0 Next Arg1         : 0x000000000a02534a
Domain0 Next Mode         : S-mode
Domain0 SysReset          : yes

Domain1 Name              : zephyr
Domain1 Boot HART         : 1
Domain1 HARTs             : 1*,2*,3*,4*
Domain1 Region00          : 0x000000000a000000-0x000000000a03ffff ()
Domain1 Region01          : 0x0000000000000000-0xffffffffffffffff (R,W,X)
Domain1 Next Address      : 0x0000000080000000
Domain1 Next Arg1         : 0x000000000a02534a
Domain1 Next Mode         : S-mode
Domain1 SysReset          : yes

Boot HART ID              : 1
Boot HART Domain          : zephyr
Boot HART Priv Version    : v1.10
Boot HART Base ISA        : rv64imafdc
Boot HART ISA Extensions  : none
Boot HART PMP Count       : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 36
Boot HART MHPM Count      : 2
Boot HART MIDELEG         : 0x0000000000000222
Boot HART MEDELEG         : 0x000000000000b109
*** Booting Zephyr OS build v3.7.0-783-g7ccbee350c0f ***
Calculate first 240 digits of Pi independently by 16 threads.
Pi value calculated by thread #0: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #1: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #2: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #3: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #4: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #5: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #6: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #7: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #8: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #9: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #10: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #11: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #12: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #13: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #14: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
Pi value calculated by thread #15: 314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316
All 16 threads executed by 4 cores in 41 msec

For anyone else who is interested, the commits for the above are here: https://github.com/polarfire-soc/zephyr/commits/wip-zephyr-smode/
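As a reading aid for the PolarFire boot log above: MIDELEG is a bit mask in which bit N set means interrupt cause N is delegated from M-mode to S-mode. The value 0x222 has bits 1, 5 and 9 set, which are the supervisor software, timer and external interrupts. A small sketch that checks this, with `irq_delegated()` being a hypothetical helper for the example:

```c
#include <stdint.h>

/* Standard RISC-V interrupt cause numbers. */
#define IRQ_S_SOFT   1
#define IRQ_M_SOFT   3
#define IRQ_S_TIMER  5
#define IRQ_M_TIMER  7
#define IRQ_S_EXT    9
#define IRQ_M_EXT    11

/* Returns 1 if the given interrupt cause is delegated by this mideleg value. */
static int irq_delegated(uint64_t mideleg, unsigned int irq)
{
	return (int)((mideleg >> irq) & 1u);
}
```

For 0x222 the three supervisor-level causes report as delegated and the three machine-level ones do not, which is what an s-mode kernel needs in order to take its own software, timer and external interrupts.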

tswaehn commented 2 months ago

Regarding PIE, I do have an open request. If accepted, it would fix the situation with building for QEMU.

Building directly from the OpenSBI repo for QEMU is possible, though. I have done this here.

con-pax commented 2 months ago

Hey @tswaehn another positive update: using the opensbi firmware binaries found here: https://github.com/riscv-software-src/opensbi/releases/download/v1.5.1/opensbi-1.5.1-rv-bin.tar.xz, I am able to run the synchronization example.

here is my command:

/mnt/dev/z_workspace$ qemu-system-riscv64 -machine virt -nographic -m 256 -bios opensbi-1.5.1-rv-bin/share/opensbi/lp64/generic/firmware/fw_jump.bin -kernel build/zephyr/zephyr.elf

OpenSBI v1.5.1
   ____                    _____ ____ _____
  / __ \                  / ____|  _ \_   _|
 | |  | |_ __   ___ _ __ | (___ | |_) || |
 | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
 | |__| | |_) |  __/ | | |____) | |_) || |_
  \____/| .__/ \___|_| |_|_____/|____/_____|
        | |
        |_|

Platform Name             : riscv-virtio,qemu
Platform Features         : medeleg
Platform HART Count       : 1
Platform IPI Device       : aclint-mswi
Platform Timer Device     : aclint-mtimer @ 10000000Hz
Platform Console Device   : semihosting
Platform HSM Device       : ---
Platform PMU Device       : ---
Platform Reboot Device    : syscon-reboot
Platform Shutdown Device  : syscon-poweroff
Platform Suspend Device   : ---
Platform CPPC Device      : ---
Firmware Base             : 0x80000000
Firmware Size             : 327 KB
Firmware RW Offset        : 0x40000
Firmware RW Size          : 71 KB
Firmware Heap Offset      : 0x49000
Firmware Heap Size        : 35 KB (total), 2 KB (reserved), 10 KB (used), 23 KB (free)
Firmware Scratch Size     : 4096 B (total), 408 B (used), 3688 B (free)
Runtime SBI Version       : 2.0

Domain0 Name              : root
Domain0 Boot HART         : 0
Domain0 HARTs             : 0*
Domain0 Region00          : 0x0000000000100000-0x0000000000100fff M: (I,R,W) S/U: (R,W)
Domain0 Region01          : 0x0000000002000000-0x000000000200ffff M: (I,R,W) S/U: ()
Domain0 Region02          : 0x0000000080040000-0x000000008005ffff M: (R,W) S/U: ()
Domain0 Region03          : 0x0000000080000000-0x000000008003ffff M: (R,X) S/U: ()
Domain0 Region04          : 0x000000000c400000-0x000000000c5fffff M: (I,R,W) S/U: (R,W)
Domain0 Region05          : 0x000000000c000000-0x000000000c3fffff M: (I,R,W) S/U: (R,W)
Domain0 Region06          : 0x0000000000000000-0xffffffffffffffff M: () S/U: (R,W,X)
Domain0 Next Address      : 0x0000000080200000
Domain0 Next Arg1         : 0x0000000082200000
Domain0 Next Mode         : S-mode
Domain0 SysReset          : yes
Domain0 SysSuspend        : yes

Boot HART ID              : 0
Boot HART Domain          : root
Boot HART Priv Version    : v1.10
Boot HART Base ISA        : rv64imafdch
Boot HART ISA Extensions  : zicntr
Boot HART PMP Count       : 16
Boot HART PMP Granularity : 2 bits
Boot HART PMP Address Bits: 54
Boot HART MHPM Info       : 0 (0x00000000)
Boot HART Debug Triggers  : 0 triggers
Boot HART MIDELEG         : 0x0000000000001666
Boot HART MEDELEG         : 0x0000000000f0b509
*** Booting Zephyr OS build v3.7.0-2103-g906c6d99dea2 ***
thread_a: Hello World from cpu 0 on qemu_riscv64!
thread_b: Hello World from cpu 0 on qemu_riscv64!
thread_a: Hello World from cpu 0 on qemu_riscv64!
thread_b: Hello World from cpu 0 on qemu_riscv64!
thread_a: Hello World from cpu 0 on qemu_riscv64!
thread_b: Hello World from cpu 0 on qemu_riscv64!
thread_a: Hello World from cpu 0 on qemu_riscv64!
thread_b: Hello World from cpu 0 on qemu_riscv64!
thread_a: Hello World from cpu 0 on qemu_riscv64!
thread_b: Hello World from cpu 0 on qemu_riscv64!
thread_a: Hello World from cpu 0 on qemu_riscv64!
thread_b: Hello World from cpu 0 on qemu_riscv64!
thread_a: Hello World from cpu 0 on qemu_riscv64!
thread_b: Hello World from cpu 0 on qemu_riscv64!
thread_a: Hello World from cpu 0 on qemu_riscv64!
thread_b: Hello World from cpu 0 on qemu_riscv64!
thread_a: Hello World from cpu 0 on qemu_riscv64!
thread_b: Hello World from cpu 0 on qemu_riscv64!

So the good news is that for single-core applications, using the firmware provided by OpenSBI works fine. The bad news is that for SMP on the generic platform, OpenSBI uses a lottery to pick the boot hart, which is no good for Zephyr. I have tested multiple times, and each time the boot hart in OpenSBI is different.
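A boot-hart "lottery" generally means that every hart races on one shared atomic variable and whichever gets there first performs the cold boot. The sketch below shows the idea with C11 atomics; it is not OpenSBI's actual code, and `boot_lottery` and `hart_wins_boot_lottery()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Shared by all harts; clear until the first hart claims it. */
static atomic_flag boot_lottery = ATOMIC_FLAG_INIT;

/* Returns true for exactly one caller: the hart that wins the lottery. */
static bool hart_wins_boot_lottery(void)
{
	/* test_and_set returns the previous state, so only the first caller sees false. */
	return !atomic_flag_test_and_set(&boot_lottery);
}
```

Because the winner depends purely on timing, the boot hart ID can differ from run to run, which is consistent with the behaviour observed above.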

as always, my commits are here: https://github.com/polarfire-soc/zephyr/commits/wip-zephyr-smode/