zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.64k stars 6.51k forks source link

RISC-V arch_swap() ecall usage for reschedule and a7 register value #39943

Closed berendo closed 2 years ago

berendo commented 2 years ago

@amergnat, @lindemer, @andrewboie, @nashif and others - thank you for your collective work on https://github.com/zephyrproject-rtos/zephyr/pull/27063. I am filing this as a "bug" but it likely qualifies more as a "question" at this time...

Describe the bug I have observed something "suspicious" (mostly through code dissection and a little bit of simulation effort) with regard to the way arch_swap()'s usage of an ecall to eventually reach the initial reschedule point in isr.S. If I walk through the set of events that lead to the very first thread context-switch, I observe that:

Looking at switch_to_main_thread(), I see z_swap_unlocked() which in the case of !CONFIG_USE_SWITCH configuration comes down to: https://github.com/zephyrproject-rtos/zephyr/blob/861787602eac5db5eb2b3aabe43eb7d2fb4831e0/kernel/include/kswap.h#L197-L200

and in there the very first invocation of arch_swap() takes place: https://github.com/zephyrproject-rtos/zephyr/blob/861787602eac5db5eb2b3aabe43eb7d2fb4831e0/kernel/include/kswap.h#L177-L185

Once arch_swap() makes the ecall (the first thing it does), the exception handler in isr.S is invoked and that quickly determines the exception cause to be the is_kernel_syscall variety: https://github.com/zephyrproject-rtos/zephyr/blob/861787602eac5db5eb2b3aabe43eb7d2fb4831e0/arch/riscv/core/isr.S#L400-L404

And now we get to the crux of my question... is_kernel_syscall, after checking the mepc to see if the true start or end of a syscall is being serviced, gets to a point where a7 (syscall number) is compared against FORCE_SYSCALL_ID to skip and jump forward to reschedule (which is what's desired in this very first arch_swap()): https://github.com/zephyrproject-rtos/zephyr/blob/861787602eac5db5eb2b3aabe43eb7d2fb4831e0/arch/riscv/core/isr.S#L439-L501

Alas, as far as I can tell, no deterministic value has been placed into a7 by this point (it just happens to have 0 in it due to the prior usage of that register) and things work, but it seems "fragile." It is entirely possible I am overlooking something, in which case I would be grateful for an explanation. Otherwise, I am wondering if prior to the ecall in arch_swap() a specific sentinel value should be placed into a7.

To Reproduce N/A

Expected behavior a7 should have a predictable/controlled value in all cases where an ecall can lead to a comparison against FORCE_SYSCALL_ID in isr.S when CONFIG_USERSPACE is enabled.

Impact N/A

Logs and console output N/A

Environment (please complete the following information): N/A

Additional context N/A

amergnat commented 2 years ago

Hello @berendo ,

All your content is right and I'm agree with you when you say it's "fragile". This solution has been kept because others (in my mind) are greedy on resources and this one is scalable (I used this scalability for another feature but I didn't pushed it for other reason).

"But", a second step is made by checking the pointer address function (in a0) to avoid arbitrary fixed value. If the check is wrong, then it will just reschedule like a7 != FORCE_SYSCALL_ID

Additionally: other known ecall from kernel (IRQ & IRQ Offload & Error) are already check before this test if I remember well.

To resume: to have a "kernel_syscall", you must pass 2 check after all other checks. I think that can be improved, but I don't know how without draw lot more resources.

Note: I wonder if there are some confusions with "is_kernel_syscall" tag. This one isn't rely to FORCE_SYSCALL_ID. "is_kernel_syscall" has been created to keep the existing code before the userspace support.

Hope that help and let me know if I misunderstood your question.

berendo commented 2 years ago

@amergnat thank you very much for your response and double-checking my analysis.

Regarding:

"But", a second step is made by checking the pointer address function (in a0) to avoid arbitrary fixed value. If the check is wrong, then it will just reschedule like a7 != FORCE_SYSCALL_ID

Right. You're referring to: https://github.com/zephyrproject-rtos/zephyr/blob/d5aa5c2a77f44ed7db227cbbf821275ad36c7524/arch/riscv/core/isr.S#L505-L507

which comes after the a7 comparison against FORCE_SYSCALL_ID. This, as you surmised, is what I was referring to as "fragile" because if anyone modifies that code in isr.S thinking that there was already a strong qualification of the syscall number, it may lead to very unexpected (and hard-to-debug) circumstances.

Why wouldn't simply slipping in a:

...
SECTION_FUNC(exception.other, arch_swap)

    /* Make a system call to perform context switch */
    li a7, RESCHEDULE_SYSCALL_ID
    ecall
...

sentinel value in a7 would not be considered a good trade-off of robustness vs a minuscule increase in instruction count in the arch_swap() cases?

amergnat commented 2 years ago

a7 isn't enough strong in my opinion. I don't understand why using a7 + a0 (double check) is "fragile" or "weaker". The current build is scalable, if someone have something to manage in ecall environment and come from machine privilege, he just need to add:

    la t0, my_new_function 
    bne t0, a0, reschedule 

As I said, I used it for other feature (not pushed).

This mechanism isn't for reschedule or swap, but to allow machine mode thread to become a user mode thread.

/* Function used by Zephyr to switch a supervisor thread to a user thread */
FUNC_NORETURN void arch_user_mode_enter(k_thread_entry_t user_entry,
    void *p1, void *p2, void *p3)
{
    arch_syscall_invoke5((uintptr_t) arch_user_mode_enter,
        (uintptr_t) user_entry,
        (uintptr_t) p1,
        (uintptr_t) p2,
        (uintptr_t) p3,
        FORCE_SYSCALL_ID);

    CODE_UNREACHABLE;
}

To be honest, I don't understand your point about arch_swap stuff.

berendo commented 2 years ago

a7 isn't enough strong in my opinion. I don't understand why using a7 + a0 (double check) is "fragile" or "weaker".

@amergnat I am not suggesting removing the "double check" - I am suggesting that ensuring that a7 always has a predictable value in addition to the "double check" is less fragile. I understand why/how FORCE_SYSCALL_ID is used to transition a thread to U-mode - the point I'm making is not about that case, but the case where arch_swap() does an ecall just to fall through the plain old reschedule case.

Here's where things get complicated for the current "double check" scheme: if one starts involving S-mode (as we are experimenting with) to add even simplistic MMU (as opposed to PMP) protection support, one has to handle some delegated exceptions (and interrupts) and there has to be both M-mode and S-mode exception handlers, with the main isr.S exception handler running in S-mode. Almost all ecalls specifically need to be delegated back to S-mode from M-mode, but also a handful of syscalls need to be carved out to be serviced purely in M-mode. If the syscall ID in a7 during an S-mode ecall cannot be relied upon, all of the "double check" logic needs to be replicated in both M-mode and S-mode exception handling. Furthermore, because M-mode runs in purely physical address domain, doing comparisons against (virtual) symbol addresses becomes even more cumbersome.

amergnat commented 2 years ago

@berendo OK I think I start to understand you point. Thanks for your explanations.

I though S-Mode will never have a chance to run under this RTOS because, in my opinion, it isn't fit and overkill for this kind of OS (but this is not the subject here, I'm probably wrong). The fact is what is done today isn't scalable for S-Mode, but if you have a tricks, go ahead.

I don't know if a7 is 100% predictable, that's why I've put a double protection. But I guess yes, if you force it in swap function. Sorry to don't help you more.

I'm curious: Do you plan to use sideleg and/or stvec to use another __irq_wrapper (only for S-Mode)?

github-actions[bot] commented 2 years ago

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.