olofk / serv

SERV - The SErial RISC-V CPU
ISC License
1.36k stars 178 forks source link

Port to Zephyr v3.5.0 + Fix System Timer #111

Closed m42uko closed 9 months ago

m42uko commented 9 months ago

I did a quick and dirty port to the latest Zephyr OS v3.5.0.

Here's the hello_world sample running on servant in verilator:

[markus@571aaf79e9b4 workspace]$ fusesoc run --target=verilator_tb servant --uart_baudrate=57600 --firmware=hello.hex
INFO: Preparing ::serv:1.2.1
INFO: Preparing ::servant:1.2.1
make: Nothing to be done for 'all'.
Loading RAM from /home/markus/workspace/hello.hex¨
*** Booting Zephyr OS build zephyr-v3.5.0 ***
Hello World! service

To make this cleaner, one should add the registers for the UART (and timer) to the device tree, and read them from there, but I see this as a separate clean-up commit rather than part of this port. If I have the time, I will implement that and open a separate PR for that once this one is merged. Also, we should probably be able to re-use/include some generic code from the RISC-V upstream repo, but again, I think this is a separate PR. Let me know what you think.

EDIT: Also fixes #112 by calling the timer init function using the SYS_INIT macro.

olofk commented 9 months ago

Fantastic! This has been on my wish list for a long time.

I haven't had time to look at your work yet, but regarding the msleep issue, there's one thing I would check first. At least in Zephyr 2.4.0, there's a bug that causes mul operations to be emitted in the library code even if the Kconfig settings says not to use hard mul and div. Not sure if this is the problem, but you can e.g. check the compiled zephyr.lst and see if there are any mul operations present.

Hope to take a closer look at your work in the coming days

olofk commented 9 months ago

This looks good to me. I did however come across a bug when running the dining philosophers demo.

fusesoc run --target=verilator_tb servant --uart_baudrate=55600 --memsize=65536 --firmware zephyr35_phil.hex 
INFO: Preparing ::serv:1.2.1
INFO: Preparing ::servant:1.2.1
make: Nothing to be done for 'all'.
Loading RAM from /home/olof/projects/serv/workspace/zephyr35_phil.hex
*** Booting Zephyr OS build zephyr-v3.5.0 ***
Demo Description
----------------
An implementation of a solution to the Dining Philosophers
problem (a classic multi-thread synchronization problem).
This particular implementation demonstrates the usage of multiple
preemptible and cooperative threads of differing priorities, as
well as dynamic mutexes and thread sleeping.

Philosopher 0 [C:-2]   EATING  [  02 ms ]  
ASSERTION FAIL [arch_irq_unlocked(key) || _kernel.cpus[0].current->base.thread_state & (((1UL << (0))) | ((1UL << (3))))] @ WEST_TOPDIR/zephyr/kernel/include/kswap.h:02
    Context switching while holding lock!
^C
Caught ctrl-c

INFO: ****************************
INFO: ****   FuseSoC aborted  ****
INFO: ****************************

I don't know if this is a new problem or if the checks in Zephyr have just become stricter, but I want to hold off merging this for a bit to investigate.

m42uko commented 9 months ago

Can confirm that there is a problem here. With the corresponding check removed, the demo runs, but somehow only two of the six philosophers seem to eat:

make: Nothing to be done for 'all'.
Loading RAM from /home/markus/workspace/phil.hex
IRQ @0xffeff688 | 0xffeff688
set irq = 0xfff5bdb4
set irq = 0xfff5bdb4
IRQ @0xfff5bdb5 | 0xfff5bdb4
*** Booting Zephyr OS build zephyr-v3.5.0 ***
set irq = 0xfffffc80
set irq = 0xfffffc80
IRQ @0xfffffc81 | 0xfffffc80
Philosopher 0 [C:-2]        STARVING       
Philosopher 0 [C:-2]    HOLDING ONE FORK   
Philosopher 0 [C:-2]   EATING  [  02 ms ] 
set irq = 0x00d609c0
set irq = 0x00d609c0
Philosopher 0 [C:-0]        STARVING       
Philosopher 0 [C:-0]    HOLDING ONE FORK   
Philosopher 2 [P: 0]        STARVING       
Philosopher 2 [P: 0]    HOLDING ONE FORK   
Philosopher 2 [P: 0]        STARVING       
Philosopher 2 [P: 0]    HOLDING ONE FORK   
Philosopher 0 [P: 2]        STARVING       
Philosopher 0 [P: 2]    HOLDING ONE FORK   
Philosopher 0 [P: 2]        STARVING       
IRQ @0x00d609c1 | 0x00d609c0
set irq = 0x00d61f40
set irq = 0x00d61f40
Philosopher 0 [C:-2]    DROPPED ONE FORK   
Philosopher 0 [C:-2]  THINKING [  02 ms ] 
set irq = 0x01408e80
set irq = 0x01408e80
Philosopher 0 [C:-0]   EATING  [  0g ms ] 
set irq = 0x0129cd80
set irq = 0x0129cd80
Philosopher 0 [P: 2]    HOLDING ONE FORK   
IRQ @0x0129cd81 | 0x0129cd80
set irq = 0x01408840
set irq = 0x01408840
Philosopher 0 [C:-0]    DROPPED ONE FORK   
Philosopher 0 [C:-0]  THINKING [  02 ms ] 
Philosopher 2 [P: 0]   EATING  [  08 ms ] 
IRQ @0x01408841 | 0x01408840
set irq = 0x0144a880
set irq = 0x0144a880
Philosopher 0 [C:-2] IRQ @0x0144a881 | 0x0144a880
       STARVING       
set irq = 0x01c01ec0
set irq = 0x01c01ec0
Philosopher 0 [C:-0]        STARVING       
IRQ @0x01c01ec1 | 0x01c01ec0
set irq = 0x01c03440
set irq = 0x01c03440
Philosopher 2 [P: 0]    DROPPED ONE FORK   
Philosopher 2 [P: 0]  THINKING [  08 ms ] 
set irq = 0x01d58800
set irq = 0x01d58800
Philosopher 0 [C:-0]    HOLDING ONE FORK   
IRQ @0x01d58801 | 0x01d58800

Philosopher 0 [C:-0]   EATING  [  0g ms ] 
Philosopher 2 [P: 0]   EATING  [  02 ms ] 
set irq = 0x01ef3700
set irq = 0x01ef3700
Philosopher 2 [P: 0]        STARVING       
IRQ @0x01ef3701 | 0x01ef3700
set irq = 0x02524340
set irq = 0x02524340
Philosopher 0 [C:-0]    DROPPED ONE FORK   
Philosopher 0 [C:-0]  THINKING [  02 ms ] 
IRQ @0x02524341 | 0x02524340
set irq = 0x0258dac0
set irq = 0x0258dac0
Philosopher 2 [P: 0]    DROPPED OIRQ @0x0258dac1 | 0x0258dac0
NE FORK   
Philosopher 2 [P: 0]  THINKING [  02 ms ] 
Philosopher 2 [P: 0]    HOLDING ONE FORK   
Philosopher 2 [P: 0]   EATING  [  08 ms ] 
Philosopher 0 [P: 2]   EATING  [  00 ms ] 
set irq = 0x028e7180
set irq = 0x028e7180
Philosopher 0 [C:-0]        STARVING       
IRQ @0x028e7181 | 0x028e7180
set irq = 0x03100880
set irq = 0x03100880
Philosopher 2 [P: 0]    DROPPED ONE FORK   
Philosopher 2 [P: 0]  THINKING [  08 ms ] 
set irq = 0x02bbc180
set irq = 0x02bbc180
Philosopher 0 [C:-0]    HOLDING ONE FORK   
Philosopher 0 [C:-0]   EATING  [  00 ms ] 
IRQ @0x02bbc181 | 0x02bbc180
set irq = 0x03100880
set irq = 0x03100880
Philosopher 2 [P: 0]        STARVING       
Philosopher 2 [P: 0]    HOLDING ONE FORK   
IRQ @0x03100881 | 0x03100880
set irq = 0x031f95c0
set irq = 0x031f95c0
Philosopher 0 [P: 2]    DROPPED ONE FORK   
Philosopher 0 [P: 2]  THINKING [  02 ms ] 
PIRQ @0x031f95c1 | 0x031f95c0
hilosopher 0 [C:-2]   EATING  [  0g ms ] 
set irq = 0x0361d840
set irq = 0x0361d840
Philosopher 2 [P: 0]        STARVING       
Philosopher 2 [P: 0]    HOLDING ONE FORK   
IRQ @0x0361d841 | 0x0361d840
set irq = 0x036dcec0
set irq = 0x036dcec0
Philosopher 0 [P: 2]        STARVING       
IRQ @0x036dcec1 | 0x036dcec0
set irq = 0x03dcfd40
set irq = 0x03dcfd40
Philosopher 0 [C:-0]    DROPPED ONE FORK   
Philosopher 0 [C:-0]  THINKING [  02 ms ] 
Philosopher 2 [P: 0]   EATING  [  08 ms ] 
IRQ @0x03dcfd41 | 0x03dcfd40
set irq = 0x03ea5800
set irq = 0x03ea5800
Philosopher 0 [C:-2]    DROPPED ONE FORK   
Philosopher 0 [P: 2]  THINKING [ IRQ @0x03ea5801 | 0x03ea5800
 0g ms ] 
Philosopher 0 [C:-2]    HOLDING ONE FORK   
Philosopher 0 [C:-2]   EATING  [  02 ms ] 
Philosopher 0 [P: 2]    HOLDING ONE FORK   

Like you said, hold off merging this until we fix this and the timer overflow issue.

m42uko commented 9 months ago

Interesting. On a tickful kernel, they never pick up or drop any forks at all. It just locks up after the initial prints. Something is very wrong here :see_no_evil:

olofk commented 9 months ago

I think we have several problems. I just realigned serv_timer to the upstream riscv_machine_timer from zephyr 3.5.0 and there were quite a few changes. But I also believe there are other problems in the RTL and/or the board support code. One thing I have knowingly cheated with in the RTL is to not make the MPIE bit in the MSTATUS register accessible from software. Perhaps this worked before, but not any more. I also remember that a lot of the files in the SERV-specific zephyr/soc directory had to be copied there because of some limitation in Zephyr when using an external soc dir. Maybe these files need to be refreshed (or best of all removed!) now.

m42uko commented 9 months ago

Oof, you're probably right. I just noticed that not even printf is working properly. It prints something like "0g0" when I try to use %d or %x. I feel like most of our issues at this point are not interrupt related even, but I have no idea how to get started troubleshooting this right now.

m42uko commented 9 months ago

I just noticed that not even printf is working properly.

Okay that problems seems to be because I still have MUL instructions in picolibc code. As far as I know, that library is precompiled and ships with their toolchain (I'm on v0.16.4).

olofk commented 9 months ago

I see. Perhaps it's easier in that case to just build Servant with the M extension enabled. First add the mdu library with fusesoc library add mdu https://github.com/zeeshanrafique23/mdu and then build with fusesoc run --flag=mdu ...

m42uko commented 9 months ago

Yup, that fixes this issue. So that's most likely an upstream bug.

olofk commented 9 months ago

Ok, that's a relief at least

m42uko commented 9 months ago

Oh, neat, that also improves the philosophers situation. It's still unhappy about the assertion, but once it is removed (in kernel/include/kswap.h), the test seems to run. OK, I'll take a break now and verify which other issues this resolves (maybe this also makes my 10-minute-timer problem go away.)

m42uko commented 9 months ago

Yup, this seems to also fix the 10-minute lockup (has been running for half an hour now with multiple overflows). So I'll have to apologize to @alinja, PR #109 seems to do what it promises, the overflow works. (Again, I didn't check that there aren't any weird timing artifacts, but the basic functionality looks good.) Log, 1 s sleep: hello_world_with_timer.log

olofk commented 9 months ago

Ok, so to sum it up, things seems to work just fine now with Zephyr 3.5.0, except for the assert crash? If so, I think we can merge it. That assert thing is likely another problem and should be analyzed separately.

m42uko commented 9 months ago

I agree, this should be good to merge.

I think it's a good idea to open an Issue regarding the assertion just to keep track of it, and the (possibly upstream) compiler issue I'll try to chase down later in the week.

olofk commented 9 months ago

I filed https://github.com/olofk/serv/issues/113 now to keep track of that issue. Thanks for working on this. Picked and pushed!