wokwi / wokwi-features

Wokwi Feature requests & Bug Reports
https://wokwi.com

Improve support for Zephyr RTOS applications #516

Open kartben opened 1 year ago

kartben commented 1 year ago

The zephyr-esp32 builder is already a great start toward running Zephyr apps in Wokwi. It would be great to look into generalizing and improving that support by letting users configure things such as:

I am available to chat more about this, and I will help as much as I can!

urish commented 1 year ago

Hi Benjamin! Thanks for offering your help here. Zephyr apps are definitely something I'd love to see running in Wokwi.

Right now, we have three ways to use Wokwi:

  1. Online, through the web interface - very popular, but doesn't like long build times or complex build environments
  2. As a Visual Studio Code extension - in public beta, and you're already familiar with it
  3. As a command-line tool for CI use cases - in private beta, but we plan to open it publicly later this year

Which use cases do you have in mind?

kartben commented 1 year ago

I would love to make sure #1 gets improved, at least to the point where it allows people to get a better understanding of how they can tinker with things like device tree overlays, and project configuration. Point taken though regarding long build time possibly being a problem. I need to play more with the VS Code extension to get a sense of how the current experience is. It might be that it just works perfectly already :-)

Cheers!

urish commented 1 year ago

How complex is it to set up a Zephyr ESP32 "Hello world" project in VS Code?

kartben commented 1 year ago

> How complex is it to set up a Zephyr ESP32 "Hello world" project in VS Code?

Not too complex, apparently :)

image

urish commented 1 year ago

Damn, you are fast!

urish commented 1 year ago

Is there a repo with this setup?

urish commented 1 year ago

@kartben heads up on wokwi-cli, which will allow running any project you set up for VS Code in CI. Want to give it a go?

kartben commented 1 year ago

> @kartben heads up on wokwi-cli, which will allow running any project you set up for VS Code in CI. Want to give it a go?

Yes, please! Is everything I'd need available at the linked URL? And I need to do a proper write-up of some of my experiments to date with the VS Code extension.

urish commented 1 year ago

> Yes, please! Is everything I'd need available at the linked URL?

It should - but this is pretty early, so you'll probably find something I forgot. Ping me when you do.

> And I need to do a proper write-up of some of my experiments to date with the VS Code extension.

Yes, please!

bmeisels commented 1 year ago

wokwi-cli is really easy to use. I got it running here https://github.com/bmeisels/wokwi-cli-github-actions-example and @urish even added a dedicated action https://github.com/wokwi/wokwi-ci-action.

urish commented 11 months ago

@kartben which app is this? can you please share the repo?

image

kartben commented 11 months ago

> @kartben which app is this? can you please share the repo?
>
> image

I don't remember 😭 I really need to put together a quick README somewhere...

urish commented 10 months ago

People keep asking about Zephyr + Wokwi. We need your README :)

kartben commented 9 months ago

> @kartben which app is this? can you please share the repo?
>
> image

@urish ok so I am finally spending some time digging this up. Here's a working, self-contained project: https://github.com/kartben/wokwi-zephyr-projects/tree/master/esp32c3-lvgl-with-mpu6050

Things are unfortunately really slow, and it's not clear why. I vaguely remember discussions about performance issues depending on how sleep is implemented, and also the ILI9341 simulation potentially being slow. I will try a similar sample with the RP2040, just in case I have better luck. In the meantime, your input on how to potentially improve the performance would be most welcome!

urish commented 9 months ago

Oh yeah!

image

In general, transferring huge amounts of data over SPI is slow. There's an experimental way to short-circuit the low-level SPI stuff and speed things up. Also, DMA can help to some extent - do you know if the underlying code uses DMA?
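
For a rough sense of scale, a full-screen redraw on an ILI9341 moves a lot of bytes. This is a back-of-the-envelope sketch, assuming the panel's 240x320 resolution with 16-bit (RGB565) pixels and a hypothetical 40 MHz SPI clock. Neither the pixel format nor the clock is stated in this thread, and the result says nothing about how fast the simulator itself runs.

```python
def frame_bytes(width: int = 240, height: int = 320, bits_per_pixel: int = 16) -> int:
    """Bytes needed for one full frame (defaults are assumptions, see above)."""
    return width * height * bits_per_pixel // 8


def min_transfer_ms(num_bytes: int, spi_clock_hz: int) -> float:
    """Lower bound on transfer time: one bit per clock cycle, no protocol overhead."""
    return num_bytes * 8 / spi_clock_hz * 1000


size = frame_bytes()
print(size)                                          # 153600
print(round(min_transfer_ms(size, 40_000_000), 2))   # 30.72
```

Every one of those bytes has to pass through the simulated SPI peripheral, which is why shortcuts at that level, or DMA, can matter.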

kartben commented 9 months ago

> Oh yeah!

w00t!

> In general, transferring huge amounts of data over SPI is slow. There's an experimental way to short-circuit the low-level SPI stuff and speed things up. Also, DMA can help to some extent - do you know if the underlying code uses DMA?

Ya but I think it's more than just SPI, let me try to produce a minimal repro sample

urish commented 9 months ago

Meanwhile, I ran some benchmarks using the Wokwi Profiler. Seems like most of the ESP32 time is spent memcpy'ing and doing SPI stuff:

image

kartben commented 9 months ago

Nice (the profiler I mean, not the fact that LVGL is probably a huge performance killer :D). I can't easily run it myself tho, right? From the Web UI it tries to run from source, and I don't think I can launch it from VS Code?

Please see https://github.com/kartben/wokwi-zephyr-projects/tree/master/esp32c3-blinky

It should blink an LED every 1 sec, but it looks like the timing is off?

...
load:0x403d0000,len:0x2370
entry 0x403ce000
W (280) rtc_init: o_code calibration fail

*** Booting Zephyr OS build v3.5.0-rc1-11-g175560ff7e50 ***
LED state: ON @ 0
LED state: OFF @ 2129
LED state: ON @ 4258
LED state: OFF @ 6387

Could the rtc_init warning be relevant?
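
To put a number on how far off the timing is, the timestamps in the log above can be compared against the expected 1-second period. This is a minimal sketch, assuming the value after `@` is uptime in milliseconds (the log does not say so explicitly). `parse_timestamps`, `intervals` and `EXPECTED_PERIOD_MS` are names made up for this example.

```python
import re

# Log lines copied from the output above.
LOG = """\
LED state: ON @ 0
LED state: OFF @ 2129
LED state: ON @ 4258
LED state: OFF @ 6387
"""

# Assumption: the sample toggles the LED once per second.
EXPECTED_PERIOD_MS = 1000


def parse_timestamps(log: str) -> list[int]:
    """Extract the number after '@' from each 'LED state' line."""
    return [int(m.group(1)) for m in re.finditer(r"LED state: \w+ @ (\d+)", log)]


def intervals(timestamps: list[int]) -> list[int]:
    """Differences between consecutive timestamps."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


stamps = parse_timestamps(LOG)
deltas = intervals(stamps)
slowdown = sum(deltas) / len(deltas) / EXPECTED_PERIOD_MS

print(deltas)              # [2129, 2129, 2129]
print(f"{slowdown:.3f}x")  # 2.129x
```

The interval is constant, so under that assumption the toggle period is stretched by a fixed factor rather than jittering.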

kartben commented 9 months ago

currently testing with esp32s3 and the timing for blinky seems correct

urish commented 9 months ago

Thanks! Yeah, the profiler is not available (yet) in VS Code, but I'm working on it!

> Please see https://github.com/kartben/wokwi-zephyr-projects/tree/master/esp32c3-blinky

Good catch, pushed a fix. Can you please test again?

kartben commented 9 months ago

> Please see https://github.com/kartben/wokwi-zephyr-projects/tree/master/esp32c3-blinky
>
> Good catch, pushed a fix. Can you please test again?

Yay! Much better :) It looks like it sped things up a tiny bit for the lvgl sample, but that might be magical thinking. I can't seem to get SPI to work for ESP32S2 and S3 (pretty sure I'm just doing something wrong on the Zephyr side), but before I invest more time in trying to debug this, is there a chance that Xtensa simulation will be faster than RISC-V's?

urish commented 9 months ago

Great!

It's hard to tell whether Xtensa is going to be faster - it depends on the compiler and esp-idf. For instance, it might use DMA on the S2 (or original ESP32), and that could considerably speed things up.

When you pause the simulation with S2/S3, do you see the SPI pins configured correctly?

image

urish commented 9 months ago

Another update: got rid of the "rtc_init: o_code calibration fail" issue, so now the program should start much faster for C3 and S3 (it won't keep polling the RTC until it times out).

kartben commented 9 months ago

> Another update: got rid of the "rtc_init: o_code calibration fail" issue, so now the program should start much faster for C3 and S3 (it won't keep polling the RTC until it times out).

very nice -- much faster indeed!

kartben commented 9 months ago

> When you pause the simulation with S2/S3, do you see the SPI pins configured correctly?

ooh, what a neat feature :) So I am pretty sure things are wired up properly, and it looks like the pins are also OK, see below. I've put the non-working sample here https://github.com/kartben/wokwi-zephyr-projects/tree/master/esp32s3-lvgl-with-mpu6050, in case you can maybe spot anything obvious when tracing. Note that Zephyr+I2C+MPU6050 seems to work just fine for this same diagram, it's SPI that seems busted. Thanks!!

image

urish commented 9 months ago

Thanks for making it easy to reproduce - I found out it was a bug in the simulator (the SPI hardware was not correctly signaling to the code that it had finished transferring data). Uploaded a fix - I can see it now draws on the screen, but then the program (running in the sim) seems to crash with an exception right after it finishes drawing. Is that expected?

Reading symbols from ./esp32s3-lvgl-with-mpu6050/zephyr.elf...
Remote debugging using localhost:3333
warning: multi-threaded target stopped without sending a thread-id, using first non-exited thread
_DoubleExceptionVector () at /Users/kartben/zephyrproject/zephyr/arch/xtensa/core/xtensa-asm2-util.S:444
444     /Users/kartben/zephyrproject/zephyr/arch/xtensa/core/xtensa-asm2-util.S: No such file or directory.
(gdb) bt
#0  _DoubleExceptionVector () at /Users/kartben/zephyrproject/zephyr/arch/xtensa/core/xtensa-asm2-util.S:444
#1  0xffffffff in ?? ()

kartben commented 9 months ago

> Thanks for making it easy to reproduce - I found out it was a bug in the simulator (the SPI hardware was not correctly signaling to the code that it had finished transferring data). Uploaded a fix - I can see it now draws on the screen, [...]

cool, works for me too! Thanks!

> but then the program (running in the sim) seems to crash with an exception right after it finishes drawing. Is that expected?
>
> Reading symbols from ./esp32s3-lvgl-with-mpu6050/zephyr.elf...
> Remote debugging using localhost:3333
> warning: multi-threaded target stopped without sending a thread-id, using first non-exited thread
> _DoubleExceptionVector () at /Users/kartben/zephyrproject/zephyr/arch/xtensa/core/xtensa-asm2-util.S:444
> 444     /Users/kartben/zephyrproject/zephyr/arch/xtensa/core/xtensa-asm2-util.S: No such file or directory.
> (gdb) bt
> #0  _DoubleExceptionVector () at /Users/kartben/zephyrproject/zephyr/arch/xtensa/core/xtensa-asm2-util.S:444
> #1  0xffffffff in ?? ()

mmmh no (granted I don't have actual hardware to test) :) here's the line if that helps https://github.com/zephyrproject-rtos/zephyr/blob/ecefcd7f87d73bbd7d472399127421639d5bc2c2/arch/xtensa/core/xtensa-asm2-util.S#L444

On my side I can't seem to get it to crash and it rather looks like it's stuck? Simulation rate drops to ~15%, and I can't seem to get a crashdump?

urish commented 9 months ago

Yeah, same here - it is stuck in the _DoubleExceptionVector handler. Not sure why simulation speed drops to 15% in this case, but the question is what happened just before it got to _DoubleExceptionVector.

kartben commented 9 months ago

> Yeah, same here - it is stuck in the _DoubleExceptionVector handler. Not sure why simulation speed drops to 15% in this case, but the question is what happened just before it got to _DoubleExceptionVector.

This time the problem was on my side! Increased stack size of the app and now it works! 🥳 I've updated the binaries accordingly.

urish commented 9 months ago

Awesome!

Meanwhile, I also pushed a prototype that short-circuits the SPI to slightly speed things up. You can enable it by setting the "__labs_spiAccel" attr to "1", e.g.

    {
      "type": "board-esp32-s3-devkitc-1",
      "id": "esp",
      "top": -28.98,
      "left": -120.23,
      "attrs": {"__labs_spiAccel": "1"}
    },

kartben commented 9 months ago

@urish OK, now... UARTs :D

kartben commented 9 months ago

>     {
>       "type": "board-esp32-s3-devkitc-1",
>       "id": "esp",
>       "top": -28.98,
>       "left": -120.23,
>       "attrs": {"__labs_spiAccel": "1"}
>     },

oh, neat! This does make a noticeable difference :) Note that I have tried to enable DMA in the SPI driver, but this does not seem to work. I would push a repro, but I am not actually sure I am setting things up correctly, so I don't want to waste your time.

urish commented 9 months ago

doesn't look like the simulator is using the same UI component for the console

add

  "serialMonitor": { "display": "terminal" },

to diagram.json to get the same terminal as in the C3.

However, this may give us a clue as to where the problem is - having more than one character in the RX FIFO at the same time might cause the freeze
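
Both diagram.json tweaks mentioned in this thread (the `__labs_spiAccel` attribute and the `serialMonitor` display setting) can also be applied with a small script. This is a minimal sketch, assuming the board entries live in a top-level `parts` list and that `serialMonitor` sits at the top level of the file. The function name `apply_tweaks` and the sample diagram are made up for illustration.

```python
import json


def apply_tweaks(diagram: dict, board_id: str) -> dict:
    """Return a copy of the diagram with the two settings from this thread applied."""
    result = json.loads(json.dumps(diagram))  # cheap deep copy for JSON-like data

    # Use the terminal-style serial monitor.
    result.setdefault("serialMonitor", {})["display"] = "terminal"

    # Enable the experimental SPI acceleration on the chosen board.
    for part in result.get("parts", []):
        if part.get("id") == board_id:
            part.setdefault("attrs", {})["__labs_spiAccel"] = "1"
            break
    else:
        raise KeyError(f"no part with id {board_id!r}")

    return result


# Hypothetical minimal diagram, for illustration only.
sample = {
    "version": 1,
    "parts": [{"type": "board-esp32-s3-devkitc-1", "id": "esp", "attrs": {}}],
    "connections": [],
}

print(json.dumps(apply_tweaks(sample, "esp"), indent=2))
```

The input is left untouched, so the result can be compared with the original before it is written back to diagram.json.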

kartben commented 9 months ago

> doesn't look like the simulator is using the same UI component for the console
>
> add
>
>   "serialMonitor": { "display": "terminal" },
>
> to diagram.json to get the same terminal as in the C3.

Oh nice, I didn't realize that I probably inherited this when starting from an existing C3 sample, I guess. Not sure if it would make sense to make this the default in the starter templates? (speaking of which, ESP32-S3 is still mentioned as "beta" in the title bar for https://wokwi.com/projects/new/esp32-s3, not sure this is intentional).

> However, this may give us a clue as to where the problem is - having more than one character in the RX FIFO at the same time might cause the freeze

Something on the Wokwi side then, yes? Let me know if there are any tweaks to the binary I could make to help you troubleshoot.

urish commented 9 months ago

Alright, I think I spotted the issue with the esp32-c3, some subtle bug with interrupt handling. Pushed a fix. Can you please give it a try now and report back?

Still investigating the esp32-s3 one

kartben commented 9 months ago

> Alright, I think I spotted the issue with the esp32-c3, some subtle bug with interrupt handling. Pushed a fix. Can you please give it a try now and report back?

Awesome! I can't seem to be able to crash it, no matter how hard I try :) Very cool! (updated the binaries as I had mentioned it had VT100 colors supported but it turns out it wasn't enabled in the version I had pushed to the repo)

> Still investigating the esp32-s3 one

❤️

urish commented 9 months ago

Looks like the esp32-s3 issue also has to do with interrupts!

urish commented 9 months ago

pushed a fix for the esp32-s3 interrupt issue. Cool to see the VT100 colors!

image

kartben commented 9 months ago

> pushed a fix for the esp32-s3 interrupt issue. Cool to see the VT100 colors!
>
> image

Niiice! I can confirm that this seems to solve the problem for me as well 🚀

kartben commented 9 months ago

BTW how would I go about trying to refresh some of @beriberikix's work on the Zephyr builder, i.e. how do I test things "locally"? I would like to see if we could get the .overlay and .conf files directly editable from the Wokwi web workspace (right now the first blocker would be that those file extensions are not supported, I guess).

image

urish commented 9 months ago

Here's how to test things locally

As a workaround for unsupported file extensions, you could just name them something.c (maybe also something.txt), and create a shell script inside the builder that would rename them before running the build (that's how we prototyped the Rust builders when we first introduced Cargo.toml, which was unsupported at the time).

Word of caution though - from what I have seen, Zephyr builds tend to take a lot of time. We won't be able to afford running builds that take several minutes at scale, unless we get some sponsorship for that. Also, most users would not have the patience to wait several minutes watching a spinner.
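
The renaming workaround described above could look roughly like the following. It is shown in Python rather than as a shell script, and it is only a sketch under assumptions: the convention of appending `.txt` to the real file name (so `app.overlay.txt` stands in for `app.overlay`) and the function name `restore_extensions` are made up for illustration, not something the builder defines.

```python
from pathlib import Path


def restore_extensions(workdir: Path, placeholder: str = ".txt") -> list[str]:
    """Strip a placeholder suffix so 'app.overlay.txt' becomes 'app.overlay'.

    Only files whose remaining name ends in .overlay or .conf are touched.
    Returns the restored file names, sorted.
    """
    restored = []
    # sorted() takes a snapshot, so renaming while looping is safe.
    for path in sorted(workdir.iterdir()):
        if not path.is_file() or path.suffix != placeholder:
            continue
        target = path.with_suffix("")  # drops only the last suffix
        if target.suffix in {".overlay", ".conf"}:
            path.rename(target)
            restored.append(target.name)
    return restored
```

A builder script would call this on the project directory just before starting the build, leaving ordinary `.txt` and `.c` files alone.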

beriberikix commented 9 months ago

LMK how I can help! Zephyr's official containers are, uh, bloated. The latest CI container is ~12gb. My experiments have yielded an ESP32 container @ 1.3gb and an ESP32S3 @ 1.3gb. They're faster as well.

kartben commented 9 months ago

@urish I've just added a few RP2040 samples :) Basic helloworld and blinky work just fine (yay!) but the "shell" sample seems stuck.

urish commented 9 months ago

Did you commit the shell sample? Couldn't find it in the repo

kartben commented 9 months ago

> Did you commit the shell sample? Couldn't find it in the repo

hadn't pushed :) https://github.com/kartben/wokwi-zephyr-projects/tree/master/rpi_pico-shell_module

kartben commented 9 months ago

More goodness with ESP32 and actual Wi-Fi connection :) (https://github.com/kartben/wokwi-zephyr-projects/tree/master/esp32s3-wifi)

image

urish commented 9 months ago

Damn you are fast!

I can see that it hangs in the UART RX code in Pi Pico shell, investigating:

(gdb) bt
#0  0x1000a826 in uart_rpi_irq_rx_ready (dev=<optimized out>)
    at /home/kartben/zephyrproject/zephyr/drivers/serial/uart_rpi_pico.c:301
#1  0x100036d8 in uart_irq_rx_ready (dev=0x1000b478 <__device_dts_ord_44>)
    at /home/kartben/zephyrproject/zephyr/include/zephyr/drivers/uart.h:1041
#2  uart_callback (dev=0x1000b478 <__device_dts_ord_44>, user_data=0x1000bd64 <shell_transport_uart_shell_uart>)
    at /home/kartben/zephyrproject/zephyr/subsys/shell/backends/shell_uart.c:174
#3  0x1000a872 in uart_rpi_isr (dev=<optimized out>) at /home/kartben/zephyrproject/zephyr/drivers/serial/uart_rpi_pico.c:353
#4  0x1000597c in _isr_wrapper () at /home/kartben/zephyrproject/zephyr/arch/arm/core/cortex_m/isr_wrapper.S:117
#5  <signal handler called>
#6  arch_cpu_idle () at /home/kartben/zephyrproject/zephyr/arch/arm/core/cortex_m/cpu_idle.S:140
#7  0x1000aa9c in k_cpu_idle () at /home/kartben/zephyrproject/zephyr/include/zephyr/kernel.h:5842
#8  idle (unused1=<optimized out>, unused2=<optimized out>, unused3=<optimized out>)
    at /home/kartben/zephyrproject/zephyr/kernel/idle.c:89
#9  0x1000186c in z_thread_entry (entry=0x1000aa91 <idle>, p1=0x20000b58 <_kernel>, p2=0x0, p3=0x0)
    at /home/kartben/zephyrproject/zephyr/lib/os/thread_entry.c:48
#10 0xaaaaaaaa in ?? ()

urish commented 9 months ago

Pico shell issue found, pushed a fix. Can you please try now?

kartben commented 9 months ago

> Pico shell issue found, pushed a fix. Can you please try now?

Works like a charm! 👏👏

kartben commented 9 months ago

@urish Another one for you if you're bored https://github.com/kartben/wokwi-zephyr-projects/tree/master/esp32s3-lvgl-with-encoder :) Seems like it crashes when pressing the button.