mcknly / breadboard-os

A firmware platform aimed at quick prototyping, built around FreeRTOS and a feature-packed CLI
MIT License
531 stars 22 forks

WiFi Support #20

Open mcknly opened 4 months ago

mcknly commented 4 months ago

Adding basic WiFi support with CYW43+lwIP stack. Goals of this implementation:

mcknly commented 4 months ago

@Kintar here is the issue I created based on the work you are doing. The branch is 20-wifi-support

mcknly commented 4 months ago

Regarding your comments from the other discussion, which lwIP implementation are you using? I am trying to think forward to adding support for other MCU platforms - I know there is a lwIP library in the Pico SDK, but perhaps we should use the GitHub mirror as a submodule at or near the top level? Then any interface between lwIP <-> CYW43 would reside in hardware/rp2040.

In general, for this first implementation, I would say we pick a particular configuration and go with it, forgoing "flexibility" for now. That could come later as more features are added. If your end goal for now is getting MQTT working, do what works for that. We do however want the stack to run in FreeRTOS for sure - but within the servicemanager framework that we already have - i.e. add a new service called network or something, which will run the stack and have a control/message interface that can be interacted with from other services (tasks). This will allow us to have a network menu system in the CLI.

Let me know your thoughts! Also, feel free to send PRs to this branch at any time, don't need to fully bake anything yet, the above is Pie in the Sky and may take a minute to get there.

Kintar commented 4 months ago

@mcknly I think we should have a Discord conversation. I've run across some...oddities with the Pico SDK, lwIP, and FreeRTOS. If you want to shoot me an email at the address listed on my commit messages, let's find a time to talk.

The short version is that the lwip implementation included with the Pico SDK won't compile with the current mainline of FreeRTOS, and your CLI code won't compile with the Pico SDK's required version of FreeRTOS...or at least, I can't get them to play nicely together.

Kintar commented 4 months ago

@mcknly : You can ignore that last message from me. Looks like there's an update to one of the configuration options that was causing the issue.

mcknly commented 4 months ago

@Kintar I had a brief look at your WiFi feature branch, looks like good progress so far, your compilation issues are resolved now?

Kintar commented 4 months ago

@mcknly Yep. Gotta love how thoroughly Amazon documents stuff, don't you? I work with AWS daily for my "real" job, and their documentation practices are a constant source of irritation so I wasn't surprised to find Amazon is the maintainer of FreeRTOS. ;)

I'm thinking a service is the wrong way to go, now that I've been a little further into your existing code, and it would make more sense as an entry in the /dev node, with commands like:

wifi configure <ssid> <password>
wifi connect
wifi disconnect
wifi clearconfig -- disconnect and clear configuration

Then cat wifi could return something like:

Connected to : <SSID>
IP Address : XXX.XXX.XXX.XXX

And we can expand from there if we need more data. This should be everything we need for basic connectivity and MQTT/Telnet implementation, though.
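A minimal sketch of how the proposed commands and the `cat wifi` output could be handled, written so it also compiles on a desktop. The `wifi_dev_t` struct, the function names, and the "Not connected" text are hypothetical and not taken from the BBOS code. The actual CYW43/lwIP calls are left as a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical state for a /dev wifi node; field names are illustrative only. */
typedef struct {
    char ssid[33];      /* 32-byte SSID plus NUL */
    char password[64];  /* up to 63-character passphrase plus NUL */
    char ip[16];        /* dotted-quad text, filled in by the network stack */
    bool configured;
    bool connected;
} wifi_dev_t;

/* Handle "configure <ssid> <password>", "connect", "disconnect", "clearconfig".
 * argv[0] is the subcommand. Returns 0 on success, -1 on a usage or state error. */
int wifi_command(wifi_dev_t *dev, int argc, const char *argv[])
{
    assert(dev != NULL);
    if (argc < 1) return -1;

    if (strcmp(argv[0], "configure") == 0) {
        if (argc != 3) return -1;
        if (strlen(argv[1]) >= sizeof dev->ssid) return -1;
        if (strlen(argv[2]) >= sizeof dev->password) return -1;
        strcpy(dev->ssid, argv[1]);
        strcpy(dev->password, argv[2]);
        dev->configured = true;
        return 0;
    }
    if (strcmp(argv[0], "connect") == 0) {
        if (!dev->configured) return -1;
        /* A real build would call into the CYW43/lwIP stack here. */
        dev->connected = true;
        return 0;
    }
    if (strcmp(argv[0], "disconnect") == 0) {
        dev->connected = false;
        return 0;
    }
    if (strcmp(argv[0], "clearconfig") == 0) {
        memset(dev, 0, sizeof *dev);  /* disconnect and clear configuration */
        return 0;
    }
    return -1;
}

/* Render the text that `cat wifi` would print. Returns the snprintf result. */
int wifi_status(const wifi_dev_t *dev, char *out, size_t len)
{
    assert(dev != NULL && out != NULL);
    if (!dev->connected)
        return snprintf(out, len, "Not connected\n");
    return snprintf(out, len, "Connected to : %s\nIP Address : %s\n",
                    dev->ssid, dev->ip);
}
```

Keeping the state in one struct means `clearconfig` can wipe it in a single step, which matches the "disconnect and clear configuration" behaviour described above.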

Thoughts?

mcknly commented 4 months ago

@Kintar could you expand more on

a service is the wrong way to go

Do you mean the stack should run outside of the servicemanager framework? Or outside of FreeRTOS completely?

In terms of your functions/commands - we are aligned, although in the future when we have more WiFi/networking/MQTT stuff, it may make sense to have a new CLI node (maybe net/?)

Kintar commented 4 months ago

@mcknly I meant that the stack should run outside of the servicemanager. When I first started looking at bbos, I was thinking WiFi would need to run as a service task, but the integration with freeRTOS is already provided in the Pico C SDK, so there's no need to implement a separate task thread for handling lwip callbacks and such.

100% agreed on your comment about a new CLI node in the future. For now I'm going to leave it in /dev unless you object.

Kintar commented 4 months ago

Well...it compiles and boots, but it now hangs when it tries to initialize the wifi chip. I've tried several configuration tweaks to no avail. I'm going to spin up a small FreeRTOS-based project and see if I can get it running without any of the other plumbing and maybe that will offer insight.

Kintar commented 4 months ago

Oh. My. God. >.< Okay, the solution is simple. Look for a commit in a few hours.

Kintar commented 4 months ago

Well...the solution I found worked in the pico FreeRTOS ping example, but not in BBOS. Still investigating.

Kintar commented 4 months ago

I'm so confused by this it's not even funny. I spent a great deal of time yesterday evening and today looking at this problem, and I can find no reasonable explanation. The issue is occurring in the cyw43_arch_init() function, and is related to the changes made to FreeRTOS-Kernel to support SMP on the mainline branch. After identifying all of the issues which prevented the pico SDK examples from compiling and running against the current main of FRTOS, I copied over the FRTOS config header and tried again with the same results; hard lockup during arch_init.

I then built a new FRTOS project from scratch and got it working and connecting on the pico_w, and tried copying the config file from bbos over to it...and it still works. O.o

At this point, I'm at a loss as to what the issue could be. Something is interfering with the proper initialization of the cyw43 module, but I'm baffled as to what. I think my next move is going to be to start bringing microshell and the other subsystems from bbos over to the new project and see how long it takes me to break the new project. That might give a better hint as to the culprit.

If you have alternative suggestions @mcknly, I'm all ears.

Kintar commented 4 months ago

I FINALLY found it. There was a call into onboard_led_init that I'd missed!

glennswest commented 4 months ago

Congrats. Those are a pain.

Kintar commented 4 months ago

SLOWLY, painfully getting closer. I'm now getting to the point of enabling station mode before everything locks up. At this point, I think I'm going to have to change the entire bootup sequence in order to validate what I'm doing. I know WiFi works with FreeRTOS, I've done it in two other from-scratch projects, plus the iperf and ping examples from the pico-examples project. There has to be something interfering. The thing that really baffles me is that the hardware watchdog isn't restarting the system, either, which is distressing.

glennswest commented 4 months ago

Hardware defines are all correct? And init for all the devices?

Kintar commented 4 months ago

Expect a lack of communication from me for a few days. Starting a new sprint at work, and getting together the parts to do a proper debug probe setup.

mcknly commented 4 months ago

@Kintar I appreciate your tenacity on this. Proper debugger support should hopefully help tremendously. I apologize for not giving you more support with a second set of eyes (and a JLink), been super busy with family + prepping for an 1100-mile bike ride. That said, I will also be mostly offline for the next week and a half, out somewhere on my bicycle. Thanks again for your work! Will check in mid July!

Kintar commented 4 months ago

No worries, @mcknly, I know how it goes. I've been lucky to have this much time to dedicate over the past week, honestly.

I finally started over yesterday evening and created my own little "smol-os" project to prototype with. I also FINALLY found my other pico, so I now have a picoprobe set up for SWD.

Couple that with work today that involved lots of thumb-twiddling while waiting on cloud deployment pipelines to run (<grin>), and I now have a working FreeRTOS base system that will boot the pico_w, set up UART comms single-threaded, then launch a task to configure the cyw43 arch and join WiFi while another task blinks the onboard LED. I've learned the following:

  1. The cyw43 initialization call must happen from a FreeRTOS task if other WiFi (or, eventually Bluetooth) related calls will happen from a FreeRTOS task. I found this buried in documentation somewhere and like a moron didn't write down the location, so now I can't find it again. -.- If someone runs across it, please let me know where it is so I can update my comments accordingly.
  2. Launching a FreeRTOS task to start configuring the cyw43 immediately after boot and UART enablement causes the board to lock more often than not. I've added a 1500ms delay after UART config, and the app now starts correctly every time.
  3. There's some kind of issue with authenticating to a WiFi network in the cyw43+lwip+freertos stack. First power-on of the pico always (so far, out of ~30 attempts) successfully joins the WiFi network. However, if I soft-reset the board or use the SWD connection to load new code and reset the chip, the first run of the code fails to join the WiFi with "bad auth" more often than not (probably 80% of the time). This is true even when the newly-flashed code is the same binary as the previously-running code.
    • Resetting the board by grounding the RUN pin for <2s generally does not resolve the issue
    • Resetting the board by grounding the RUN pin for >2s almost always allows the board to successfully join WiFi.
  4. Attempting to initialize the cyw43 chip (via cyw43_init_XXXX calls) generally (but not always) locks the system if other RTOS tasks are running. Allowing the init call to complete before launching other tasks seems to resolve this issue.

So, I have enough information (and a working debug probe) now that I should be able to get back to implementing this in bbos once I have free time again...but this presents a couple of new issues.

First, the startup process of BBOS will need to be modified when using a pico_w. In order to monitor boot status, we'll need UART immediately after power-on, but we must then launch ONE and ONLY one RTOS task to start the WiFi initialization process. Once that task gets past the cyw43 init call, we can start creating other RTOS tasks and continue startup as normal.
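The ordering described here can be sketched with plain function hooks, so the sequence can be checked on a desktop before wiring it to hardware. Everything named below (`boot_hooks_t`, `boot_early`, `boot_wifi_task`, the `demo_*` stand-ins) is hypothetical. On a Pico W the hooks would presumably wrap the real UART setup, a delay, `xTaskCreate`, and `cyw43_arch_init()`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hooks; on target they would wrap the real SDK/RTOS calls. */
typedef struct {
    void (*uart_init)(void);         /* bring up UART for boot messages */
    void (*delay_ms)(unsigned ms);   /* settle time before touching the radio */
    void (*spawn_wifi_task)(void);   /* create exactly one RTOS task */
    int  (*wifi_init)(void);         /* e.g. wraps cyw43_arch_init(); 0 = ok */
    void (*spawn_other_tasks)(void); /* create the rest of the RTOS tasks */
} boot_hooks_t;

/* Runs in main() before the scheduler starts. */
void boot_early(const boot_hooks_t *h, unsigned settle_ms)
{
    assert(h != NULL);
    h->uart_init();
    h->delay_ms(settle_ms);
    h->spawn_wifi_task();
}

/* Runs inside the single WiFi startup task. Returns 0 on success; on failure
 * the remaining tasks are deliberately left unstarted. */
int boot_wifi_task(const boot_hooks_t *h)
{
    assert(h != NULL);
    int err = h->wifi_init();
    if (err != 0)
        return err;
    h->spawn_other_tasks();
    return 0;
}

/* Desktop stand-ins that only record the order in which they were called. */
char boot_trace[16];
size_t boot_trace_len;

static void trace(char c)
{
    if (boot_trace_len < sizeof boot_trace - 1)
        boot_trace[boot_trace_len++] = c;
}

void demo_uart_init(void)         { trace('U'); }
void demo_delay_ms(unsigned ms)   { (void)ms; trace('D'); }
void demo_spawn_wifi_task(void)   { trace('W'); }
int  demo_wifi_init(void)         { trace('I'); return 0; }
void demo_spawn_other_tasks(void) { trace('O'); }

const boot_hooks_t demo_hooks = {
    demo_uart_init, demo_delay_ms, demo_spawn_wifi_task,
    demo_wifi_init, demo_spawn_other_tasks
};
```

Calling `boot_early(&demo_hooks, 1500)` followed by `boot_wifi_task(&demo_hooks)` leaves the call order in `boot_trace`, which makes it easy to spot if a later change lets another task start before the radio init has returned.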

Second, the mysterious "bad auth" errors worry me. I'm hopeful that there's just something screwy with my pico_w board, but it's hardly been used at all so that seems unlikely. I'd really like for other folks with access to a pico_w to try the code I'm using in my experiments and see if their experience is the same. If this really is something in the wifi stack, it means we'll need to put in a fairly robust retry system, and give lots of feedback to the end user.
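If a retry system does turn out to be necessary, one possible shape is a bounded loop with a doubling delay and a log line per attempt. This is only a sketch: the hook types, `wifi_join_with_retry`, and the `fake_*` stand-ins are hypothetical, and the error value is a placeholder. On target the join hook would presumably wrap something like `cyw43_arch_wifi_connect_timeout_ms()`.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical hooks so the loop can run on a desktop as well as on target. */
typedef int  (*join_fn)(void *ctx);               /* 0 on success, nonzero on failure */
typedef void (*delay_fn)(unsigned ms, void *ctx);

/* Try to join up to max_attempts times, doubling the wait after each failure
 * (capped at max_delay_ms). Returns the attempt number that succeeded, or 0. */
unsigned wifi_join_with_retry(join_fn join, delay_fn delay, void *ctx,
                              unsigned max_attempts, unsigned first_delay_ms,
                              unsigned max_delay_ms)
{
    assert(join != NULL && delay != NULL);
    unsigned wait_ms = first_delay_ms;
    if (wait_ms > max_delay_ms)
        wait_ms = max_delay_ms;

    for (unsigned attempt = 1; attempt <= max_attempts; attempt++) {
        int err = join(ctx);
        if (err == 0) {
            printf("wifi: joined on attempt %u\n", attempt);
            return attempt;
        }
        printf("wifi: attempt %u/%u failed (err %d)\n", attempt, max_attempts, err);
        if (attempt < max_attempts) {
            delay(wait_ms, ctx);
            /* Double the wait without overshooting the cap. */
            wait_ms = (wait_ms > max_delay_ms / 2) ? max_delay_ms : wait_ms * 2;
        }
    }
    return 0;
}

/* Desktop stand-ins: fail a set number of times, then succeed. */
typedef struct {
    int failures_left;
    unsigned total_wait_ms;
} fake_radio_t;

int fake_join(void *ctx)
{
    fake_radio_t *r = ctx;
    if (r->failures_left > 0) {
        r->failures_left--;
        return -2;  /* placeholder error code */
    }
    return 0;
}

void fake_delay(unsigned ms, void *ctx)
{
    ((fake_radio_t *)ctx)->total_wait_ms += ms;
}
```

Passing the join and delay steps in as hooks keeps the policy (attempt count, backoff, user feedback) separate from the radio calls, so the same loop could be exercised against a board that shows the "bad auth" behaviour and one that does not.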

Third...I've honestly forgotten what my third topic was because I had a meeting in the middle of typing the second point and have completely lost my train of thought. =)

As always, I'd love to hear other people's thoughts on this stuff. I'll get my experiment project cleaned up and link it in a comment as soon as I won't be embarrassed to have other people read it. :D

@mcknly : Good luck on your bike...race? Ride? Self-imposed penance tour? ;) Talk to you mid-July!

mcknly commented 4 months ago

Being an embedded hardware designer by trade this all sounds suspiciously like the peculiarities you might see when you have two MCUs that aren't playing nice together. Possibly a race condition at boot... State machines getting out of whack when one chip resets but the other does not.... Maybe the CYW43439 is still connected/authenticated after RP2040 reboot causing authentication errors? And then there's this interesting note in the datasheet:

Due to pin limitations, some of the wireless interface pins are shared. The CLK is shared with VSYS monitor, so only when there isn’t an SPI transaction in progress can VSYS be read via the ADC. The Infineon CYW43439 DIN/DOUT and IRQ all share one pin on the RP2040. Only when an SPI transaction isn’t in progress is it suitable to check for IRQs.

I haven't dug into the schematics yet...

mcknly commented 4 months ago

@Kintar

Self-imposed penance tour?

Nailed it. Voluntary suffering planned with a small group of like-minded idiots.

fatdollar commented 4 months ago

@Kintar I have had a couple of busy weeks but I finally got time to pull down and test some things or help out. I'll be looking at your pull request but I'm wondering: you have it labelled as NON FUNCTIONAL can you just summarize where it's at?

Again I haven't looked at it yet but just wanted to get up to speed on where you left it when you did the pull request.

Thanks. Great work btw.

Kintar commented 4 months ago

you have it labelled as NON FUNCTIONAL can you just summarize where it's at?

The summary is in the text of the PR. The short version is that it connects to WiFi most of the time, but the RTOS system hangs when it starts to initialize the NVM system (littlefs, if I recall; I'm not at my computer at the moment).

fatdollar commented 4 months ago

Oh gotcha I'll take a look thanks!

mcknly commented 3 months ago

@Kintar had a few brief moments to test the branch today. I see the issues you have mentioned - I can get it to connect at first powerup, but for subsequent boots I need to pull power completely and hard reboot the entire board. I've had a look at the Pico_W schematics (here), looks like there is a WL_ON signal on GPIO23 that should force a power cycle on the CYW43. Not sure if there is a function in the SDK for this. We should figure out the best way to toggle and force a reset at boot before wireless init. That will hopefully clear up some problems.

mcknly commented 3 months ago

Did some debugging. I added a function to toggle CYW43 GPIO23 to force POR, but this did not clear up the problems. Took a deeper look with my JLink. There is an issue where the RTOS scheduler is hanging at init - see below:

[image]
It is hanging on a multicore FIFO data transfer (Pico SDK multicore.c file):
[image]

So, I disabled all the multicore/SMP stuff in FreeRTOSConfig.h, and lo and behold I can get it to connect at boot, and subsequently connect at every soft reboot without needing a hard power cycle. However - I was also running into a hard fault in the idle task after the WiFi connects - I increased the stack size for the WiFi task to 1024 and that seems to make it happy.

My thought is that anything multicore-related should be disabled for now, since the rest of the project does not leverage core1 at all.
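A minimal sketch of what a single-core configuration might look like. The thread does not say which options were changed on the branch, so the macros below are assumptions. `configNUMBER_OF_CORES` is the name used by recent FreeRTOS-Kernel releases with SMP merged; older SMP branches used `configNUM_CORES` instead.

```c
/* FreeRTOSConfig.h (fragment). Assumed settings, not copied from the branch. */
#define configNUMBER_OF_CORES    1  /* schedule tasks on one core only */
#define configUSE_CORE_AFFINITY  0  /* affinity only matters with more than one core */
```

With the scheduler confined to one core, core1 is never launched by the RTOS port, which would be consistent with the hang in the multicore FIFO transfer going away.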

Changes are pushed up.

Kintar commented 3 months ago

Oh, wow. That STINKS, because the project I'm working on that brought me to BBOS needs multicore support. :( It's also very strange, because the project I built to test WiFi+FreeRTOS without BBOS was working fine.

I really need to figure out why my GDB breakpoints have stopped working. I've mucked up something config-wise with my picoprobe, and just haven't had the time to figure out what.

Kintar commented 3 months ago

@mcknly : Did you do a squash-merge when you merged PR #22? I see commit 0e74b9 was merged when that PR was closed, and it's not something I have on my branch. Attempting to merge or rebase my 20-wifi-support with yours is throwing all sorts of conflicts, which I wouldn't have expected if this was a full merge of the PR.

Kintar commented 3 months ago

@mcknly : Okay, never mind. I see what's going on. I'm just not used to the "github way" of merging PRs.

mcknly commented 3 months ago

@Kintar

Oh, wow. That STINKS, because the project I'm working on that brought me to BBOS needs multicore support.

Can you give any more detail on how you intend to use multicore? FreeRTOS on one? Both? With AMP or SMP? Maybe there is an easy fix. SMP with support for the RP2040 port should be in mainline and working these days, I've just never messed with it. I'm hoping that the fact it just hangs when trying to initialize with the SMP options means there is something easy config-wise that we are missing.

Kintar commented 3 months ago

FreeRTOS in SMP on both cores of a Pico. One core will be mainly concerned with an arbitrary waveform generator that's pinned to that core, while the other will be performing sensor-related tasks and dropping messages into a process queue to alter the output from the AWG.

I don't really need BBOS for this project, it was just going to be a convenient way to interact with the system while I'm tweaking parameters and doing performance profiling.

mcknly commented 3 months ago

it was just going to be a convenient way to interact with the system while I'm tweaking parameters and doing performance profiling

Exactly the purpose of the project! Let's figure this out. I will do my best to support. I really wanted to release with some form of multicore functionality but it was beyond scope at the time. Let me know if you are willing to discuss how to tackle this and we can break it into another issue.

Kintar commented 3 months ago

@mcknly I'm back from vacation, so absolutely. Let's talk about it!