adafruit / circuitpython

CircuitPython - a Python implementation for teaching coding with microcontrollers
https://circuitpython.org
Other
4.09k stars 1.22k forks source link

Can't re-start wifi AP on Pico W #8718

Closed anecdata closed 9 months ago

anecdata commented 10 months ago

CircuitPython version

Adafruit CircuitPython 9.0.0-alpha.6 on 2023-12-12; Raspberry Pi Pico W with rp2040

Code/REPL

import time
import os
import wifi
import traceback

AP_SSID = "Bob"
AP_PASSWORD = "YourUncle"
DELAY = 5

time.sleep(3)  # wait for serial
print("="*25)

# start clean
if wifi.radio.connected or wifi.radio.ipv4_address:
    wifi.radio.stop_station()
if wifi.radio.ap_active or wifi.radio.ipv4_address_ap:
    wifi.radio.stop_ap()
while True:
    print("Starting AP... ", end="")
    try:
        wifi.radio.start_ap(ssid=AP_SSID, password=AP_PASSWORD)
    except Exception as ex:
        traceback.print_exception(ex, ex, ex.__traceback__)
    time.sleep(0.1)
    print(f"{wifi.radio.ipv4_address_ap=}")
    time.sleep(DELAY)
    print("Stopping AP... ", end="")
    try:
        wifi.radio.stop_ap()
    except Exception as ex:
        traceback.print_exception(ex, ex, ex.__traceback__)
    time.sleep(0.1)
    print(f"{wifi.radio.ipv4_address_ap=}")
    time.sleep(DELAY)

Behavior

9.0.0-alpha.6 is the first release to have both #8590 and #8326, but there seems to be a regression. Under 8.2.8 and 8.2.9 (with 8590, but not with 8326), the code above runs fine. But under 9.0.0-alpha.6, I can't find a sequence that lets me start an AP a second time:

Adafruit CircuitPython 9.0.0-alpha.6 on 2023-12-12; Raspberry Pi Pico W with rp2040
>>> import microcontroller ; microcontroller.on_next_reset(microcontroller.RunMode.NORMAL) ; microcontroller.reset()

[17:19:41.986] Disconnected
[17:19:42.991] Warning: Could not open tty device (No such file or directory)
[17:19:42.991] Waiting for tty device..
[17:19:46.009] Connected
=========================
Starting AP... wifi.radio.ipv4_address_ap=192.168.4.1
Stopping AP... wifi.radio.ipv4_address_ap=None
Starting AP... Traceback (most recent call last):
  File "code.py", line 21, in <module>
RuntimeError: AP could not be started
wifi.radio.ipv4_address_ap=None

...ad infinitum

Description

No response

Additional information

Code runs fine on espressif, but note that a delay is needed after start_ap() for ipv4_address_ap to become active.

anecdata commented 10 months ago

I'm baffled.

8326 - merged 2 commits into adafruit:main from bill88t:picow-stop-ap on Aug 25

8590 - merged 1 commit into adafruit:8.2.x from eightycc:8138 on Nov 13

Release 9.0.0-alpha.6 release notes show

Port and board-specific changes RP2040 • Add wifi.stop_ap(). https://github.com/adafruit/circuitpython/pull/8622, https://github.com/adafruit/circuitpython/pull/8590

8622 does seem to include:

Implement stop_ap for RP2 1451d77

But there would have been some kind of merge conflict? My SDK knowledge is minimal, not sure what each of these are really doing, but features of both are needed: #8326 allows station and AP to run independently or concurrently and allows stopping AP (once). #8590 allows AP to be stopped in a way that it can be started again.

8590:

void common_hal_wifi_radio_stop_ap(wifi_radio_obj_t *self) {
    if (!common_hal_wifi_radio_get_enabled(self)) {
        mp_raise_RuntimeError(translate("wifi is not enabled"));
    }

    if (cyw43_tcpip_link_status(&cyw43_state, CYW43_ITF_AP) != CYW43_LINK_DOWN) {       // Disassociate from WLAN
        cyw43_wifi_leave(&cyw43_state, CYW43_ITF_AP);
        // Stop AP
        cyw43_wifi_set_up(&cyw43_state, CYW43_ITF_AP, false, 0);
        bindings_cyw43_wifi_enforce_pm();
    }
}

8326:

void common_hal_wifi_radio_stop_ap(wifi_radio_obj_t *self) {
    if (!common_hal_wifi_radio_get_enabled(self)) {
        mp_raise_RuntimeError(translate("wifi is not enabled"));
    }

    cyw43_arch_disable_ap_mode();

    const size_t timeout_ms = 500;
    uint64_t start = port_get_raw_ticks(NULL);
    uint64_t deadline = start + timeout_ms;
    while (port_get_raw_ticks(NULL) < deadline && (cyw43_tcpip_link_status(&cyw43_state, CYW43_ITF_AP) != CYW43_LINK_DOWN)) {
        RUN_BACKGROUND_TASKS;
        if (mp_hal_is_interrupted()) {
            break;
        }
    }

    bindings_cyw43_wifi_enforce_pm();
}

8622 1451d77 matches #8590:

EXPAND... ```c void common_hal_wifi_radio_stop_ap(wifi_radio_obj_t *self) { if (!common_hal_wifi_radio_get_enabled(self)) { mp_raise_RuntimeError(translate("wifi is not enabled")); } if (cyw43_tcpip_link_status(&cyw43_state, CYW43_ITF_AP) != CYW43_LINK_DOWN) { // Disassociate from WLAN cyw43_wifi_leave(&cyw43_state, CYW43_ITF_AP); // Stop AP cyw43_wifi_set_up(&cyw43_state, CYW43_ITF_AP, false, 0); bindings_cyw43_wifi_enforce_pm(); } } ```

...but 9.0.0-alpha.6 matches #8326:

EXPAND... ```c void common_hal_wifi_radio_stop_ap(wifi_radio_obj_t *self) { if (!common_hal_wifi_radio_get_enabled(self)) { mp_raise_RuntimeError(MP_ERROR_TEXT("wifi is not enabled")); } cyw43_arch_disable_ap_mode(); const size_t timeout_ms = 500; uint64_t start = port_get_raw_ticks(NULL); uint64_t deadline = start + timeout_ms; while (port_get_raw_ticks(NULL) < deadline && (cyw43_tcpip_link_status(&cyw43_state, CYW43_ITF_AP) != CYW43_LINK_DOWN)) { RUN_BACKGROUND_TASKS; if (mp_hal_is_interrupted()) { break; } } bindings_cyw43_wifi_enforce_pm(); } ```

...so the merge was dropped because it conflicted?

eightycc commented 10 months ago

Looks like #8326 isn't stopping the AP. That's what the call to cyw43_wifi_set_up(..., false, 0) in #8590 does.

eightycc commented 10 months ago

@anecdata You're right. We need elements of both for this to work correctly. I'll submit a pull after lunch, will you be willing to test it?

anecdata commented 10 months ago

Yes!

eightycc commented 10 months ago

Hmmm. Problem is deeper than I thought. The call to cyw43_arch_disable_ap_mode() in #8326 is calling cyw43_wifi_set_up(..., false, ...) so that's not the error. There's a recurring bit of code in #8326:

    while (port_get_raw_ticks(NULL) < deadline && (cyw43_tcpip_link_status(&cyw43_state, CYW43_ITF_STA) != CYW43_LINK_xxxx)) {
        RUN_BACKGROUND_TASKS;
        if (mp_hal_is_interrupted()) {
            break;
        }
    }

that looks like it delays but never actually times out if what it's waiting for fails to complete. Maybe related?

I'm setting up some debug tracing for a closer look now.

There is a TODO in common_hal_wifi_radio_start_ap() that describes just this problem:

    /* TODO: If the AP is stopped once it cannot be restarted.
     * This means that if if the user does:
     *
     * wifi.radio.start_ap(...)
     * wifi.radio.stop_ap()
     * wifi.radio.start_ap(...)
     *
     * The second start_ap will fail.
     */
eightycc commented 10 months ago

Prior to #8326 common_hal_wifi_radio_start_ap() was not testing cyw43_tcpip_link_status() after calling cyw43_arch_enable_ap_mode() so with #8590 applied it appeared that the AP could be started, stopped, and then started again successfully. #8326 adds testing of cyw43_tcpip_link_status() which returns CYW43_LINK_DOWN for the second start attempt, resulting in the RuntimeError: AP could not be started exception.

The CYW43_LINK_DOWN status is returned because the CYW43 driver's status flags for the AP interface indicates that it is up, but it has not received a link up indication from the CYW43439. To narrow this down I'm activating tracing inside the CYW43 driver which I now suspect is at fault.

I have updated the CYW43 driver to tag v1.0.2, but that does not resolve the problem.

eightycc commented 10 months ago

Tracing asynchronous events from the CYW43439 shows that it is reporting that the AP is up on the second start attempt. So, there's a status reporting problem in the CYW43 driver.

Starting AP... 
[    6409] ASYNC(0001,LINK,0,0,1)  // CYW43_EV_LINK, link is up
[    6411] ASYNC(0001,LINK,0,0,1)  // CYW43_EV_LINK, link is up
AP tcpip status: 3  // CYW43_LINK_UP
AP wifi status: 0
        ...wifi.radio.ipv4_address_ap=192.168.4.1
Stopping AP... 
[   11573] ASYNC(0000,LINK,0,4,1)  // CYW43_EV_LINK, link is down
        ...wifi.radio.ipv4_address_ap=None
Starting AP... 
[   16803] ASYNC(0001,LINK,0,0,1)  // CYW43_EV_LINK, link is up
[   16805] ASYNC(0001,LINK,0,0,1)  // CYW43_EV_LINK, link is up
AP tcpip status: 0  // CYW43_LINK_DOWN reported in error
AP wifi status: 0
[   17413] ASYNC(0000,LINK,0,4,1)
Traceback (most recent call last):
  File "code.py", line 21, in <module>
RuntimeError: AP could not be started
eightycc commented 10 months ago

Not a status reporting error in the CYW43 driver. The interface flags are getting clobbered due to the AP interface being initialized (possibly in error) after the CYW43439 has brought the link up. Here's what happens on the second AP start:

eightycc commented 10 months ago

There is a race condition in the CYW43 driver that is the root cause of this error. I cannot find a workaround, so I'll report it to the CYW43 driver project and we'll have to wait for them to issue a fix.

On the first start of the AP, the CYW43439 takes longer to bring the link up and signal asynchronously than it does on subsequent attempts. For the first AP start cyw43_wifi_setup() is able to complete CYW43_cb_tcpip_init() before the CYW43439 signals that the link is up, so NETIF_FLAG_LINK_UP does not get clobbered. On subsequent starts of the AP, the CYW43439 signals that the link is up almost immediately so CYW43_cb_tcpip_init() clears NETIF_FLAG_LINK_UP after it has already been set.

eightycc commented 10 months ago

Submitted pulls georgerobotics/cyw43-driver#107 and georgerobotics/cyw43-driver#108 to resolve this issue and allow the latest cyw43-driver to build in CircuitPython, respectively.

eightycc commented 10 months ago

@dhalbert Please tag this 3rd party. We will need to create a pull to update cyw43-driver once the above pulls are accepted.

anecdata commented 9 months ago

Still an issue in beta.0 Adafruit CircuitPython 9.0.0-beta.0 on 2024-01-27; Raspberry Pi Pico W with rp2040.

Test Code... ```py import time import os import wifi import traceback AP_SSID = "Bob" AP_PASSWORD = "YourUncle" ITERATIONS = 5 DELAY = 3 time.sleep(3) # wait for serial print("="*25) def connect(): while not wifi.radio.connected: try: wifi.radio.connect(os.getenv("WIFI_SSID"), os.getenv("WIFI_PASSWORD")) except Exception as ex: traceback.print_exception(ex, ex, ex.__traceback__) time.sleep(1) def start_station(): while wifi.radio.connected: print(".", end="") time.sleep(1) print("Starting station...") wifi.radio.start_station() print("Connecting...") connect() print(f" {wifi.radio.ipv4_address=}") time.sleep(DELAY) def stop_station(): print("Stopping Station... ", end="") try: wifi.radio.stop_station() except Exception as ex: traceback.print_exception(ex, ex, ex.__traceback__) print(f"{wifi.radio.ipv4_address=}") def start_ap(): while wifi.radio.ap_active: print(".", end="") time.sleep(1) print("Starting AP... ", end="") try: wifi.radio.start_ap(ssid=AP_SSID, password=AP_PASSWORD) except Exception as ex: traceback.print_exception(ex, ex, ex.__traceback__) print(f"{wifi.radio.ipv4_address_ap=}") time.sleep(DELAY) def stop_ap(): print("Stopping AP... ", end="") wifi.radio.stop_ap() print(f"{wifi.radio.ipv4_address_ap=}") # start clean if wifi.radio.connected or wifi.radio.ipv4_address: wifi.radio.stop_station() if (wifi.radio.ap_active) or (wifi.radio.ipv4_address_ap): wifi.radio.stop_ap() while True: print() # Station toggling print("-"*25) for _ in range(0, ITERATIONS): start_station() stop_station() # AP toggling print("-"*25) for _ in range(0, ITERATIONS): start_ap() stop_ap() # Station & AP alternating print("-"*25) for _ in range(0, ITERATIONS): start_station() stop_station() start_ap() stop_ap() # Station + AP print("-"*25) for _ in range(0, ITERATIONS): start_station() start_ap() stop_ap() stop_station() start_ap() start_station() stop_station() stop_ap() start_station() start_ap() stop_station() stop_ap() start_ap() start_station() stop_ap() stop_station() ```

Result...

...
Starting AP...      wifi.radio.ipv4_address_ap=192.168.4.1
Stopping AP...      wifi.radio.ipv4_address_ap=None
Starting AP...      Traceback (most recent call last):
  File "code.py", line 48, in start_ap
RuntimeError: AP could not be started
wifi.radio.ipv4_address_ap=None
...
eightycc commented 9 months ago

@anecdata We need to pick up our local cyw43-driver fork since we're waiting for the fix to be integrated upstream.

anecdata commented 9 months ago

Oh, I saw that fork change but thought it would be included with main. Is there a way to release with the driver fork, until the upstream change is accepted?

anecdata commented 9 months ago

This build works: https://github.com/adafruit/circuitpython/issues/8858#issuecomment-1917638782