linux4wilc / firmware

Firmware binaries for Microchip ATWILC Wireless Devices (ATWILC1000 & ATWILC3000)
https://www.microchip.com/wwwproducts/en/ATWILC1000

Various errors after integrating wilc3000 with Allwinner A20 CPU (station only) #9

Closed goesterr closed 4 years ago

goesterr commented 5 years ago

Hello,

I have to integrate the wilc3000 on a quite old kernel (3.4.90) because of a customer's specification. The interface is SDIO, the processor is an Allwinner A20. I managed to backport the driver to run with that kernel in non-OOB mode with a 50 MHz clock on the SDIO bus, arranged handling of the reset/CE connection etc., and everything is OK up to and including connecting to one of the existing WLANs and starting some IP traffic like ping or wget/wput. But at a certain point (after minutes or hours) different errors occur that leave the driver running endlessly in the debug task, without any further successful communication with the firmware, which reports at startup:

(0)MAC Hardware version : 0.1.04.0E.00
(1)MAC Firmware version : WILC_WIFI_FW_REL_15_2_1 Build: 11257
(1)RF Version: 01.1
(2)Built at May 14 2019 14:25:33

I can do nothing else but reboot then, since even removing the linux module (the driver is compiled as a module) gets stuck.

Often these problems show up in the firmware log as many messages about a misaligned/illegal ISR handle, corrupted buffers, or allocation errors. Sometimes there is no error in the firmware log at all; I just get failures in Linux such as "Get Timed out" for config packets, as if the firmware had stopped answering, although it continues its normal periodic logging, for example:
ar_stats_excl_dr0: 0 26 (0%)
rx_stats: 13 67 (0%)

While operation seems normal, I also see the following unclear logging:
ar_stats_excl_dr0: 0 31 (0%)
rx_stats: 7 72 (9%)
(22131)Err : 1554 30 5500
(22131)unable to DOZE: tbtt is sooner than the sleeping overhead
ar_stats_excl_dr0: 0 26 (0%)
rx_stats: 13 67 (19%)
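
As a debugging aid, the periodic statistics entries can be picked out of a captured firmware log, for example to check that the firmware is still logging while config packets time out. The following Python sketch assumes only the `name: number number (percent%)` shape visible in the excerpts above. The meaning of the two counters is not explained in this thread, so the field names `first`, `second` and `percent` are placeholders, not firmware terminology.

```python
import re
from typing import List, NamedTuple


class StatLine(NamedTuple):
    name: str      # e.g. "rx_stats"
    first: int     # first counter (meaning not documented here)
    second: int    # second counter (meaning not documented here)
    percent: int   # value printed inside the parentheses


# Matches entries such as "rx_stats: 7 72 (9%)".
_STAT_RE = re.compile(r"(\w+):\s+(\d+)\s+(\d+)\s+\((\d+)%\)")


def parse_stats(log_text: str) -> List[StatLine]:
    """Return every statistics entry found in log_text, in order.

    Other firmware messages (e.g. "(22131)Err : 1554 30 5500") do not
    match the pattern and are skipped.
    """
    return [
        StatLine(m.group(1), int(m.group(2)), int(m.group(3)), int(m.group(4)))
        for m in _STAT_RE.finditer(log_text)
    ]
```

Because the pattern is searched anywhere in the text, the function works both on a log with one entry per line and on a capture where several entries ended up on the same line.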

I think that parallel requests to the SDIO layer make things worse, e.g. IP traffic together with cfg packets caused by the periodic statistics retrieval, or manually switching to another AP via wpa_supplicant.

After working on this for 2 months now, I am quite desperate, also because of the increasing pressure from my customer's side. The project expects to sell some thousand units per year.

What I tried, besides adding logging in the Linux driver and the SDIO host code: different preemptive/non-preemptive kernel configurations, adding a "wait-card-ready" sequence for the specific host controller, disabling the periodic statistics retrieval in the driver, etc. None of this really changed the situation.

I would be happy for any advice on how to proceed.

AdhamAbozaeid commented 5 years ago

Hi @goesterr The parallelism of requests to the SDIO layer should not be a problem, since the driver protects the bus access. The log you added is also OK; it only means that at one point the FW didn't get enough time to sleep before the next beacon, so it stayed awake. One thing you can try is to disable powersave and see if that makes a difference. If that doesn't work, please open a ticket on Salesforce and attach the driver and FW logs, and the team will help you more there.
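
One common way to try the powersave suggestion from user space is the `iw` tool (`iw dev <iface> set power_save off`). The Python sketch below only wraps that command. It assumes that `iw` is installed on the target, that the backported driver honours the nl80211 power-save request on this kernel, and that the interface is called `wlan0`; none of these is confirmed in this thread.

```python
import subprocess
from typing import List


def power_save_command(iface: str, enabled: bool) -> List[str]:
    """Build the iw command line that switches 802.11 power save on or off."""
    return ["iw", "dev", iface, "set", "power_save", "on" if enabled else "off"]


def set_power_save(iface: str, enabled: bool) -> None:
    """Run the command; needs root privileges and the iw tool on the target.

    Raises subprocess.CalledProcessError if iw reports a failure.
    """
    subprocess.run(power_save_command(iface, enabled), check=True)


# Example (the interface name "wlan0" is an assumption):
#     set_power_save("wlan0", False)
```

Running `iw dev wlan0 get power_save` afterwards shows whether the setting was accepted.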

goesterr commented 5 years ago

Hi @AdhamAbozaeid Firstly: sorry for not answering sooner. I was busy with other parts of the project, among them a customer demo with VoIP over the atwilc3000, which I managed by reducing the SDIO clock to 5 MHz instead of 50 MHz. This 'cured' the issues I reported (our board design is professional, but due to physical constraints the SDIO bus is approx. 1.5 inch long). So your proposal about disabling powersave, and other issues with my driver concerning orderly restart after bus failures, are still waiting for implementation. Meanwhile another topic needed to be clarified, as we also need the Bluetooth of the atwilc3000 for initial configuration by Android/iOS devices (e.g. selecting the WLAN SSID from a list and passing the password).

After 4 days of digging through literally tons of material, while desperately trying to pair the atwilc3000 as slave (peripheral), I stumbled upon your post https://github.com/linux4wilc/driver/issues/57#issuecomment-510574150 , where you clearly state that BT classic is not supported, which explained nearly all my experiences. Yes, I can lescan the surroundings, I can also set my own board address, make the module visible (advertise) under a self-specified name (to Win10 and Android), and can also make iBeacons visible in Android's BLE Scanner app. Problems always started when trying to pair with, or request ReadByGroup from, these devices (tried several versions of bluez5).
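
For reference, advertising under a self-chosen name comes down to a small payload of AD structures, each laid out as length, type, data. The Python sketch below builds a legacy advertising payload (at most 31 bytes) with a Flags field marking the device as LE-only and a Complete Local Name field, and pads it into the parameter block of the HCI LE Set Advertising Data command (OGF 0x08, OCF 0x0008). It is an illustration of the payload format only, not the method used in this thread; the function names are hypothetical.

```python
# AD types and flag bits from the Bluetooth assigned numbers.
AD_TYPE_FLAGS = 0x01
AD_TYPE_COMPLETE_LOCAL_NAME = 0x09
FLAG_LE_GENERAL_DISCOVERABLE = 0x02
FLAG_BR_EDR_NOT_SUPPORTED = 0x04
MAX_LEGACY_ADV_LEN = 31


def ad_structure(ad_type: int, payload: bytes) -> bytes:
    """One AD structure: length byte (type + payload), type byte, payload."""
    return bytes([len(payload) + 1, ad_type]) + payload


def build_adv_data(name: str) -> bytes:
    """Flags (LE-only, general discoverable) followed by the complete name."""
    flags = FLAG_LE_GENERAL_DISCOVERABLE | FLAG_BR_EDR_NOT_SUPPORTED
    data = ad_structure(AD_TYPE_FLAGS, bytes([flags]))
    data += ad_structure(AD_TYPE_COMPLETE_LOCAL_NAME, name.encode("utf-8"))
    if len(data) > MAX_LEGACY_ADV_LEN:
        raise ValueError("advertising data exceeds 31 bytes; shorten the name")
    return data


def hci_set_adv_data_params(adv: bytes) -> bytes:
    """Parameter block for LE Set Advertising Data: length + 31 padded bytes."""
    if len(adv) > MAX_LEGACY_ADV_LEN:
        raise ValueError("advertising data exceeds 31 bytes")
    return bytes([len(adv)]) + adv.ljust(MAX_LEGACY_ADV_LEN, b"\x00")
```

With the 3-byte Flags field and the 2-byte header of the name field, 26 bytes remain for the UTF-8 encoded name.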

With the new realisation that it is BLE-only, I have the following question: the user guide DS70005238B, on page 29 ff., clearly refers to BT classic, e.g. for scanning, or in the output of hciconfig -a on page 30/31, which in my case stops before the 'Name' output with an 'Unknown HCI Command' status for the ReadLocalName command. How is this contradiction to be explained? Other sources also imply, to my current understanding, Bluetooth classic support, like the rfcomm samples in https://github.com/atwilc3000/sample/tree/master/Bluetooth. I initially planned to use rfcomm for our purpose. Is my assumption correct that I will not be able to use rfcomm with the atwilc3000 and should try other BLE options (e.g. GATT) instead?

Thank you in advance for again clarifying things! With friendly greetings, goesterr

AdhamAbozaeid commented 5 years ago

Hi Goesterr

BT classic is no longer supported by WILC3000, so your understanding is correct. I believe you are referring to an old version of the user guide, so please use the latest one from here: https://www.microchip.com/wwwproducts/en/ATWILC3000. Please also note that https://github.com/atwilc3000/ is now obsolete as well, as marked on its homepage. Hope that clarifies the contradiction for you.

For the SDIO errors, the fact that it worked for you at a lower frequency points more to noisy or long traces on the board, as you mentioned, since the lines are more immune to noise at lower frequencies. Hence, I'm not very confident that my suggestion about disabling powersave would help, but it's still worth a try.