prplfoundation / prplMesh

This repository moved to https://gitlab.com/prpl-foundation/prplmesh/prplMesh

[BUG] Mediatek device stops beaconing after some time #1010

Closed rmelotte closed 4 years ago

rmelotte commented 4 years ago

The mediatek device used for the certification tests in CI sometimes stops beaconing completely.

When it happens, the error message is usually pretty specific. For example:

No Beacon frame found, check CTT Agent1 is operating on the configured channel

A reboot fixes it.

It has happened twice in the last 2 days.

Example of a job where it happens: https://gitlab.com/prpl-foundation/prplMesh/-/jobs/483434203

rmelotte commented 4 years ago

Same issue here: https://gitlab.com/prpl-foundation/prplMesh/-/jobs/484791794/artifacts/browse/logs/FAIL/MAP-4.8.3_ETH_FH5GL/

In the UCC logs, there is always this (note the N/A reply):

INFO     MediatekAGT (127.0.0.1:9002) ---> dev_get_parameter,program,map,ruid,0x000000000001,ssid,Multi-AP-1,parameter,macaddr
INFO     MediatekAGT (127.0.0.1:9002) <-- status,COMPLETE,macaddr,N/A
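
For anyone scanning UCC logs for this symptom, here is a minimal Python sketch that flags a reply whose macaddr field is not a MAC address. It assumes replies are flat name,value pairs after the "<--" arrow, as in the lines above, and that a valid MAC is printed colon-separated; the function names are made up for this example and are not part of the UCC or the sigma agent.

```python
import re

# Assumption: a valid reply carries a colon-separated MAC address.
MAC_RE = re.compile(r"(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}")


def parse_reply(log_line):
    """Turn 'status,COMPLETE,macaddr,N/A' into {'status': 'COMPLETE', 'macaddr': 'N/A'}."""
    # Keep only the payload after the "<--" arrow, if the log prefix is present.
    payload = log_line.split("<--", 1)[-1].strip()
    fields = payload.split(",")
    # Fields alternate between a name and its value.
    return dict(zip(fields[0::2], fields[1::2]))


def macaddr_missing(log_line):
    """Return True when the reply has no usable macaddr (for example 'N/A')."""
    mac = parse_reply(log_line).get("macaddr", "")
    return MAC_RE.fullmatch(mac) is None
```

Running macaddr_missing over every "<--" line that follows a dev_get_parameter request for macaddr would point at the jobs hit by this issue.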

rmelotte commented 4 years ago

The mediatek sigma agent issues the following command on the device:

wappctrl ra0 map get_macaddr Multi-AP-1 000000000001

This is supposed to get the MAC of the radio that has the SSID "Multi-AP-1" and the RUID "0x000000000001". The resulting MAC is supposed to be in /tmp/map_macaddr.txt on the device.
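
A rough Python sketch of checking that result file, for illustration only. The exact format of /tmp/map_macaddr.txt is not shown in this thread, so the code assumes the file holds either a colon-separated MAC address or something else such as N/A; the function name is hypothetical.

```python
import re
from pathlib import Path

# Assumption: the MAC is written colon-separated somewhere in the file.
MAC_RE = re.compile(r"(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}")


def read_map_macaddr(path="/tmp/map_macaddr.txt"):
    """Return the MAC found in the result file, or None if there is none."""
    try:
        text = Path(path).read_text()
    except OSError:
        # Missing or unreadable file: treat it like an N/A result.
        return None
    match = MAC_RE.search(text)
    return match.group(0).lower() if match else None
```

A None result here corresponds to the N/A reply seen in the UCC logs.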

If the result is N/A, it could mean that either:

rmelotte commented 4 years ago

Fixed with commit 03bcb2050ecda14f850f167cc2b909e7bdabcf74 in the easymesh_cert repo: https://git.prpl.dev/prplmesh/wfa-certification/easymesh_cert/-/commit/03bcb2050ecda14f850f167cc2b909e7bdabcf74

The fix for the mediatek device is in commit 113c03ec23f4137c026dc3fbd838abbe1b07cb25: https://git.prpl.dev/prplmesh/wfa-certification/mediatek_ucc_agent/-/commit/113c03ec23f4137c026dc3fbd838abbe1b07cb25

rmelotte commented 4 years ago

Just for the record, here are 2 tests passing successively, the first one being run right after a test where the issue occurred (without any reboot):