[BUG] certification test MAP-4.4.1_ETH fails on master

prplfoundation / prplMesh

This repository moved to https://gitlab.com/prpl-foundation/prplmesh/prplMesh

[BUG] certification test MAP-4.4.1_ETH fails on master #850

Closed by rmelotte 4 years ago

rmelotte commented 4 years ago

Here are the last UCC lines:

2020-02-19 02:15:55.898 - INFO - SNIFFER (192.168.250.5:9999) ---> sniffer_control_filter_capture,InFile,_MAP-4.4.1-BHETH-STEP5-8,OutFile,MAP-4.4.1-BHETH-STEP5,ethmac,10:0c:6b:a1:bd:6d,filterfile,mapfile,msgType,0x0007,nframes,last
2020-02-19 02:15:56.601 - INFO - MSG: No AP-Autoconfiguration Search message found

Logs available here: https://gitlab.com/prpl-foundation/prplMesh/-/jobs/442218413/artifacts/browse/logs/FAIL/MAP-4.4.1_ETH/

It's most probably another timing issue, because:

- the UCC does not wait after requesting the sniffer to start (and we know the sniffer reports COMPLETE before it actually starts sniffing)
- it looks like the captured data begins after the autoconfig search message was sent by prplMesh
- from prplMesh's logs, both the autoconfig search and response were sent and received
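
The suspected race can be illustrated with a small Python sketch. The `Timeline` class, its field names, and all timestamps are hypothetical and chosen only for illustration; they are not taken from the logs.

```python
from dataclasses import dataclass


@dataclass
class Timeline:
    """Hypothetical timestamps, in seconds, for one test step."""
    complete_reported_at: float  # sniffer replies COMPLETE to the UCC
    capture_started_at: float    # capture tool actually begins recording
    message_sent_at: float       # device under test sends the message


def message_captured(t: Timeline) -> bool:
    # A frame is only recorded if it is sent once capturing has begun.
    return t.message_sent_at >= t.capture_started_at


def blind_window(t: Timeline) -> float:
    # Time during which the UCC believes the sniffer is ready but it is not.
    return max(0.0, t.capture_started_at - t.complete_reported_at)


# Illustrative values only: COMPLETE at t=0.0, capture really starts at t=1.5.
no_wait = Timeline(0.0, 1.5, 0.7)    # UCC proceeds right away
with_wait = Timeline(0.0, 1.5, 2.7)  # UCC sleeps 2 s before proceeding

print(message_captured(no_wait))     # False
print(message_captured(with_wait))   # True
```

Any message sent inside the blind window is missing from the capture, even though the device sent it and the peer received it.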

arnout commented 4 years ago

The sniffer is a VM or docker container running on the UCC laptop, right?

Perhaps we should just give that VM/container a higher priority (i.e. a lower nice value), so it has a better chance of actually starting before the UCC continues with the next command. Or is it not starting immediately because it blocks on something?

arnout commented 4 years ago

The other option is to insert small sleeps in the UCC command handling within prplMesh. But that feels a bit... awkward.

tomereli commented 4 years ago

We had a similar issue with MAP-5.4.2. I found the mail thread (search for RE: 回覆： No AP-Autoconfiguration Renew message issue with test MAP-5.4.2_ETH_FH24G) and replied mentioning this issue, asking for confirmation that adding a sleep to the UCC scripts is indeed acceptable as the default solution for such cases.

rmelotte commented 4 years ago

The sniffer is running directly on the laptop. tshark can take a second or two before it actually starts sniffing, so I'm not sure niceness or priorities would help in that case. In most cases the UCC waits just after the sniffer replies COMPLETE, but in this case it does not.

I wonder why we couldn't just change the sniffer scripts to wait 2 seconds themselves before replying COMPLETE. That would remove the need to change any UCC script (in some cases there would be two waits, which shouldn't be a problem).
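
A minimal sketch of that idea in Python, assuming the sniffer script has one hook to launch the capture and another to send its reply. The function name, both callbacks, and the bare `"COMPLETE"` reply string are hypothetical placeholders, not the real sniffer script's interface.

```python
import time
from typing import Callable


def start_capture_and_reply(
    start_capture: Callable[[], None],
    send_reply: Callable[[str], None],
    settle_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Launch the capture, give it time to settle, then report readiness."""
    # Hypothetical hook: spawns the capture tool (e.g. tshark) in the background.
    start_capture()
    # Fixed delay covering the capture tool's start-up time (assumed ~2 s).
    sleep(settle_seconds)
    # Only now tell the UCC that the sniffer is ready.
    send_reply("COMPLETE")
```

The `sleep` parameter is injectable so the ordering can be checked without actually waiting. A fixed delay is the simplest option; it does not verify that the capture has really started.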

rmelotte commented 4 years ago

Since we have less control over the sniffer than we would need, this was fixed by adding another sleep for this specific case (commit 3fe134e75784bb1283897bfd6fe9317b94e7a05b in easymesh_cert).