Closed mike-scott closed 5 years ago
I have a feeling more of the socket tests have this same issue.
Please note that all the programs under tests/net are supposed to be self-contained; they are not supposed to connect to the outside world. The programs in samples/net/ are different in this respect, as they need outside connectivity to function properly.
That said, I tried to run this test for native_posix and it fails with this message
ASSERTION FAIL [net_if_get_link_addr(iface)->addr != ((void *)0)] @ zephyr/subsys/net/ip/net_if.c:2870
I will investigate what is going on here, as the test is supposed to run on native_posix according to its testcase.yaml file. I will also look at what the test is actually doing or trying to do.
@jukkar can you try and run this test for qemu_x86? Without any setup of local dnsmasq it will PASS. Which is not correct. Take a look here at the assert checks: https://github.com/zephyrproject-rtos/zephyr/blob/master/tests/net/socket/getaddrinfo/src/main.c#L23
BTW, if you look at the asserts, there is clearly a bug there. Removing the bug tag doesn't change this. Are we to ship LTS with incorrect tests?
Ugh. Re-reading this. I guess this test is written as a failure? Why does this even exist? Samples would do compile testing for sanity.
This test is a waste of CPU cycles.
I'll close, but this is silly.
To clarify why this test is silly: the whole reason I started investigating this is that an actual DNS query sent via the socket-based APIs, which the server never answers, will hang the Zephyr device. We have an edge gateway setup that loads containers which handle NAT64, DNS64, and joining BLE 6lowpan devices as well as OpenThread. This issue was identified when the DNS64 container spun up a bit later than the BLE joiner. The perfect example of this would be when the gateway software is updated and requires a restart while the IoT nodes are up and running.
The semaphore taken with K_FOREVER here, and another a bit lower, are the cause of the hang: https://github.com/zephyrproject-rtos/zephyr/blob/master/subsys/net/lib/sockets/getaddrinfo.c#L119
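To illustrate the difference between waiting forever and waiting with a deadline, here is a minimal sketch using plain POSIX semaphores rather than the Zephyr kernel API. The helper name `wait_for_reply` and the millisecond timeout parameter are made up for illustration; this is not the code in getaddrinfo.c, and it is not necessarily how the hang should be fixed there. In Zephyr terms, the analogous change would be passing a finite timeout to `k_sem_take()` instead of `K_FOREVER`.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <time.h>

/* Hypothetical helper: wait for a "reply arrived" semaphore, but give up
 * after timeout_ms milliseconds instead of blocking forever.
 * Returns 0 if the semaphore was taken, -1 on timeout or any other error.
 */
static int wait_for_reply(sem_t *reply, long timeout_ms)
{
    struct timespec deadline;
    int rc;

    /* sem_timedwait() takes an absolute CLOCK_REALTIME deadline. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0) {
        return -1;
    }

    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    /* Retry if the wait was interrupted by a signal. */
    do {
        rc = sem_timedwait(reply, &deadline);
    } while (rc != 0 && errno == EINTR);

    return rc == 0 ? 0 : -1;
}
```

With a helper like this, a caller whose query is never answered gets -1 back after the timeout and can report an error to the application, instead of blocking the calling thread indefinitely.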
This test has no real network setup, so the DNS subsys lib returns EAI_CANCELLED when the query cannot be sent. This test checks exactly that: that a query cannot be sent and nothing else.
Let's not close this, as there is an issue. As I said, I will investigate how to fix it.
There are too many assignees, so I am setting only one. @mike-scott, please set only one assignee at a time, as otherwise it is not clear who will start to fix the issue.
@jukkar Apologies. I'm too used to PRs, where that spot is for reviewers. Thank you for taking a look here and at the new bug I opened.
This test has no real network setup, so the DNS subsys lib returns EAI_CANCELLED when the query cannot be sent. This test checks exactly that: that a query cannot be sent and nothing else.
Well, it tests at least something, specifically what it could (easily) test. And for what it tested, the logic is correct, so updating the ticket title.
Describe the bug
Assert logic in https://github.com/zephyrproject-rtos/zephyr/blob/master/tests/net/socket/getaddrinfo/src/main.c#L23 is backwards. You can run this test without access to a local dnsmasq server and it passes.
To Reproduce
Expected behavior
Test should fail if no local dnsmasq server is running.
Impact
Test is not working.