Closed thetek42 closed 4 months ago
Do any of your problems happen, if you replace "example.com" with an IP address?
Yes. It does not matter if I use example.com or the corresponding IP address.
You are sure 100% about that (as in you have used `IpAddr` and not a string that might still trigger DNS resolution somehow)?
Your example is way too big and I have difficulties understanding what exactly you are trying to achieve...
This seems very odd though:
    let netif_ap = EspNetif::new_with_conf(&NetifConfiguration {
        ip_configuration: Configuration::Client(ClientConfiguration::DHCP(DHCPClientSettings {
            hostname: None,
        })),
        ..NetifConfiguration::wifi_default_router()
    })?;
How is this supposed to even work?
> How is this supposed to even work?
That must have snuck in there somehow. I had it set to the correct configuration previously, but it seems like I somehow managed to clone the config into the wrong place 🤷. It does not affect the result though.
Even `Fixed` is wrong. You can't set a client config on an access point and expect not to have issues w.r.t. routing on IP level. Why are you configuring the access point in the first place?
Can you: a) try to simplify this example to the bare minimum (remove everything not necessary, including DNS resolution and whatnot), and b) report only one problem at first, with an example for that single problem?
> You are sure 100% about that (as in you have used `IpAddr` and not a string that might still trigger DNS resolution somehow)?
I specified it as `(Ipv4Addr::new(93, 184, 215, 14), 80)`.
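For readers following the DNS question above, here is a minimal host-side sketch, using only the Rust standard library, of why an `(Ipv4Addr, u16)` pair does not involve a hostname lookup: it converts directly into a `SocketAddr`. The helper names `literal_addr` and `is_ip_literal` are hypothetical and are not part of the code in the original post.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

/// Builds a socket address from literal IPv4 octets and a port.
/// No hostname is involved, so there is nothing for a resolver to look up.
fn literal_addr(octets: [u8; 4], port: u16) -> SocketAddr {
    SocketAddr::from((Ipv4Addr::from(octets), port))
}

/// Returns true if `target` parses as a literal IP address.
/// A string such as "example.com" fails to parse and would need DNS instead.
fn is_ip_literal(target: &str) -> bool {
    target.parse::<IpAddr>().is_ok()
}

fn main() {
    // The address quoted in the thread.
    let addr = literal_addr([93, 184, 215, 14], 80);
    println!("socket address: {addr}");
    println!("\"93.184.215.14\" is a literal: {}", is_ip_literal("93.184.215.14"));
    println!("\"example.com\" is a literal: {}", is_ip_literal("example.com"));
}
```

Passing such a `SocketAddr` (or the tuple itself) to `TcpStream::connect` keeps name resolution out of the picture, which is what the question was trying to rule out.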
I updated the code in the original post.
Also, this issue has nothing to do with the Access Point. EspWifi::wrap_all simply requires a netif for the AP to be present.
Maybe. But it has a default configuration and you are changing it. Maybe it does not matter, as the Wifi is not in Mixed mode, but maybe it does.
My point is the following: if you leave the example as-is, and with all the cases you have enumerated, it might take me days, if not weeks to get there.
And if/when I get there, I'll start anyway by eliminating every single line of code in the example which should not be there, since each one is an additional unknown variable that only confuses and obfuscates the issue. Like the DNS issue. Or configuring the AP. Etc. And then I'd try to tackle just one of the cases you have enumerated, one which should work but doesn't.
... or you could try to do it, and then I can help with thinking / brainstorming. I think it might be faster that way. :)
> I updated the code in the original post.
In your "simplified example"... why are you setting the eth and wifi-sta to the same IP, namely "192.168.178.234"?
Also, for the `netif_ap` (I know it should not affect anything, but still): can you please use the default network configuration, which would then be different from the 192.168.178.1/24 that you are setting everywhere?
@thetek42 What happens once you assign different static IPs to Eth and Wifi?
Sorry for the delay, I was busy the last couple of days.
Using the default values for the AP netif does not change anything.
Using a different IP address for Wi-Fi and Ethernet does in fact change the behaviour, except that it makes it worse. When both have a different static IP address configured, no matter in which order Wi-Fi and Ethernet are initialized, the connection to the server is not possible.
Most of the time, the device does not even show up in my router's user interface if that happens. However, I was able to observe it show up in the list of connections once, and then never again. In that incident, it showed the "device name" as "PC-192-168-178-235", even though the IP address was listed as 192.168.178.234 -- while the static IP address configuration for Ethernet was set to 192.168.178.235 and the IP for Wi-Fi was set to 192.168.178.234. I am unsure if this was just a bug by the router or if this is the actual behaviour. The fact that the device does not show up in the list makes sense considering that running it through Wireshark yielded no transmitted packets, but the fact that it turned up in the list exactly once does not make sense to me.
The more I investigate this issue, the more confused I am.
Again, I will update the code in the original post in order to use default values for the AP config and two different static IP addresses.
One little detail that is interesting: what list do you mean? The router does not keep any "list" of devices with static IP addresses (let alone knowing their host names) as it is just not aware of these devices.
More advanced routers might produce a "list" based on tx/rx packet statistics and potentially, based on nat statistics.
Btw: have you tried pinging the eth and wifi interfaces from another pc?
Another experiment would be to assign a static ip to the wifi interface which is from a completely different subnet (even if the wifi won't be reachable this way).
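To make the "same subnet" versus "different subnet" distinction concrete, here is a small host-side sketch using only the Rust standard library. It checks whether two IPv4 addresses fall into the same network for a given prefix length. The function names and the 10.0.0.234 address are made up for illustration; the 192.168.178.x addresses and the /24 prefix are the ones mentioned in this thread.

```rust
use std::net::Ipv4Addr;

/// Returns the network address of `ip` for a prefix length in 0..=32.
fn network(ip: Ipv4Addr, prefix_len: u8) -> Ipv4Addr {
    assert!(prefix_len <= 32, "IPv4 prefix length must be at most 32");
    // A shift by 32 would overflow, so the /0 case is handled separately.
    let mask = if prefix_len == 0 {
        0
    } else {
        u32::MAX << (32 - u32::from(prefix_len))
    };
    Ipv4Addr::from(u32::from(ip) & mask)
}

/// Returns true if both addresses belong to the same network.
fn same_subnet(a: Ipv4Addr, b: Ipv4Addr, prefix_len: u8) -> bool {
    network(a, prefix_len) == network(b, prefix_len)
}

fn main() {
    let eth = Ipv4Addr::new(192, 168, 178, 235);
    let wifi = Ipv4Addr::new(192, 168, 178, 234);
    // Hypothetical address from an unrelated subnet, for the experiment above.
    let wifi_other = Ipv4Addr::new(10, 0, 0, 234);

    println!("eth and wifi share a /24: {}", same_subnet(eth, wifi, 24));
    println!("eth and wifi_other share a /24: {}", same_subnet(eth, wifi_other, 24));
}
```

When both interfaces sit in the same subnet, the IP stack has two candidate interfaces for the same destination network, which is why moving one of them to an unrelated subnet is a useful isolation step.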
I was talking about the list that the router provides in the user interface of its website.
A different IP in a different subnet for the Wi-Fi interface does not have any effect.
Pinging the Ethernet interface does work.
Also, when setting the Ethernet IP address to .235 and the Wi-Fi IP address to .234, I can ping both, even though Wi-Fi is not connected.
> I was talking about the list that the router provides in the user interface of its website.
See above. :) I would not trust this list for static IPs at all. :)
> Also, when setting the Ethernet IP address to .235 and the Wi-Fi IP address to .234, I can ping both, even though Wi-Fi is not connected.
OK, hold on. So just to summarize. What you are saying:
Are you sure the Wifi is not connected? Otherwise, how would pinging that IP work?! Are you sure you don't have an IP conflict in your net, and that you are not pinging something else, which is not the Esp?
> I would not trust this list for static IPs at all. :)
I don't! :) I was just looking to see if the router recognized it at all. The weird results are to be expected.
I am 100% sure the Wi-Fi is not connected. The SSID and passphrase are both set to an empty string, and `.connect()` isn't even called. There is no NVS in which Wi-Fi credentials might be stored. The .234 and .235 IP addresses are not taken up by any other device.
Interestingly enough, this double IP phenomenon thingy only happens sometimes.
1) Ok let's pretend that the second (wifi) IP was never pingable (or else I should start believing in miracles, OR there is a nasty bug/behavior in esp-idf somewhere, where the wifi netif is operational even though its phy layer is not - as in it gets ethernet packets from the eth phy layer which sounds very unlikely either).
2) let's also forget about the router list, shall we? If you have used that instead of pinging the static ips from the outside, this had been wrong all along.
Let's concentrate on the one remaining issue: from inside the esp, can you ping the gateway on the eth interface? And then - as a second step only - does opening the socket work?
Obviously, I used the `ping` command for pinging the esp32.
It seems that it only happens for approximately 15-20 secs after flashing a new firmware on the device with a different static IP than the previous firmware. After that timespan, it does not occur again. The only way I can explain that is that there is some nasty caching going on somewhere, not necessarily on esp-idf's side.
Pinging the gateway from the esp32 provides the exact same behaviour as opening a socket to somewhere. For this, I used `EspPing`. However, I was a bit unsure as to what to set the interface to, so I left it at 0. From what I gathered, you are supposed to put the netif index in there, but obtaining the netif index via `.get_index()` and plugging that into `EspPing` resulted in no ping succeeding, even when only Ethernet was enabled in the code. Is there something I missed or misunderstood?
> Pinging the gateway from the esp32 provides the exact same behaviour as opening a socket to somewhere. For this, I used `EspPing`. However, I was a bit unsure as to what to set the interface to, so I left it at 0. From what I gathered, you are supposed to put the netif index in there, but obtaining the netif index via `.get_index()` and plugging that into `EspPing` resulted in no ping succeeding, even when only Ethernet was enabled in the code. Is there something I missed or misunderstood?
By the way, what are you trying to ping, and did you check that it is pingable in the first place?
I am pinging the gateway (192.168.178.1).
Ok, assuming it is pingable from inside (99.9% of routers are, but you never know and you should check it from a PC), can you try with just the ethernet connection, without even creating the wifi driver? Until we get a reliable ping, we cannot progress any further. Try with `get_index() + 1`, or with index = 1 and/or = 2 until it works.
And just to confirm again: none of these issues happen if you don't create the wifi driver in the first place? But once you create it, even if you don't start it, issues start on the eth interface?
- [...] `EspPing::default()`: `EspPing` seems unaffected by this and continues working even with two different static IPs.
- [...] (`get_index() + 1`) works, even when creating the Wi-Fi interface after the Ethernet interface. I guess that is to be expected since we are directly using the Ethernet interface. I guess the `get_index() + 1` behaviour could be fixed or at least documented somewhere, so that people won't stumble over it in the future.
- [...] `wifi.start()`. Just creating but not starting is fine.

@thetek42 Weird. Did you just delete your last comment from 5 minutes ago?
Yes, I still had to fix one small part of the code.
I investigated the issue a bit further by trying it in C. It works perfectly there.
OK, that's great news. The question is now, what is different in the Rust code, if you compare it with C...?
I don't see any differences. It starts Ethernet, then it starts Wi-Fi (without credentials and without waiting for it to connect). Both Ethernet and Wi-Fi have a static IP assigned to them (the same static IP, in fact). Then, I try to resolve a hostname via DNS and try to send an HTTP request to it. Both of those work.
No, I mean the Rust code in `esp-idf-svc` which is driving the netif stack, the wifi and the ethernet drivers.
There must be a difference. And sorry for persisting that you look at it instead of me doing a deep dive. :(
I simply don't have the physical time to do it these days, as I'm chasing other problems.
Ok. I figured out what the issue is. If you initialize the netif like it is currently done in esp-idf-svc (with different flags and `ip_info` set), it does not work. However, if you do it just like the official C example here (that is, initializing the netif with DHCP enabled, and then disabling it and setting the static IP config afterwards), it works perfectly fine. Below is a `git diff` of what I had to change to get it to work. I guess that the AP/dhcps code in there is not entirely correct, but I didn't bother with that since I was just trying to get it to work in the first place. I also don't know how much of this exactly is necessary.
diff --git a/src/netif.rs b/src/netif.rs
index 0bc1b784d..72a0daf1e 100644
--- a/src/netif.rs
+++ b/src/netif.rs
@@ -235,38 +235,13 @@ impl EspNetif {
{
ipv4::Configuration::Client(ref ip_conf) => (
esp_netif_inherent_config_t {
- flags: match ip_conf {
- ipv4::ClientConfiguration::DHCP(_) => {
- esp_netif_flags_ESP_NETIF_DHCP_CLIENT
- | esp_netif_flags_ESP_NETIF_FLAG_GARP
- | esp_netif_flags_ESP_NETIF_FLAG_EVENT_IP_MODIFIED
- }
- ipv4::ClientConfiguration::Fixed(_) => {
- esp_netif_flags_ESP_NETIF_FLAG_AUTOUP
- }
- },
+ flags: esp_netif_flags_ESP_NETIF_DHCP_CLIENT
+ | esp_netif_flags_ESP_NETIF_FLAG_GARP
+ | esp_netif_flags_ESP_NETIF_FLAG_EVENT_IP_MODIFIED,
mac: initial_mac,
ip_info: ptr::null(),
- get_ip_event: match ip_conf {
- ipv4::ClientConfiguration::DHCP(_) => {
- if conf.stack == NetifStack::Sta {
- ip_event_t_IP_EVENT_STA_GOT_IP
- } else {
- 0
- }
- }
- ipv4::ClientConfiguration::Fixed(_) => 0,
- },
- lost_ip_event: match ip_conf {
- ipv4::ClientConfiguration::DHCP(_) => {
- if conf.stack == NetifStack::Sta {
- ip_event_t_IP_EVENT_STA_LOST_IP
- } else {
- 0
- }
- }
- ipv4::ClientConfiguration::Fixed(_) => 0,
- },
+ get_ip_event: ip_event_t_IP_EVENT_STA_GOT_IP,
+ lost_ip_event: ip_event_t_IP_EVENT_STA_LOST_IP,
if_key: c_if_key.as_c_str().as_ptr() as _,
if_desc: c_if_description.as_c_str().as_ptr() as _,
route_prio: conf.route_priority as _,
@@ -297,11 +272,8 @@ impl EspNetif {
),
ipv4::Configuration::Router(ref ip_conf) => (
esp_netif_inherent_config_t {
- flags: (if ip_conf.dhcp_enabled {
- esp_netif_flags_ESP_NETIF_DHCP_SERVER
- } else {
- 0
- }) | esp_netif_flags_ESP_NETIF_FLAG_AUTOUP,
+ flags: esp_netif_flags_ESP_NETIF_DHCP_SERVER
+ | esp_netif_flags_ESP_NETIF_FLAG_AUTOUP,
mac: initial_mac,
ip_info: ptr::null(),
get_ip_event: 0,
@@ -324,10 +296,6 @@ impl EspNetif {
),
};
- if let Some(ip_info) = ip_info.as_ref() {
- esp_inherent_config.ip_info = ip_info;
- }
-
let cfg = esp_netif_config_t {
base: &esp_inherent_config,
driver: ptr::null(),
@@ -339,6 +307,12 @@ impl EspNetif {
.ok_or(EspError::from_infallible::<ESP_ERR_INVALID_ARG>())?,
);
+ if let Some(ip_info) = ip_info.as_ref() {
+ esp!(unsafe { esp_netif_dhcpc_stop(handle.0) })?;
+ esp!(unsafe { esp_netif_dhcps_stop(handle.0) })?;
+ esp!(unsafe { esp_netif_set_ip_info(handle.0, &*ip_info) })?;
+ }
+
if let Some(dns) = dns {
handle.set_dns(dns);
This is great news!... ... but a bit weird that we have to do it in such a strange way. What if there is simply no DHCP server on the network and we are just sitting there waiting for a DHCP address? Also the change where we run the AP DHCP server and then stop it is especially annoying. For a short while we would be running a DHCP server, and that might be unexpected for the user. Also, that last trick is not part of the C example anyway (the example is for STA only of course, but still)
Would you do another variation of your changes:
- [...] `get_ip_event` and `lost_ip_event` to always be equal to `ip_event_t_IP_EVENT_STA_GOT_IP` and `ip_event_t_IP_EVENT_STA_LOST_IP` respectively? Not sure if that matters, but why not?
- [...] `esp_netif_flags_ESP_NETIF_FLAG_GARP` and `esp_netif_flags_ESP_NETIF_FLAG_EVENT_IP_MODIFIED`. Not sure about `esp_netif_flags_ESP_NETIF_FLAG_EVENT_IP_MODIFIED`, but my gut feeling is that `esp_netif_flags_ESP_NETIF_FLAG_GARP` might be important, and the fact that we are not setting it for the fixed configuration might actually be a bug. So how about, for the STA fixed case, we set the `flags` to `esp_netif_flags_ESP_NETIF_FLAG_AUTOUP | esp_netif_flags_ESP_NETIF_FLAG_GARP | esp_netif_flags_ESP_NETIF_FLAG_EVENT_IP_MODIFIED`
- [...]

      esp!(unsafe { esp_netif_dhcpc_stop(handle.0) })?;
      esp!(unsafe { esp_netif_dhcps_stop(handle.0) })?;

  ... as we would not be doing the DHCP trick this way.
So, I did some more digging around. Apparently the only thing you actually need to do in order to make it work is to remove the `esp_netif_flags_ESP_NETIF_FLAG_AUTOUP`! Simply replacing it with 0 makes it behave just as expected. (As already mentioned, the diff above was simply me messing around in an attempt to get it to work.)
> So, I did some more digging around. Apparently the only thing you actually need to do in order to make it work is to remove the `esp_netif_flags_ESP_NETIF_FLAG_AUTOUP`! Simply replacing it with 0 makes it behave just as expected. (As already mentioned, the diff above was simply me messing around in an attempt to get it to work.)
You mean from the client (STA) configuration only? I guess it should stay in the server (AP) configuration?
Can you try with latest `master`? I just removed the `AUTOUP` flag from the client conf, and also set the `GARP` and `IP_MODIFIED` flags even for fixed client configuration.
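For readers following along, here is a rough host-side model of the flag selection described above. It is a sketch under stated assumptions, not the actual esp-idf-svc code: the constants are stand-ins assumed to mirror the `esp_netif_flags` bit layout in ESP-IDF (the real values come from the generated `esp_idf_svc::sys` bindings), and `ClientMode` and `client_flags` are hypothetical names.

```rust
// Stand-in values, assumed to mirror ESP-IDF's `esp_netif_flags` bits.
// On a real target, use the generated constants from `esp_idf_svc::sys`.
const DHCP_CLIENT: u32 = 1 << 0;
const FLAG_AUTOUP: u32 = 1 << 2;
const FLAG_GARP: u32 = 1 << 3;
const FLAG_EVENT_IP_MODIFIED: u32 = 1 << 4;

/// Hypothetical stand-in for the two client (STA/Ethernet) IP modes.
enum ClientMode {
    Dhcp,
    Fixed,
}

/// Models the netif flags for a client configuration after the change:
/// GARP and EVENT_IP_MODIFIED are set in both modes, AUTOUP in neither.
fn client_flags(mode: &ClientMode) -> u32 {
    let common = FLAG_GARP | FLAG_EVENT_IP_MODIFIED;
    match mode {
        ClientMode::Dhcp => DHCP_CLIENT | common,
        ClientMode::Fixed => common,
    }
}

fn main() {
    let fixed = client_flags(&ClientMode::Fixed);
    let dhcp = client_flags(&ClientMode::Dhcp);
    println!("fixed flags: {fixed:#07b}, AUTOUP set: {}", (fixed & FLAG_AUTOUP) != 0);
    println!("dhcp flags:  {dhcp:#07b}, AUTOUP set: {}", (dhcp & FLAG_AUTOUP) != 0);
}
```

The point of the model is only that the fixed-IP client case no longer carries `AUTOUP`, which is the flag identified earlier in the thread as the cause of the failure.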
> You mean from the client (STA) configuration only? I guess it should stay in the server (AP) configuration?
I only tried STA. I don't know about AP, but I'll check that as well when I have the time to do so. But I think that it should be fine with AP since I never had any issues with it.
> Can you try with latest master? I just removed the AUTOUP flag from the client conf, and also set the GARP and IP_MODIFIED flags even for fixed client configuration.
Seems to be working now. Thanks!!
Thank you for persisting through this journey and figuring out the root cause!
In our application, we are starting both Ethernet and Wi-Fi. When a static IP is configured for both interfaces and an Ethernet cable is connected to the ESP32, connections to a server cannot be made under certain circumstances.
Notes:
- [...] `.connect()` is not called.

These are the cases that I observed:
- [...] `.wait_netif_up()`)

The behaviour for "connect to Wi-Fi but not to Ethernet" (by not having an Ethernet cable connected) has the exact reverse effect: starting Wi-Fi and then Ethernet causes the connection to fail, but the reverse causes the connection to succeed.
Thus, the interface that was started last "determines" which interface can work with static IP. Again, for DHCP, all of this does not matter since it works all the time, no matter in which order the interfaces were started.
Below is a piece of sample code that can be used to (hopefully) reproduce the issue. Feel free to play around a bit by moving stuff around (e.g. putting Wi-Fi after Ethernet or putting the `eth.wait_*()` block after the Wi-Fi code).

Example code
```rust
#![allow(unused_imports)]

use std::io::{Read, Write};
use std::net::{Ipv4Addr, TcpStream};

use esp_idf_svc::eth::{EspEth, EthDriver, RmiiEthChipset, RmiiClockConfig, BlockingEth, AsyncEth};
use esp_idf_svc::eventloop::EspSystemEventLoop;
use esp_idf_svc::hal::gpio::{self, PinDriver};
use esp_idf_svc::hal::prelude::Peripherals;
use esp_idf_svc::ipv4::{ClientConfiguration, Configuration, ClientSettings, Subnet, Mask, DHCPClientSettings, RouterConfiguration};
use esp_idf_svc::log::EspLogger;
use esp_idf_svc::netif::{NetifConfiguration, EspNetif};
use esp_idf_svc::timer::EspTaskTimerService;
use esp_idf_svc::wifi::{self, WifiDriver, EspWifi, BlockingWifi, AuthMethod, AsyncWifi};

fn main() -> anyhow::Result<()> {
    esp_idf_svc::sys::link_patches();
    EspLogger::initialize_default();

    let peripherals = Peripherals::take()?;
    let pins = peripherals.pins;
    let sys_loop = EspSystemEventLoop::take()?;
    let timer_service_wifi = EspTaskTimerService::new()?;
    let timer_service_eth = EspTaskTimerService::new()?;

    let mut eth_pwr = PinDriver::output(pins.gpio5)?;
    let mut clk_en = PinDriver::output(pins.gpio4)?;
    eth_pwr.set_low()?;
    clk_en.set_low()?;
    std::thread::sleep(std::time::Duration::from_millis(100));
    eth_pwr.set_high()?;
    std::thread::sleep(std::time::Duration::from_millis(10));
    clk_en.set_high()?;
    std::thread::sleep(std::time::Duration::from_millis(10));

    log::info!("--- EthDriver");
    let eth_driver = EthDriver::new(
        peripherals.mac,
        pins.gpio25,
        pins.gpio26,
        pins.gpio27,
        pins.gpio23,
        pins.gpio22,
        pins.gpio21,
        pins.gpio19,
        pins.gpio18,
        RmiiClockConfig::