ivmarkov / edge-net

async + no_std + no-alloc implementations of various network protocols
Apache License 2.0

Error running `mdns_responder` example on ESP32C3 #24

Closed Luni-4 closed 3 weeks ago

Luni-4 commented 1 month ago

My issue could be caused by my lack of experience and knowledge, so sorry in advance if that's the case.

I can build the mdns-responder example for the esp32c3, but when I flash and run it, it crashes and reboots in a loop, repeatedly printing the following error:

ESP-ROM:esp32c3-api1-20210207
Build:Feb  7 2021
rst:0xc (RTC_SW_CPU_RST),boot:0xd (SPI_FAST_FLASH_BOOT)
Saved PC:0x4038152a
0x4038152a - esp_restart_noos
    at /.espressif/esp-idf/v5.2.1/components/esp_system/port/soc/esp32c3/system_internal.c:111
SPIWP:0xee
mode:DIO, clock div:2
load:0x3fcd5820,len:0x1714
load:0x403cc710,len:0x968
load:0x403ce710,len:0x2f9c
entry 0x403cc710
I (19) boot: ESP-IDF v5.1.2-342-gbcf1645e44 2nd stage bootloader
I (20) boot: compile time Dec 12 2023 10:50:58
I (20) boot: chip revision: v0.4
I (24) boot.esp32c3: SPI Speed      : 40MHz
I (29) boot.esp32c3: SPI Mode       : DIO
I (34) boot.esp32c3: SPI Flash Size : 4MB
I (38) boot: Enabling RNG early entropy source...
I (44) boot: Partition Table:
I (47) boot: ## Label            Usage          Type ST Offset   Length
I (55) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (62) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (69) boot:  2 factory          factory app      00 00 00010000 003f0000
I (77) boot: End of partition table
I (81) esp_image: segment 0: paddr=00010020 vaddr=3c050020 size=136e8h ( 79592) map
I (107) esp_image: segment 1: paddr=00023710 vaddr=3fc8ae00 size=012d0h (  4816) load
I (109) esp_image: segment 2: paddr=000249e8 vaddr=40380000 size=0acf0h ( 44272) load
I (124) esp_image: segment 3: paddr=0002f6e0 vaddr=00000000 size=00938h (  2360)
I (125) esp_image: segment 4: paddr=00030020 vaddr=42000020 size=4a8e4h (305380) map
I (201) boot: Loaded app from partition at offset 0x10000
I (201) boot: Disabling RNG early entropy source...
I (212) cpu_start: Unicore app
I (221) cpu_start: Pro cpu start user code
I (221) cpu_start: cpu freq: 160000000 Hz
I (221) cpu_start: Application information:
I (224) cpu_start: Project name:     libespidf
I (229) cpu_start: App version:      1
I (234) cpu_start: Compile time:     Sep 10 2024 15:56:34
I (240) cpu_start: ELF file SHA256:  000000000...
I (245) cpu_start: ESP-IDF:          v5.2.1
I (250) cpu_start: Min chip rev:     v0.3
I (255) cpu_start: Max chip rev:     v1.99
I (260) cpu_start: Chip rev:         v0.4
I (264) heap_init: Initializing. RAM available for dynamic allocation:
I (272) heap_init: At 3FC8DB50 len 000324B0 (201 KiB): RAM
I (278) heap_init: At 3FCC0000 len 0001C710 (113 KiB): Retention RAM
I (285) heap_init: At 3FCDC710 len 00002950 (10 KiB): Retention RAM
I (292) heap_init: At 50000010 len 00001FD8 (7 KiB): RTCRAM
I (299) spi_flash: detected chip: generic
I (303) spi_flash: flash io: dio
W (307) timer_group: legacy driver is deprecated, please migrate to `driver/gptimer.h`
I (315) sleep: Configure to isolate all GPIO pins in sleep state
I (322) sleep: Enable automatic switching of GPIO sleep configuration
I (329) main_task: Started on CPU0
I (329) main_task: Calling app_main()
Guru Meditation Error: Core  0 panic'ed (Stack protection fault).

Detected in task "main" at 0x4200127a
0x4200127a - foo_esp::main
    at /foo-esp/src/main.rs:25
Stack pointer: 0x3fc8f9d0
Stack bounds: 0x3fc8fb30 - 0x3fc90b20

Core  0 register dump:
MEPC    : 0x4200127e  RA      : 0x4200176a  SP      : 0x3fc8f9d0  GP      : 0x3fc8b600
0x4200127e - foo_esp::main
    at /foo-esp/src/main.rs:26
0x4200176a - std::rt::lang_start::{{closure}}
    at /.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:164
0x3fc8b600 - __func__.0
    at ??:??
TP      : 0x3fc7e0a4  T0      : 0x4005890e  T1      : 0x4201b608  T2      : 0xffffffff
0x4201b608 - <std::sys::sync::mutex::pthread::AllocatedMutex as std::sys_common::lazy_box::LazyInit>::init
    at /.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/sync/mutex/pthread.rs:51
S0/FP   : 0x3c0505c0  S1      : 0x3fc90ad8  A0      : 0x4200126e  A1      : 0x42001762
0x3c0505c0 - $d
    at ??:??
0x4200126e - foo_esp::main
    at /foo-esp/src/main.rs:25
0x42001762 - std::rt::lang_start::{{closure}}
    at /.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:164
A2      : 0x3fc9166c  A3      : 0x00000003  A4      : 0x3fc91628  A5      : 0x00000000
A6      : 0xa0000000  A7      : 0x0000000a  S2      : 0x00000000  S3      : 0x00000000
S4      : 0x00000000  S5      : 0x00000000  S6      : 0x00000000  S7      : 0x00000000
S8      : 0x00000000  S9      : 0x00000000  S10     : 0x00000000  S11     : 0x00000000
T3      : 0x00000000  T4      : 0x00000000  T5      : 0x00000000  T6      : 0x00000000
MSTATUS : 0x00001881  MTVEC   : 0x40380001  MCAUSE  : 0x0000001b  MTVAL   : 0x00012097
0x40380001 - _vector_table
    at ??:??
MHARTID : 0x00000000

Stack memory:
3fc8f9d0: 0x5f707365 0x656d6974 0x00000072 0x00000000 0x3fc8f990 0x00000016 0x00000000 0x00000000
3fc8f9f0: 0x00000000 0x00000000 0x3fc8e33c 0x3fc8e3a4 0x3fc8e40c 0x00000000 0x00000000 0x00000001
3fc8fa10: 0x00000000 0x00000000 0x00000000 0x420290c0 0x00000000 0x00000000 0x00000000 0x00000000
0x420290c0 - esp_cleanup_r
    at /.espressif/esp-idf/v5.2.1/components/newlib/newlib_init.c:60
3fc8fa30: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
3fc8fa50: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
3fc8fa70: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
3fc8fa90: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
3fc8fab0: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
3fc8fad0: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000001 0x0000000c
3fc8faf0: 0x3fc8fb00 0x00000000 0x3fc8fae8 0x0000000c 0x09c6000a 0x00000000 0x3fc8fb20 0x0000000c
3fc8fb10: 0x3fc914b4 0x00000000 0x3fc8fb08 0x0000000c 0x0c88000a 0x00000000 0x3fc913e0 0x00001000
3fc8fb30: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fb50: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fb70: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fb90: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fbb0: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fbd0: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fbf0: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fc10: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fc30: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fc50: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fc70: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fc90: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fcb0: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fcd0: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fcf0: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fd10: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fd30: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fd50: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fd70: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fd90: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
3fc8fdb0: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5

ELF file SHA256: 000000000

Rebooting...
ESP-ROM:esp32c3-api1-20210207
Build:Feb  7 2021
rst:0xc (RTC_SW_CPU_RST),boot:0xd (SPI_FAST_FLASH_BOOT)
Saved PC:0x4038152a
0x4038152a - esp_restart_noos
    at /.espressif/esp-idf/v5.2.1/components/esp_system/port/soc/esp32c3/system_internal.c:111
SPIWP:0xee
mode:DIO, clock div:2
load:0x3fcd5820,len:0x1714
load:0x403cc710,len:0x968
load:0x403ce710,len:0x2f9c
entry 0x403cc710
I (20) boot: ESP-IDF v5.1.2-342-gbcf1645e44 2nd stage bootloader
I (20) boot: compile time Dec 12 2023 10:50:58
I (20) boot: chip revision: v0.4
I (24) boot.esp32c3: SPI Speed      : 40MHz
I (29) boot.esp32c3: SPI Mode       : DIO
I (34) boot.esp32c3: SPI Flash Size : 4MB
I (38) boot: Enabling RNG early entropy source...
I (44) boot: Partition Table:
I (47) boot: ## Label            Usage          Type ST Offset   Length
I (55) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (62) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (69) boot:  2 factory          factory app      00 00 00010000 003f0000
I (77) boot: End of partition table
I (81) esp_image: segment 0: paddr=00010020 vaddr=3c050020 size=136e8h ( 79592) map
I (107) esp_image: segment 1: paddr=00023710 vaddr=3fc8ae00 size=012d0h (  4816) load
I (109) esp_image: segment 2: paddr=000249e8 vaddr=40380000 size=0acf0h ( 44272) load
I (124) esp_image: segment 3: paddr=0002f6e0 vaddr=00000000 size=00938h (  2360)
I (125) esp_image: segment 4: paddr=00030020 vaddr=42000020 size=4a8e4h (305380) map

This is my Cargo.toml:

[package]
name = "foo-esp"
version = "0.1.0"
edition = "2021"
license = "MIT OR Apache-2.0"

[dependencies]
# Esp
embedded-svc = "0.28"
esp-idf-svc = "0.49"

# Mdns-sd
edge-mdns = "0.3"
edge-nal = "0.3"
edge-nal-std = "0.3"
embassy-sync = "0.6"
embassy-time= {version = "0.3", features = ["std", "generic-queue-8"] }
futures-lite = "2.3.0"
rand = "0.8.5"

[build-dependencies]
embuild  = "0.32.0"

[profile.dev]
# Rust debug is too slow.
# For debug builds always builds with some optimization
opt-level = "s"

[profile.release]
codegen-units = 1 # LLVM can perform better optimizations using a single thread
debug = 2
debug-assertions = false
incremental = false
lto = 'fat'
opt-level = 's'
overflow-checks = false

This is my config.toml:

[build]
target = "riscv32imc-esp-espidf"

[target.riscv32imc-esp-espidf]
linker = "ldproxy"
runner = "espflash flash --monitor"
# Future - necessary for the experimental "native build" of esp-idf-sys with ESP32C3
# See also https://github.com/ivmarkov/embuild/issues/16
rustflags = ["--cfg", "espidf_time64", "-C", "default-linker-libraries"]

[unstable]
# Builds the `std` environment crate for the `esp` target.
# The `panic_abort` crate is built as default behavior for the binary.
build-std          = ["panic_abort", "std"]
# Enables only the following `std` features for the `esp` binary.
build-std-features = ["panic_immediate_abort"]

[env]
# Enables the esp-idf-sys "native" build feature (`cargo build --features native`) to build against ESP-IDF (v5.2.1)
ESP_IDF_VERSION = { value = "tag:v5.2.1" }

# These configurations will pick up your custom "sdkconfig.release", "sdkconfig.debug" or "sdkconfig.defaults[.*]" files
# that you might put in the root of the project
# The easiest way to generate a full "sdkconfig[.release|debug]" configuration (as opposed to manually enabling only the necessary flags via "sdkconfig.defaults[.*]"
# is by running "cargo pio espidf menuconfig" (that is, if using the pio builder)
#ESP_IDF_SDKCONFIG = { value = "./sdkconfig.release", relative = true }
#ESP_IDF_SDKCONFIG = { value = "./sdkconfig.debug", relative = true }
ESP_IDF_SDKCONFIG_DEFAULTS = { value = "./sdkconfig.defaults", relative = true }
# ESP-IDF will be installed in ~/.espressif so it can be reused across the different examples.
# See also https://github.com/esp-rs/esp-idf-sys#esp_idf_tools_install_dir-esp_idf_tools_install_dir
ESP_IDF_TOOLS_INSTALL_DIR = { value = "global" }

This is my code:

use core::net::{Ipv4Addr, Ipv6Addr};

use edge_mdns::buf::{BufferAccess, VecBufAccess};
use edge_mdns::domain::base::Ttl;
use edge_mdns::io::{self, MdnsIoError, DEFAULT_SOCKET};
use edge_mdns::{host::Host, HostAnswersMdnsHandler};
use edge_nal::{UdpBind, UdpSplit};

use embassy_sync::blocking_mutex::raw::NoopRawMutex;
use embassy_sync::signal::Signal;

use log::info;

use rand::{thread_rng, RngCore};

// Change this to the IP address of the machine where you'll run this example
const OUR_IP: Ipv4Addr = Ipv4Addr::new(127, 0, 0, 1);

const OUR_NAME: &str = "mypc";

fn main() {
    esp_idf_svc::sys::link_patches();
    esp_idf_svc::log::EspLogger::initialize_default();

    let stack = edge_nal_std::Stack::new();

    let (recv_buf, send_buf) = (
        VecBufAccess::<NoopRawMutex, 1500>::new(),
        VecBufAccess::<NoopRawMutex, 1500>::new(),
    );

    futures_lite::future::block_on(run::<edge_nal_std::Stack, _, _>(
        &stack, &recv_buf, &send_buf, OUR_NAME, OUR_IP,
    ))
    .unwrap();
}

async fn run<T, RB, SB>(
    stack: &T,
    recv_buf: RB,
    send_buf: SB,
    our_name: &str,
    our_ip: Ipv4Addr,
) -> Result<(), MdnsIoError<T::Error>>
where
    T: UdpBind,
    RB: BufferAccess<[u8]>,
    SB: BufferAccess<[u8]>,
{
    info!("About to run an mDNS responder for our PC. It will be addressable using {our_name}.local, so try to `ping {our_name}.local`.");

    let mut socket = io::bind(stack, DEFAULT_SOCKET, Some(Ipv4Addr::UNSPECIFIED), Some(0)).await?;

    let (recv, send) = socket.split();

    let host = Host {
        hostname: our_name,
        ipv4: our_ip,
        ipv6: Ipv6Addr::UNSPECIFIED,
        ttl: Ttl::from_secs(60),
    };

    // A way to notify the mDNS responder that the data in `Host` had changed
    // We don't use it in this example, because the data is hard-coded
    let signal = Signal::new();

    let mdns = io::Mdns::<NoopRawMutex, _, _, _, _>::new(
        Some(Ipv4Addr::UNSPECIFIED),
        Some(0),
        recv,
        send,
        recv_buf,
        send_buf,
        |buf| thread_rng().fill_bytes(buf),
        &signal,
    );

    mdns.run(HostAnswersMdnsHandler::new(&host)).await
}

Thanks in advance for your help!
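
A note on reading the panic above: the reported stack pointer lies below the lower stack bound, which is the condition a stack protection fault reports. The arithmetic can be sketched as follows. The three addresses are copied from the log; the helper names are hypothetical, and the idea that each `VecBufAccess<_, 1500>` keeps its 1500 bytes inline (and so on the `main` task stack) is an assumption that this thread does not confirm.

```rust
// Values copied from the "Stack pointer" and "Stack bounds" lines of the panic.
const STACK_POINTER: u32 = 0x3fc8_f9d0;
const STACK_LOW: u32 = 0x3fc8_fb30; // lowest valid stack address
const STACK_HIGH: u32 = 0x3fc9_0b20; // top of the stack

/// Total size of the task stack in bytes (hypothetical helper).
fn stack_size(low: u32, high: u32) -> u32 {
    high - low
}

/// How far the stack pointer ran below the lowest valid address,
/// or 0 if it is still inside the stack (hypothetical helper).
fn overrun(sp: u32, low: u32) -> u32 {
    low.saturating_sub(sp)
}

fn main() {
    println!("stack size: {} bytes", stack_size(STACK_LOW, STACK_HIGH));
    println!("overrun:    {} bytes", overrun(STACK_POINTER, STACK_LOW));
    // Assumption: both buffers live inline on the stack of `main`.
    println!("buffers:    at least {} bytes", 2 * 1500);
}
```

If the assumption holds, the two buffers alone take up most of the 4080-byte stack. Common remedies, neither of which is confirmed in this thread, are raising `CONFIG_ESP_MAIN_TASK_STACK_SIZE` in `sdkconfig.defaults` or running the responder on a thread with a larger stack.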

Luni-4 commented 4 weeks ago

Ah ok, now I can see the communication issue, thanks for telling me how to improve messages!

Well, I will use quotes this time, I hope it will be much clearer

Is your phone even configured to resolve addresses via mDNS lookup?

Yes, it can. My Android version supports that.

Can you ping - from your phone - your Windows and MacOS workstations on the .local domain?

Yes, because I've run a server on my laptops (Linux, Windows, and MacOS) which uses the mdns-sd crate, and from my phone I can access the main server page using the .local domain.

Can you ping your desktop from the phone by using its .local suffixed name?

The hostname of one of my laptops is konki. If I ping konki.local from my phone, I can reach it.

Can you run avahi-browse -a from your phone or better yet - avahi-resolve?

To answer your question, I've installed Service Browser, which uses Bonjour under the hood, and posted the results as images in addition to the corresponding log.

Luni-4 commented 4 weeks ago

@Luni-4 Argh OK, I should have read this link as to what your Android phone actually implements. They don't have an "mDNS responder", not even an "mDNS queryer" but implement the simplest of all hacks which is called a "one-shot query".

We don't (yet) support in edge-mdns replying to those types of queries. It might not be too difficult to add support for this.

Ahh I see now, thanks for your investigation! It would be fantastic to access a C3 even from a smartphone if it would be deemed interesting :)

But in any case, let me know what the outcome of trying to ping your workstations is as well.

Yep, the answer should be in my previous message

ivmarkov commented 3 weeks ago

Is your phone even configured to resolve addresses via mDNS lookup?

Yes, it can. My Android version supports that.

Only with a "one-shot" query - at least according to their documentation - which we don't support yet, but I'll do a quick fix shortly.

Can you ping - from your phone - your Windows and MacOS workstations on the .local domain?

Yes, because I've run a server on my laptops (Linux, Windows, and MacOS) which uses the mdns-sd crate, and from my phone I can access the main server page using the .local domain.

But this should not be necessary, as per my last message. Again - MacOS has a built-in mDNS queryer and resolver (because Bonjour was invented by Apple).

Windows also adopted Bonjour out of the box long ago.

Can you ping your desktop from the phone by using its .local suffixed name?

The hostname of one of my laptops is konki. If I ping konki.local from my phone, I can reach it.

That's what I wanted to know, thanks! Let's hope it is really the fact that we don't implement a one-shot query, although - if that was the case, we should've seen in the logs of your c3 that it is contacted by your phone (the "reply to" message). And I don't see that at all? :(

Can you run avahi-browse -a from your phone or better yet - avahi-resolve?

To answer your question, I've installed Service Browser, which uses Bonjour under the hood, and posted the results as images in addition to the corresponding log.

Right. Once you do that, the c3 is discovered. BUT, it is discovered by another mDNS resolver, this "Service Browser" thing. And it is somehow not discovered by your "ping" command which uses the underlying one-shot thing...

ivmarkov commented 3 weeks ago

@Luni-4 OK, I've extended the code to support "one-shot" queries so you can give it another try, but I'm not holding my breath. The fact that we don't see your phone at all in the c3 logs is not a good indicator...

Luni-4 commented 3 weeks ago

Only with a "one-shot" query - at least according to their documentation - which we don't support yet, but I'll do a quick fix shortly.

Yep, exactly.

But this should not be necessary, as per my last message. Again - MacOS has a built-in mDNS queryer and resolver (because Bonjour was invented by Apple).

Windows also adopted Bonjour out of the box long ago.

Ah, sure, as I said I was testing, that's why I've run a personal server, but ok, solved.

That's what I wanted to know, thanks! Let's hope it is really the fact that we don't implement a one-shot query, although - if that was the case, we should've seen in the logs of your c3 that it is contacted by your phone (the "reply to" message). And I don't see that at all? :(

I cannot understand this part. Could you explain the "we should've seen in the logs of your c3" statement in more detail?

Right. Once you do that, the c3 is discovered. BUT, it is discovered by another mDNS resolver, this "Service Browser" thing. And it is somehow not discovered by your "ping" command which uses the underlying one-shot thing...

Exactly, but my test was to verify whether another mDNS resolver could discover the C3. If it hadn't worked, it would have been useless going forward, in my opinion. But ok, solved.

Luni-4 commented 3 weeks ago

@Luni-4 OK, I've extended the code to support "one-shot" queries so you can give it another try, but I'm not holding my breath. The fact that we don't see your phone at all in the c3 logs is not a good indicator...

Thanks a lot! I do not have more time for today, I will try that in two days

ivmarkov commented 3 weeks ago

That's what I wanted to know, thanks! Let's hope it is really the fact that we don't implement a one-shot query, although - if that was the case, we should've seen in the logs of your c3 that it is contacted by your phone (the "reply to" message). And I don't see that at all? :(

I cannot understand this part. Could you explain the "we should've seen in the logs of your c3" statement in more detail?

An mDNS responder like the edge-mdns one you are running on the c3 does not simply keep broadcasting all its data. It does that only once, when it starts (as per the mDNS protocol).

After that, it listens on the broadcast addresses for questions from other peers. When a question arrives, it re-broadcasts, but only the answers to exactly the questions asked, not all its data (again, according to mDNS).

Now, a one-shot query is a trick where your phone sends its questions to the standard mDNS broadcast addresses (as we do), but then - because it is not sending them from port 5353 - we are supposed to answer it "privately", with unicast rather than broadcast. But because it is asking its questions on the broadcast addresses - even though we were not replying to it privately - we should've still seen a

I (43657) edge_mdns::io: Replying to mDNS query from [::ffff:192.168.178.115]:some-very-large-ephemeral-port

line in the log of the c3, where some-very-large-ephemeral-port != "5353".

... but we don't see that. :(
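
The rule described above (queries from source port 5353 get a multicast answer, queries from any other port get a private unicast answer) can be sketched as below. This is only an illustration of the decision, not the edge-mdns code: `ReplyTo` and `reply_target` are hypothetical names, and only the IPv4 group address `224.0.0.251:5353` seen in the logs is handled.

```rust
use std::net::{Ipv4Addr, SocketAddr, SocketAddrV4};

/// Standard mDNS port and IPv4 multicast group, as seen in the logs.
const MDNS_PORT: u16 = 5353;
const MDNS_GROUP_V4: Ipv4Addr = Ipv4Addr::new(224, 0, 0, 251);

/// Where an answer should be sent (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum ReplyTo {
    /// Regular mDNS query: answer on the multicast group.
    Multicast(SocketAddr),
    /// One-shot query: answer privately to the sender.
    Unicast(SocketAddr),
}

/// Picks the reply destination from the source address of the query.
fn reply_target(query_source: SocketAddr) -> ReplyTo {
    if query_source.port() == MDNS_PORT {
        let group = SocketAddrV4::new(MDNS_GROUP_V4, MDNS_PORT);
        ReplyTo::Multicast(SocketAddr::V4(group))
    } else {
        // The sender used an ephemeral port, so it is not a full mDNS peer.
        ReplyTo::Unicast(query_source)
    }
}

fn main() {
    let router: SocketAddr = "192.168.178.1:5353".parse().unwrap();
    let phone: SocketAddr = "192.168.178.115:43318".parse().unwrap();
    println!("{:?}", reply_target(router));
    println!("{:?}", reply_target(phone));
}
```

Either way the query itself arrives on the multicast group, which is why the responder would be expected to log it even before one-shot replies were supported.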

ivmarkov commented 3 weeks ago

@Luni-4

Good news. I re-installed "Termux" on my Android, and with the "one-shot" patch I pushed,

If I type:

ping mypc.local

The ping works.

and in the logs I see:

[2024-10-09T19:11:31Z INFO  edge_mdns::io] Replying privately to a one-shot mDNS query from [::ffff:192.168.10.199]:46205

HOWEVER: You really have to FIRST start the c3, and only then do the ping mypc.local command.

If you first try to ping (as I did), the ping will fail (c3 is not running). If you then run the c3 and try to re-ping, the ping will AGAIN fail, because - apparently - the Android implementation is so simple, that it caches the failure.

I had to literally restart my phone to flush this cache...

Luni-4 commented 3 weeks ago

@Luni-4

Good news. I re-installed "Termux" on my Android, and with the "one-shot" patch I pushed,

If I type:

ping mypc.local

The ping works.

and in the logs I see:

[2024-10-09T19:11:31Z INFO  edge_mdns::io] Replying privately to a one-shot mDNS query from [::ffff:192.168.10.199]:46205

@ivmarkov

Yeah! I can reproduce it too! Thanks a lot!

Here's the log:

I (4187) foo_esp: About to run an mDNS responder for our PC. It will be addressable using mypc.local, so try to `ping mypc.local`.
I (4207) edge_mdns::io: Broadcasting mDNS entry to 224.0.0.251:5353
I (47387) edge_mdns::io: Re-broadcasting due to mDNS query from [::ffff:192.168.178.1]:5353
I (47387) edge_mdns::io: Broadcasting mDNS entry to 224.0.0.251:5353
I (47397) edge_mdns::io: Re-broadcasting due to mDNS query from [::ffff:192.168.178.1]:5353
I (47397) edge_mdns::io: Broadcasting mDNS entry to 224.0.0.251:5353
I (107497) edge_mdns::io: Re-broadcasting due to mDNS query from [::ffff:192.168.178.1]:5353
I (107497) edge_mdns::io: Broadcasting mDNS entry to 224.0.0.251:5353
I (121827) edge_mdns::io: Replying privately to a one-shot mDNS query from [::ffff:192.168.178.115]:43318

HOWEVER: You really have to FIRST start the c3, and only then do the ping mypc.local command.

If you first try to ping (as I did), the ping will fail (c3 is not running). If you then run the c3 and try to re-ping, the ping will AGAIN fail, because - apparently - the Android implementation is so simple, that it caches the failure.

I had to literally restart my phone to flush this cache...

Ugh, I had to restart my phone as well. In your opinion, is there a workaround for this problem?

There's another problem: if I stop cargo run with CTRL-C and then launch cargo run again, I cannot ping mypc.local from my phone anymore, even though the previous pings worked perfectly. If Android caches the failure, it should also cache the success.

ivmarkov commented 3 weeks ago

Ugh, I had to restart my phone as well. In your opinion, is there a workaround for this problem?

You can work on the Android code-base and figure out a more intelligent caching policy! Applying common sense, I don't see a way to fix this on the mDNS responder side. :-)

There's another problem: if I stop cargo run with CTRL-C and then launch cargo run again, I cannot ping mypc.local from my phone anymore, even though the previous pings worked perfectly. If Android caches the failure, it should also cache the success.

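This is not what I see on my end:

* If I stop the mDNS responder (while `ping` is not running) and then run it again, and then ping the mDNS responder, ping works.

* More importantly: if I assign "mypc.local" to be 127.0.0.1, then run the responder, and the first ping resolves it (to 127.0.0.1), subsequent pings **continue** to work even if the responder is stopped.

Which proves that Android IS caching the success too. Whether the "failure" is indeed cached forever is another topic (success caching seems to expire after 5 minutes or so).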
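
To make the observed behaviour concrete, here is a minimal sketch of a name cache in which successes expire after a fixed time while failures may be kept indefinitely. It is purely illustrative and says nothing about how Android is actually implemented: `NameCache`, `Outcome` and both TTL settings are hypothetical, and the 300-second value only mirrors the "5 minutes or so" estimate above.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;
use std::time::{Duration, Instant};

/// Result of a name lookup, as remembered by the cache.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Found(Ipv4Addr),
    NotFound,
}

struct Entry {
    outcome: Outcome,
    /// `None` means the entry never expires.
    expires: Option<Instant>,
}

/// Hypothetical resolver cache; not Android's actual implementation.
struct NameCache {
    entries: HashMap<String, Entry>,
    positive_ttl: Duration,
    /// `None` models a failure that is remembered until the cache is dropped.
    negative_ttl: Option<Duration>,
}

impl NameCache {
    fn new(positive_ttl: Duration, negative_ttl: Option<Duration>) -> Self {
        Self { entries: HashMap::new(), positive_ttl, negative_ttl }
    }

    /// Remembers the outcome of a lookup performed at time `now`.
    fn store(&mut self, name: &str, outcome: Outcome, now: Instant) {
        let expires = match outcome {
            Outcome::Found(_) => Some(now + self.positive_ttl),
            Outcome::NotFound => self.negative_ttl.map(|ttl| now + ttl),
        };
        self.entries.insert(name.to_owned(), Entry { outcome, expires });
    }

    /// Returns the cached outcome, or `None` if the network must be queried.
    fn lookup(&self, name: &str, now: Instant) -> Option<Outcome> {
        let entry = self.entries.get(name)?;
        match entry.expires {
            Some(deadline) if now >= deadline => None,
            _ => Some(entry.outcome),
        }
    }
}

fn main() {
    let t0 = Instant::now();
    let mut cache = NameCache::new(Duration::from_secs(300), None);
    // The first ping happens before the responder is running.
    cache.store("mypc.local", Outcome::NotFound, t0);
    // An hour later the failure is still served from the cache.
    let later = t0 + Duration::from_secs(3600);
    println!("{:?}", cache.lookup("mypc.local", later));
}
```

With a cache shaped like this, a responder that starts after the first failed lookup is never asked again, which would match the need to restart the phone.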

Luni-4 commented 3 weeks ago

You can work on the Android code-base and figure out a more intelligent caching policy! Applying common sense, I don't see a way to fix this on the mDNS responder side. :-)

Ok, this is what I wanted to figure out. There is no workaround for this problem on the edge-mdns side; you have to either fiddle with Android settings or contribute to the Android codebase. That settles this question for me.

This is not what I see on my end:

* If I stop the mDNS responder (while `ping` is not running) and then run it again, and then ping the mDNS responder, ping works.

* More importantly: if I assign "mypc.local" to be 127.0.0.1, then run the responder, and the first ping resolves it (to 127.0.0.1), subsequent pings **continue** to work even if the responder is stopped.

Which proves that Android IS caching the success too. Whether the "failure" is indeed cached forever is another topic (success caching seems to expire after 5 minutes or so).

OK for the second observation, but the first one does not work on my side right now. Anyway, this is not a problem: I can implement an mDNS resolver in my Android application and use that to resolve the address, so I do not need to rely on Android's built-in resolution.

Luni-4 commented 3 weeks ago

As far as I'm concerned, we can close this issue now, since it works everywhere. I still have Wi-Fi problems right now, but they are more related to esp-idf-svc than to this repository. Thanks a lot for your help and your explanations, much appreciated! :)

Luni-4 commented 3 weeks ago

@ivmarkov

Is it possible to publish a new edge-mdns version after all of these changes?

ivmarkov commented 3 weeks ago

Yes - in a couple of weeks or so.