arendst / Tasmota

Alternative firmware for ESP8266 and ESP32 based devices with easy configuration using webUI, OTA updates, automation using timers or rules, expandability and entirely local control over MQTT, HTTP, Serial or KNX. Full documentation at
https://tasmota.github.io/docs
GNU General Public License v3.0
21.97k stars, 4.77k forks

ESP32: BUG mdns cannot run concurrently with zigbee #17082

Closed xsp1989 closed 1 year ago

xsp1989 commented 1 year ago

TO REPRODUCE

Self-compiled with Visual Studio Code, using these defines:

  //mDNS
  #define USE_DISCOVERY                            // Enable mDNS for the following services (+8k code or +23.5k code with core 2_5_x, +0.3k mem)
  #undef MDNS_ENABLED
  #define MDNS_ENABLED          true              //enable MDNS

  //#define MY_LANGUAGE            zh_CN           // Chinese (Simplified) in China
  #undef MODULE
  #define MODULE USER_MODULE
  #undef FALLBACK_MODULE 
  #define FALLBACK_MODULE USER_MODULE
  #define USER_TEMPLATE "{\"NAME\":\"ZB-GW03-V1.2\",\"GPIO\":[0,0,3552,0,3584,0,0,0,5793,5792,320,544,5536,0,5600,0,0,0,0,5568,0,0,0,0,0,0,0,0,608,640,32,0,0,0,0,0],\"FLAG\":0,\"BASE\":1}"  // [Template] Set JSON template

  #undef LIGHT_MODE
  #define LIGHT_MODE false
  #undef USE_DOMOTICZ

  #undef OTA_URL
  #define OTA_URL ""
#undef APP_TIMEZONE
#define APP_TIMEZONE 8
#undef ROTARY_V1             // Add support for Rotary Encoder as used in MI Desk Lamp (+0k8 code)
#undef ROTARY_MAX_STEPS    // Rotary step boundary
#undef USE_SONOFF_RF         // Add support for Sonoff Rf Bridge (+3k2 code)
#undef USE_RF_FLASH          // Add support for flashing the EFM8BB1 chip on the Sonoff RF Bridge. C2CK must be connected to GPIO4, C2D to GPIO5 on the PCB (+2k7 code)
#undef USE_SONOFF_SC         // Add support for Sonoff Sc (+1k1 code)
#undef USE_TUYA_MCU          // Add support for Tuya Serial MCU
#undef TUYA_DIMMER_ID       // Default dimmer Id
#undef USE_TUYA_TIME         // Add support for Set Time in Tuya MCU
#undef USE_ARMTRONIX_DIMMERS // Add support for Armtronix Dimmers (+1k4 code)
#undef USE_PS_16_DZ          // Add support for PS-16-DZ Dimmer (+2k code)
#undef USE_SONOFF_IFAN       // Add support for Sonoff iFan02 and iFan03 (+2k code)
#undef USE_BUZZER            // Add support for a buzzer (+0k6 code)
#undef USE_ARILUX_RF         // Add support for Arilux RF remote controller (+0k8 code, 252 iram (non 2.3.0))
#undef USE_SHUTTER           // Add Shutter support for up to 4 shutter with different motortypes (+11k code)
#undef USE_DEEPSLEEP         // Add support for deepsleep (+1k code)
#undef USE_EXS_DIMMER        // Add support for ES-Store Wi-Fi Dimmer (+1k5 code)
#undef EXS_MCU_CMNDS                          // Add command to send MCU commands (+0k8 code)
#undef USE_HOTPLUG                              // Add support for sensor HotPlug
#undef USE_DEVICE_GROUPS                        // Add support for device groups (+5k5 code)
#undef DEVICE_GROUPS_ADDRESS  // Device groups multicast address
#undef DEVICE_GROUPS_PORT                  // Device groups multicast port
#undef USE_DEVICE_GROUPS_SEND                   // Add support for the DevGroupSend command (+0k6 code)
#undef USE_PWM_DIMMER                           // Add support for MJ-SD01/acenx/NTONPOWER PWM dimmers (+2k3 code, DGR=0k7)
#undef USE_PWM_DIMMER_REMOTE                    // Add support for remote switches to PWM Dimmer (requires USE_DEVICE_GROUPS) (+0k6 code)
#undef USE_KEELOQ                               // Add support for Jarolift rollers by Keeloq algorithm (+4k5 code)
#undef USE_SONOFF_D1     // Add support for Sonoff D1 Dimmer (+0k7 code)
#undef USE_SHELLY_DIMMER // Add support for Shelly Dimmer (+3k code)
#undef SHELLY_CMDS       // Add command to send co-processor commands (+0k3 code)
#undef SHELLY_FW_UPGRADE // Add firmware upgrade option for co-processor (+3k4 code)
#undef SHELLY_VOLTAGE_MON                     // Add support for reading voltage and current measurement (-0k0 code)
#undef USE_AUTOCONF
//#undef USE_BERRY

  // -- Optional light modules ----------------------
#undef USE_LIGHT                         // Add support for light control
#undef USE_WS2812                        // WS2812 Led string using library NeoPixelBus (+5k code, +1k mem, 232 iram) - Disable by //
                                          //  #define USE_WS2812_DMA                         // ESP8266 only, DMA supports only GPIO03 (= Serial RXD) (+1k mem). When USE_WS2812_DMA is enabled expect Exceptions on Pow
//#define USE_WS2812_RMT 0                  // ESP32 only, hardware RMT support (default). Specify the RMT channel 0..7. This should be preferred to software bit bang.
                                          //  #define USE_WS2812_I2S  0                      // ESP32 only, hardware I2S support. Specify the I2S channel 0..2. This is exclusive from RMT. By default, prefer RMT support
                                          //  #define USE_WS2812_INVERTED                    // Use inverted data signal
#undef USE_WS2812_HARDWARE                // Hardware type (NEO_HW_WS2812, NEO_HW_WS2812X, NEO_HW_WS2813, NEO_HW_SK6812, NEO_HW_LC8812, NEO_HW_APA106, NEO_HW_P9813)
#undef USE_WS2812_CTYPE                   // Color type (NEO_RGB, NEO_GRB, NEO_BRG, NEO_RBG, NEO_RGBW, NEO_GRBW)
#undef USE_MY92X1                        // Add support for MY92X1 RGBCW led controller as used in Sonoff B1, Ailight and Lohas
#undef USE_SM16716                       // Add support for SM16716 RGB LED controller (+0k7 code)
#undef USE_SM2135                        // Add support for SM2135 RGBCW led control as used in Action LSC (+0k6 code)
#undef USE_SM2335                        // Add support for SM2335 RGBCW led control as used in SwitchBot Color Bulb (+0k7 code)
#undef USE_BP5758D                       // Add support for BP5758D RGBCW led control as used in some Tuya lightbulbs (+0k8 code)
#undef USE_SONOFF_L1                     // Add support for Sonoff L1 led control
#undef USE_ELECTRIQ_MOODL                // Add support for ElectriQ iQ-wifiMOODL RGBW LED controller (+0k3 code)
#undef USE_LIGHT_PALETTE                 // Add support for color palette (+0k7 code)
#undef USE_LIGHT_VIRTUAL_CT              // Add support for Virtual White Color Temperature (+1.1k code)
#undef USE_DGR_LIGHT_SEQUENCE            // Add support for device group light sequencing (requires USE_DEVICE_GROUPS) (+0k2 code)
#undef USE_LSC_MCSL                             // Add support for GPE Multi color smart light as sold by Action in the Netherlands (+1k1 code)

#define USE_ZIGBEE
#undef USE_ZIGBEE_ZNP
#define USE_ZIGBEE_EZSP
#define USE_UFILESYS
#define USE_ZIGBEE_EEPROM // T24C512A
#define USE_TCP_BRIDGE
#undef USE_ZIGBEE_CHANNEL
#define USE_ZIGBEE_CHANNEL 11 // (11-26)

#define USE_ETHERNET
#undef ETH_TYPE
#define ETH_TYPE 0 // ETH_PHY_LAN8720
#undef ETH_CLKMODE
#define ETH_CLKMODE 3 // ETH_CLOCK_GPIO17_OUT
#undef ETH_ADDRESS
#define ETH_ADDRESS 1 // PHY1

EXPECTED BEHAVIOUR

When mDNS is enabled, Zigbee cannot complete initialization (including 6.7.8 and 6.7.9). When mDNS is turned off with `SetOption55 0`, initialization completes normally.

SCREENSHOTS

Zigbee fails to start while mDNS is running: [image]. UART log: [image]

With mDNS turned off via `SetOption55`, Zigbee starts normally: [image]

s-hadinger commented 1 year ago

This is a strange bug, since the two features are unrelated. I need to investigate.

xsp1989 commented 1 year ago

I used `#undef MQTT_HOST_DISCOVERY` to turn off MQTT host discovery and found that Zigbee starts normally. In earlier tests, however, the Ethernet link sometimes dropped immediately after connecting. I'm not sure whether that is related to this bug; maybe there is a memory overflow somewhere?

xsp1989 commented 1 year ago

I traced the problem to the call chain `MqttDiscoverServer()` -> `MDNS.queryService()` -> `mdns_query_ptr()`. The last function, `mdns_query_ptr()`, is set to block for 3000 ms, which results in a timeout. What do you think, @s-hadinger?

s-hadinger commented 1 year ago

That's probably why we don't use MDNS in the standard builds. It generally brings more problems than it solves.

Indeed having MDNS blocking for 3000ms is way too much and will generate all sorts of malfunctioning in the zigbee code, and other stuff.
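To illustrate why a 3000 ms blocking call hurts everything else that shares the same loop, here is a minimal simulation in virtual time. It is not Tasmota code: the function name, the cost parameters, and the idea of a single "serial poll" step per pass are assumptions made only for illustration.

```cpp
#include <cstdint>

// Hypothetical model of one cooperative main loop (virtual time, in ms).
// Each pass first runs a discovery step, then polls the serial port.
// Returns the longest interval between the starts of two consecutive polls.
std::uint32_t worst_poll_gap_ms(std::uint32_t discovery_cost_ms,
                                std::uint32_t poll_cost_ms,
                                int passes) {
    std::uint32_t now = 0;        // virtual clock
    std::uint32_t last_poll = 0;  // start time of the previous poll
    std::uint32_t worst = 0;      // longest gap seen so far
    for (int i = 0; i < passes; ++i) {
        now += discovery_cost_ms;  // the loop is stuck here while discovery waits
        if (now - last_poll > worst) worst = now - last_poll;
        last_poll = now;
        now += poll_cost_ms;       // time spent servicing the serial port
    }
    return worst;
}
```

With a discovery step that blocks for 3000 ms, the serial side goes unserviced for about three seconds on every pass, which is far longer than a serial protocol with its own timeouts is likely to tolerate.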

Unfortunately none of the Tasmota maintainers use MDNS.

xsp1989 commented 1 year ago

I think that on a gateway with Ethernet, it makes sense to enable Ethernet, turn off Wi-Fi by default, and enable mDNS device discovery. Ethernet then becomes easier to use, and the device is plug and play apart from configuring MQTT. Currently, with MQTT_HOST_DISCOVERY disabled, it works fine.
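For reference, the workaround described here could look like the following fragment in a user override file. The define names are the ones used in this thread; whether any other setting is needed for a given build is not confirmed here.

```cpp
// Keep mDNS compiled in and enabled, as in the original configuration.
#define USE_DISCOVERY
#undef  MDNS_ENABLED
#define MDNS_ENABLED true

// Workaround from this thread: do not look up the MQTT broker via mDNS.
#undef  MQTT_HOST_DISCOVERY
```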

s-hadinger commented 1 year ago

I don't know enough about MDNS to help here

s-hadinger commented 1 year ago

Reopening since I will need to explore mDNS for Matter support.

The main problem is `#define MQTT_HOST_DISCOVERY`, which tries in a loop to connect to an MQTT service with a 3 s timeout.

I see much more value in Tasmota advertising its name in mDNS than in discovering the MQTT server. I will disable `#define MQTT_HOST_DISCOVERY` by default.

xsp1989 commented 1 year ago

I agree with your approach. It doesn't make much sense for Tasmota to actively discover other MQTT servers.

By the way, I think a dual-core ESP32 would benefit from an RTOS such as FreeRTOS: when one core is blocked and waiting, the other core can continue to run, just like on a multi-core CPU. Of course, blocking waits in an operating system are a stupid idea.

s-hadinger commented 1 year ago

This is already the case with mdns in esp-idf. Even on single core, the mdns listener runs in a separate RTOS task with low priority.

The way it is currently coded, MQTT tries to connect, detects that MDNS is enabled, and does an MDNS query that times out after 3 seconds. I suppose it should be recoded to probe in the background for an MQTT service on MDNS and, only if one is detected, start an MQTT connection (which would not time out).

Said differently, MDNS lookup should populate the MQTTHost entry, and later trigger a MQTT connection.
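A minimal sketch of that ordering, assuming a discovery mechanism that reports its result through a callback. The class and method names are hypothetical and this is not how Tasmota is implemented; it only shows a lookup populating the host first and the connection being triggered later, with the main loop never waiting.

```cpp
#include <optional>
#include <string>

// Hypothetical sketch, not Tasmota code: discovery reports results through
// on_broker_found(), and the main loop calls loop() without ever waiting.
class MqttStarter {
public:
    // Invoked by the (assumed) background discovery when a broker answers.
    void on_broker_found(const std::string& host) {
        if (!host_) host_ = host;  // keep the first answer
    }

    // Called on every main-loop pass; returns immediately.
    // Returns true exactly once: on the first pass after a host became known.
    bool loop() {
        if (host_ && !connect_requested_) {
            connect_requested_ = true;
            return true;  // caller would start the MQTT connection here
        }
        return false;
    }

    const std::optional<std::string>& host() const { return host_; }

private:
    std::optional<std::string> host_;   // empty until discovery succeeds
    bool connect_requested_ = false;    // guards against repeated triggers
};
```

If discovery never answers, `loop()` simply keeps returning `false`, so nothing else in the loop is delayed.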

But as said, this use-case is of lesser importance.

Jason2866 commented 1 year ago

@s-hadinger Why not completely remove MQTT broker mDNS discovery? It is always a bad idea not to know the IP of the broker (or not to have real DNS for it). Since this is the first issue I have seen about it, it is not really used. Maybe, as an enhancement, a second (fallback) MQTT broker (like the two SSIDs)?

s-hadinger commented 1 year ago

Let's keep it disabled by default. I'm always worried that someone is using it and finds it useful

barbudor commented 1 year ago

A dual MQTT broker infrastructure would be interesting to set up for a resilient system. If some devices connect to the second broker because of a transient connection issue, the two instances must be able to exchange messages so that they work as one. In theory that should never happen, but I've learned not to count on "never". I believe two mosquitto instances can be connected to exchange messages, but I'm not sure whether this requires static configuration or can be dynamic.

s-hadinger commented 1 year ago

If we want such behavior, it should be managed at Tasmota level. I.e. register a second MQTT host if the first fails. MDNS is not meant for that and having 2 MQTT brokers advertised on the same network will result in bad things happening.

Moreover, the two MQTT brokers need to communicate with each other...
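A minimal sketch of the "second host if the first fails" idea, assuming an ordered list of configured hosts and a caller-supplied connect function. Both the function name and the callback are hypothetical; nothing like this is confirmed to exist in Tasmota.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Tries each configured host in order and returns the first one for which
// try_connect reports success, or std::nullopt if none could be reached.
// try_connect is a hypothetical stand-in for a real MQTT connection attempt.
std::optional<std::string> connect_with_fallback(
    const std::vector<std::string>& hosts,
    const std::function<bool(const std::string&)>& try_connect) {
    for (const auto& host : hosts) {
        if (try_connect(host)) return host;
    }
    return std::nullopt;
}
```

This only covers broker selection on the device. It does not address the point above that the two brokers would still need to exchange messages with each other.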

Jason2866 commented 1 year ago

That was, for me, the only aspect in which mDNS would be of any use. Given the complexity of a second broker, it is no reason for mDNS at all.

xsp1989 commented 1 year ago

In my opinion, on embedded devices the main role of mDNS is to let other devices discover the embedded device's IP and the services it offers, rather than to let the embedded device discover other devices.

s-hadinger commented 1 year ago

I still maintain that mDNS is useful for connecting from a web browser to a Tasmota device if you don't want to manage a list of dynamic IPs. Keep in mind that mDNS is based on link-local multicast, so it won't cross VLANs. If your PC is in one VLAN and the Tasmota devices are in another, mDNS is of no help (unless you tweak your router).

xsp1989 commented 1 year ago

At present, I have compiled a firmware that only supports Ethernet (Wi-Fi is turned off by default). It works as soon as the Ethernet cable is plugged in, but to find the IP address you must log in to the router's management page. With mDNS, you can find the Tasmota device's IP address by scanning from your phone.

s-hadinger commented 1 year ago

Yes, this is the use case for mDNS I find the most valuable. It works as long as you make the mDNS request on the same VLAN.

Jason2866 commented 1 year ago

Closing, since the use case of the feature that causes the issue is very limited. PRs to solve it are welcome. The main developers will not work on this.