Google Cast Integration not detecting Cast Devices

N5A commented 4 years ago

The problem

Google Cast devices not detected by HA when adding cast integration

Environment

Host OS Windows 10

HA Virtual Machine: VirtualBox
Operating System: HassOS 4.17
HA Version: 118.3

https://www.home-assistant.io/integrations/cast/

Problem-relevant `configuration.yaml`

None - Adding integration through the UI.

Traceback/Error logs

No errors for Cast are shown in the Supervisor or Core logs. I do see one for supervisor.api.ingress; I don't know whether it plays a part in the issue:

20-11-25 20:08:34 WARNING (MainThread) [supervisor.api.ingress] No valid ingress session None

Additional information

The Home Assistant virtual machine and the Google devices are all on the same IoT VLAN. The Google Cast integration can't find any of the Google devices: half a dozen of them, including some Minis, a Hub, and a few Chromecast Audios.

The Win 10 PC hosting the VM is dual-networked to both the IoT VLAN and the Trusted VLAN. The HA VM is bridged through the IoT VLAN network link, and it has its own IP on that VLAN.

My phone, laptop, and PCs on the trusted network can see and interact with the casts, which are in the IoT VLAN.

So mDNS on my Ubiquiti gear is working. IGMP snooping on the IoT VLAN is now disabled; it was enabled before, and there was no change either way.

I have not attempted manually adding them as of yet.

N5A commented 3 years ago

One problem I fought with has just come to mind: network card metrics, i.e. interface priority.

A VPN will usually commandeer metric 1, the highest priority. I had industrial software that needs to speak to a licensing server. Even if the server is on the same laptop as the software, an active VPN will cause it to fail to contact the server, and even having the application use the loopback to look at itself doesn't solve it. I had to set the laptop's network cards to fixed metrics of 2 and 3. Since the VPN always jumps to 1 when activated, I can force it to a lower-priority metric (4 or higher) after the VPN connection is established, and then my industrial software can find its licensing server on the same laptop.

Something similar could be causing issues with the cast integration indirectly, through how Windows handles the network cards. I don't have any VPN adapters on mine, but Mark's mention of the TAP driver (a VPN connection) makes me wonder if the metric of the network cards is somehow playing into it. It could very well be that Windows hears the mDNS responses on both network cards, since the VLANs I have connected would relay those mDNS packets, and that it pays attention to the first card hearing them and filters them out on the second?

emontnemery commented 3 years ago

@N5A zeroconf will by default listen on and send to all interfaces using the multicast address. The filtering done by zeroconf ignores identical messages: there's no benefit from getting the same message multiple times from different interfaces, since only the payload is important, not the interface on which it was received.

Still, it's an interesting idea that Windows may be doing some filtering. There's a known issue with running HA directly in Windows (i.e. no VM), where Windows is blocking or filtering MDNS traffic.

N5A commented 3 years ago

I hate problems that appear and vanish with no rhyme or reason... makes them hell to track down.

emontnemery commented 3 years ago

I fully agree. I'm not sure how to make MDNS debugging easier, there are too many unknowns and it's not easy to narrow down where and why packets are dropped. One possibility would be to have a simple tool in HA UI to publish MDNS messages + listen for answers and combine this with a simple command line tool + android app to narrow down if HA can send MDNS messages to LAN and WiFi and if MDNS messages originating from LAN and WiFi can be received by HA.
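A rough sketch of what the command line half of such a tool could look like, using only the Python standard library. It sends a single PTR query for `_googlecast._tcp.local` to the IPv4 mDNS group and lists which hosts answer. The function names are made up for illustration. Because the query is sent from an ephemeral port rather than 5353, responders are expected to answer by unicast (a "one-shot" query), so this only tests part of what HA relies on.

```python
import socket
import struct
import time

MDNS_GROUP = "224.0.0.251"  # IPv4 mDNS multicast address
MDNS_PORT = 5353
CAST_SERVICE = "_googlecast._tcp.local"


def build_mdns_query(name: str = CAST_SERVICE) -> bytes:
    """Build a minimal DNS query packet asking for PTR records of `name`."""
    # Header: id=0, flags=0, 1 question, 0 answer/authority/additional records.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.strip(".").split(".")
    ) + b"\x00"
    # QTYPE 12 = PTR, QCLASS 1 = IN.
    return header + qname + struct.pack("!2H", 12, 1)


def probe(name: str = CAST_SERVICE, timeout: float = 3.0) -> list:
    """Send one query and return the sorted IPs of hosts that replied."""
    responders = set()
    deadline = time.monotonic() + timeout
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 255)
        sock.sendto(build_mdns_query(name), (MDNS_GROUP, MDNS_PORT))
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                _data, (host, _port) = sock.recvfrom(9000)
            except socket.timeout:
                break
            responders.add(host)
    return sorted(responders)
```

Calling `print(probe())` on a wired machine and on a WiFi machine and comparing the two lists would give a first hint about where replies stop arriving.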

MarkHofmann11 commented 3 years ago

Is it possible to make the cast component not require mDNS if you have the UUID already statically defined? I'm just thinking about any particular circumstance where there is no way to fix mDNS for someone's setup, that they will still have another option to make it work.

emontnemery commented 3 years ago

As mentioned earlier, static IP support was removed because:

- Google crippled the local API some time ago such that we need MDNS data to identify the cast's model name etc.
  - This was the last straw; there were several issues with users being unhappy that the cast's model name etc. was not showing
- Static IP does not work for audio groups
- Support for both static IP and discovery complicated the code considerably

Since all official apps rely on working MDNS, I don't think it's entirely unreasonable to require working MDNS for HA.

MarkHofmann11 commented 3 years ago

I was referring to static UUID (not IP), as I understand the reasons it was removed. Maybe I don't understand enough about how the process works. In my mind, entering static UUID entries for each cast device is similar to a static IP - just using a different identifier. I just didn't know if there was an option to use static UUID, but it sounds like it doesn't work the way I'm thinking.

MarkHofmann11 commented 3 years ago

Discovered this today: the Windows 10 DNS client is also listening on UDP 5353, and I have tried every registry hack to disable it, but it still shows up. This is likely why people running HA on Windows with the Cast component are having erratic discoveries. They eventually work, but it is still random. I highlighted the two processes (one is HA/Python and the other is the Windows 10 DNS client).

Active Connections

  Proto  Local Address          Foreign Address        State           PID
  UDP    0.0.0.0:500            *:*                                    2936
  UDP    0.0.0.0:3702           *:*                                    4516
  UDP    0.0.0.0:3702           *:*                                    4516
  UDP    0.0.0.0:4500           *:*                                    2936
  UDP    0.0.0.0:5050           *:*                                    5728
  UDP    0.0.0.0:5353           *:*                                    1600
  UDP    0.0.0.0:5353           *:*                                    9364
  UDP    0.0.0.0:53299          *:*                                    4516
  UDP    127.0.0.1:1900         *:*                                    6804
  UDP    127.0.0.1:5353         *:*                                    9364
  UDP    127.0.0.1:49952        *:*                                    756
  UDP    127.0.0.1:53296        *:*                                    1692
  UDP    127.0.0.1:53298        *:*                                    4092
  UDP    127.0.0.1:62006        *:*                                    6804
  UDP    169.254.203.33:137     *:*                                    4
  UDP    169.254.203.33:138     *:*                                    4
  UDP    169.254.203.33:1900    *:*                                    6804
  UDP    169.254.203.33:5353    *:*                                    9364
  UDP    169.254.203.33:62005   *:*                                    6804
  UDP    192.168.0.14:137       *:*                                    4
  UDP    192.168.0.14:138       *:*                                    4
  UDP    192.168.0.14:1900      *:*                                    6804
  UDP    192.168.0.14:5353      *:*                                    9364
  UDP    192.168.0.14:62004     *:*                                    6804
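To spot competing listeners without reading the table by eye, a small script along these lines could group the PIDs bound to UDP 5353. This is only a sketch: it assumes the text has the column layout of the listing above (presumably from `netstat -ano`), the sample is a shortened copy of that listing, and `udp_listeners` is a hypothetical helper name.

```python
from collections import defaultdict

# Shortened sample in the same layout as the listing above.
SAMPLE = """\
  UDP    0.0.0.0:5353           *:*                                    1600
  UDP    0.0.0.0:5353           *:*                                    9364
  UDP    0.0.0.0:53299          *:*                                    4516
  UDP    127.0.0.1:5353         *:*                                    9364
  UDP    192.168.0.14:1900      *:*                                    6804
  UDP    192.168.0.14:5353      *:*                                    9364
"""


def udp_listeners(netstat_text: str, port: int = 5353) -> dict:
    """Map each PID to the local addresses it has bound on the given UDP port."""
    owners = defaultdict(list)
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expect: proto, local address, foreign address, PID (no state for UDP).
        if len(fields) < 4 or fields[0].upper() != "UDP":
            continue
        local, pid = fields[1], fields[-1]
        addr, _, local_port = local.rpartition(":")
        if local_port == str(port) and pid.isdigit():
            owners[int(pid)].append(addr)
    return dict(owners)


print(udp_listeners(SAMPLE))
```

More than one PID in the result means more than one process has a socket on the mDNS port, which is the situation described above.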

emontnemery commented 3 years ago

@MarkHofmann11 I thought you were running in a supported VM, not native Windows? Running HA natively in Windows is not supported.

Edit: You previously described your setup as:

VMware ESXi 5.5 and Win10 VM

Does this mean the host is VMware ESXi 5.5, in which you run a Windows 10 VM, and you then run HA in that Windows VM?

MarkHofmann11 commented 3 years ago

Yes, the physical host is a VMware ESXi 5.5 server. One of the VMs running is a Windows 10 VM where I run HA.

I tried experimenting with setting "default_interface = true" for zeroconf:, but that made it only listen/bind to 0.0.0.0 and none of the other local addresses. I saw this thread concerning zeroconf and Windows: https://github.com/senecajs/seneca-transport/issues/37

What I would like to test is making a modification to the zeroconf library so it uses the loopback 127.0.0.1 instead of 0.0.0.0, to see if that changes anything. I'm not sure what library HA uses for zeroconf, but I'm still investigating.

emontnemery commented 3 years ago

Running HA in Windows is not a supported configuration.

MarkHofmann11 commented 3 years ago

Finally figured this out - solution should be noted somewhere on the zeroconf wiki for HA. This applies to anyone running HA as a VM (VMware workstation - or ESXi). I ended up using an mDNS browser via wireless laptop and wired desktop. Both showed all my mDNS services. My HA VM however didn't show everything.

Solution: Change the vNIC from "VMXnet 3" to "E1000E". The virtual NIC that VMware provides doesn't work 100% with multicast. I confirmed the info here: https://stackoverflow.com/questions/10495543/ping-hostnames-using-avahi-ubuntu-in-vmware-no-resolving and also here: https://communities.vmware.com/t5/ESXi-Discussions/Multicast-with-E1000-vs-VMXNet-3/m-p/2199270/highlight/true

Here is another thread on it: https://communities.vmware.com/t5/VMware-Workstation-Pro/Support-for-Multicast-traffic-from-Guests-on-VMWare-workstation/m-p/1980922

Once I switched vNICs and did an mDNS browse, it showed everything and HA is working properly, too. So for anyone running VMware ESXi or VMware Workstation using the "VMXnet 3" vNIC, you will want to switch to the "E1000E" to get all multicast working 100%.

Just an FYI - I'm running a WIN10 VM and HA runs in a Python venv. I also have appdaemon running on the same Win10 system in a different Python venv.

emontnemery commented 3 years ago

Thanks for the update @MarkHofmann11! If the change of vNIC from "VMXnet 3" to "E1000E" is of importance also for Linux guests, a PR for https://www.home-assistant.io/hassio/installation/ would be really welcome (just click on the "Edit this page on Github" link at the top of the page).

MarkHofmann11 commented 3 years ago

Will do - as this apparently affects all VMs regardless of the OS being used (various Linux, Windows, etc.) if you are using the VMXnet3 vNIC. I can only confirm the issue with ESXi 5.5 using the latest VMware Tools. I will add that info to the docs, so that hopefully others can avoid this oddity.

Fettkeewl commented 3 years ago

Hi, I'm experiencing some similar issues. These types of things are beyond me. Google Cast worked before and I changed nothing. I had all devices set up automatically via the integration, and yesterday I wanted to try out TTS and no devices were reachable. Can someone take a look at my logs? This is a capture of events from me trying to configure devices with the cast integration until it says it failed.

Running Hassio 0.188.5 on Oracle VM inside a Windows 10 PC. I never had issues with discovery, so I'd rule out mDNS not working in its entirety. All my cast devices are found via phones and PCs. I'm totally lost as to what to test.. these logs are straight up mumbo jumbo to me, I can't interpret all the data :p

tailresult.txt

emontnemery commented 3 years ago

@Fettkeewl I'm not sure why you want to rule out MDNS not working, because that's exactly what's going on in the logs..

First of all, try rebooting your WiFi access points; they are sometimes buggy and stop forwarding multicast UDP between wired and wireless interfaces. Please reboot the Windows host also.

Then please make sure VirtualBox networking is correctly configured according to the documentation (bridge mode, not NAT).

Fettkeewl commented 3 years ago

Oh alright, I'll take your word for it. I just thought that since it had worked previously (it found my Cast devices, the new Tasmota beta integration, etc.) and I made no changes to my system, something else was amiss. I've done that several times, including rebooting my PC and my Chromecast devices. I've even specifically allowed UDP port 5353 on my host firewall. Do I need to open that port on my router as well?

Virtualbox network mode is set to bridge, everything else is working as it should except finding those darned cast devices :) I can ping my HA server without issues.

Fettkeewl commented 3 years ago

Some additional info if it helps. I am able to ping my HA server from the machine hosting the VM itself using "Ping hassio"

and from the HA frontend, using the SSH terminal, I can ping my router and my Home Mini using their names. Isn't this an indication that mDNS is working?

emontnemery commented 3 years ago

For the HA guest to be able to discover cast devices through MDNS, HA must be able to send multicast UDP packets to the cast devices. The cast devices must then be able to send replies, again as multicast UDP packets, back to the HA guest.

A common issue is that WiFi APs don't forward multicast UDP between the wired and wireless interfaces. Look for any setting in the AP related to multicast UDP, MDNS, Bonjour, Avahi or UPNP. If you have the possibility to connect the host to the LAN through WiFi instead of cable, give that a try.

To narrow down the source of the problem, you can run wireshark on the same LAN (192.168.1.x in your case), both when connected with cable and WiFi to check if you see the queries from HA as well as replies from the cast devices.

Fettkeewl commented 3 years ago

Alright, will give it a go tonight, I hope. The host is WiFi-connected to the AP, and "wired" to the guest. The cast devices are all WiFi-connected to the same AP as the host.

I know for a fact that UPnP is disabled. I did so because I read about it being a vulnerability. I believe I tried enabling it just to see if it would help, and no dice. I can try again.

MarkHofmann11 commented 3 years ago

@Fettkeewl - One thing that can help with troubleshooting is to download the Bonjour Browser here: https://hobbyistsoftware.com/bonjourbrowser

Run that inside your VirtualBox VM and on a physical PC (one wired and one wireless). You will likely see different results in your VirtualBox VM (fewer discovered items).

If VirtualBox is similar to VMware ESXi, you will want to ensure you are using one of the following vNICs for your VM (and not the last one):

- PCnet-PCI II (Am79C970A)
- PCnet-Fast III (Am79C973)
- Intel PRO/1000 MT Desktop (82540EM)
- Intel PRO/1000 T Server (82543GC)
- Intel PRO/1000 MT Server (82545EM)
- Paravirtualized network adapter (virtio-net) - Not this one

Also, ensure it is set up for Bridge mode to your physical NIC and not NAT.

In my case, it was VMware's vNIC causing the discovery to not work properly. I changed it to the "E1000E" driver and it works fine now.

Fettkeewl commented 3 years ago

Run that inside your VirtualBox VM

Can't / don't know how to do that, I've got Hassio on my VM! No GUI, just a CLI that shows when the server boots. My network has been set according to the image since day one :) had a good guide. (image: vm network)

I'm more worried about router settings being the culprit. Will test Wireshark when I get home. I did try disabling the Windows Defender firewall; it did not help, so it's doubtful that it's the cause. UPnP is enabled, didn't do squat :) "Multicast rate" is the only thing I can think of in my router that has anything to do with mDNS. I could not find any settings regarding forwarding of UDP packets from wireless to wired.

Fettkeewl commented 3 years ago

Side note: Bonjour Browser shows an internal_url of 192.168.1.100:8123

HA isn't reachable through it because of https, and I don't have an active "internal_url" in my HA config..

From host pc:

And my Wireshark capture; I think the red is when I pressed to scan for cast devices in HA.

Ugh all the mumbojumbo 📦 ...! From what I can see and interpret, Hassio is sending mDNS queries but not getting any response to them. But! A while after, my Nest Hub (192.168.1.48) is sending data over UDP to my Hassio... So some type of communication is occurring.

The info being sent according to wireshark

HTTP/1.1 200 OK
CACHE-CONTROL: max-age=1800 
DATE: Thu, 10 Dec 2020 18:12:07 GMT 
EXT: 
LOCATION: http://192.168.1.48:8008/ssdp/device-desc.xml 
OPT: "http://schemas.upnp.org/upnp/1/0/"; ns=01 
01-NLS: 88ceca90-3af8-11eb-97b5-be9ca6601ada 
SERVER: Linux/4.9.135, UPnP/1.0, Portable SDK for UPnP devices/1.6.18 
X-User-Agent: redsonic 
ST: upnp:rootdevice 
USN: uuid:450459cc-404e-f014-4bd3-ca6c58ed18ca::upnp:rootdevice 
BOOTID.UPNP.ORG: 14 
CONFIGID.UPNP.ORG: 4

And this is what's at the "LOCATION" above.. (unformatted copy paste below image)

<root xmlns="urn:schemas-upnp-org:device-1-0">
<specVersion>
<major>1</major>
<minor>0</minor>
</specVersion>
<URLBase>http://192.168.1.48:8008</URLBase>
<device>
<deviceType>urn:dial-multiscreen-org:device:dial:1</deviceType>
<friendlyName>Hubben</friendlyName>
<manufacturer>Google Inc.</manufacturer>
<modelName>Google Nest Hub</modelName>
<UDN>uuid:450459cc-404e-f014-4bd3-ca6c58ed18ca</UDN>
<iconList>
<icon>
<mimetype>image/png</mimetype>
<width>98</width>
<height>55</height>
<depth>32</depth>
<url>/setup/icon.png</url>
</icon>
</iconList>
<serviceList>
<service>
<serviceType>urn:dial-multiscreen-org:service:dial:1</serviceType>
<serviceId>urn:dial-multiscreen-org:serviceId:dial</serviceId>
<controlURL>/ssdp/notfound</controlURL>
<eventSubURL>/ssdp/notfound</eventSubURL>
<SCPDURL>/ssdp/notfound</SCPDURL>
</service>
</serviceList>
</device>
</root>
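For anyone who wants to pull the useful fields out of such a description document, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The sample is a trimmed copy of the XML above, and `summarize_device` is a hypothetical helper name. Fetching the document from the LOCATION URL is left out.

```python
import xml.etree.ElementTree as ET

# Default namespace used by the device description above.
NS = {"upnp": "urn:schemas-upnp-org:device-1-0"}

# Trimmed copy of the description shown above.
SAMPLE = """<root xmlns="urn:schemas-upnp-org:device-1-0">
<device>
<friendlyName>Hubben</friendlyName>
<manufacturer>Google Inc.</manufacturer>
<modelName>Google Nest Hub</modelName>
<UDN>uuid:450459cc-404e-f014-4bd3-ca6c58ed18ca</UDN>
</device>
</root>"""


def summarize_device(xml_text: str) -> dict:
    """Return the friendly name, model name and bare UUID from the XML."""
    device = ET.fromstring(xml_text).find("upnp:device", NS)
    if device is None:
        raise ValueError("no <device> element found")

    def text(tag: str) -> str:
        return device.findtext(f"upnp:{tag}", default="", namespaces=NS)

    udn = text("UDN")
    # The UDN is written as "uuid:<value>"; strip the prefix if present.
    uuid = udn[len("uuid:"):] if udn.startswith("uuid:") else udn
    return {"name": text("friendlyName"), "model": text("modelName"), "uuid": uuid}


print(summarize_device(SAMPLE))
```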

emontnemery commented 3 years ago

What you circled in red is indeed HA trying to locate cast devices by sending MDNS queries, but there are no answers visible to wireshark. The remaining stuff is unrelated.

On what device did you do the wireshark capture, is it on the host? If it's on the host, could you try doing the same capture on another device connected to same AP, for example another laptop?

You mentioned earlier you had some setting related to UPNP, was that on the AP? Also, try disabling the "Enable IGMP snooping" option.

Fettkeewl commented 3 years ago

I did the captures in my previous post on my host.

The image below is from my laptop, which has recently been restored to factory settings. It does not seem to pick up the mDNS queries from HA.
192.168.1.4 is my host PC
192.168.1.100 is the VM guest (Hassio)
192.168.1.48 is the Nest Hub

IGMP snooping disabled. UPnP enabled on the AP (was disabled).

emontnemery commented 3 years ago

If you start the Google Home app on an Android phone connected to the same AP, the phone will attempt the same discovery as HA does. Do you see the MDNS questions from the phone + answers from casts in wireshark both on the laptop and on the VM host?

Fettkeewl commented 3 years ago

Nothing on the VM host, other than the previous mDNS chatter from Hassio. I would have expected similar queries from the phone on 192.168.1.12, but alas nothing. Will try the laptop tomorrow. Beginning to think this might be a Windows 10 issue.

Fettkeewl commented 3 years ago

SUCCESS!!!!!!!!!!!!!!!! 👍 Edit: not.. Spoke too soon. UDP comms fell off again. Might be when I start Oracle VM.

I looked at Win 10 as the culprit. What I did this morning was run the command below, which I'd say disables multicast on Windows 10, so I guess Windows 10 multicast was blocking the UDP traffic or ignoring it. I didn't even have to install Bonjour as some searches mentioned :) I wonder how this is the culprit, but ah well, as long as it works I'm satisfied. Wireshark traffic went from 0->100 after reboot.

All devices are reachable in HA now :) Thank you for helping me narrow this down @emontnemery, I would not have been able to conclude this on my own, at least not this fast 😛

REG ADD "HKLM\Software\Policies\Microsoft\Windows NT\DNSClient" /V "EnableMulticast" /D "0" /T REG_DWORD /F

emontnemery commented 3 years ago

@fettkeewl so is it working or not now?

It's a little bit surprising that Windows interferes with the networking of the guest in bridged mode. Just to be clear, you mention "Oracle VM", this is a Virtualbox, correct?

Would you mind trying to spin up an HA VM in Hyper-V to verify whether you see the same problem there?

Fettkeewl commented 3 years ago

Nah it's not.. here's why:

- When I boot my Windows 10 PC and start Wireshark to listen for UDP, I get a lot of traffic until the second to last point seen in the image. (Notice this is when Oracle isn't even running.)
- It seems to always stop on that point: source fe80::8e6a::.... to dest ff02::fb.
- By the time it hits, I have not even started VirtualBox and my Hassio.
- When Hassio is up and running, the UDP traffic has stopped and I revert to the previous image, only seeing UDP data from 192.168.1.4 and .100.

How I actually got it to recognize my Cast devices:

- Hassio running on the host
- Rebooted my laptop and quickly opened Wireshark (saw the same traffic as in the image below). THEN and only THEN did Hassio recognize my cast devices. But they quickly became unreachable.

On my host PC I have disabled all the autostart programs I had running: Allway Sync, Plex, uTorrent, Unified Remote and probably something else. Still the same behaviour.

My thought was that since it works in the beginning and then stops, maybe some application being launched blocks it. I'm at work now so I can't verify whether this is happening on my laptop as well, i.e. that it starts receiving UDP data and then stops.

Oh and yes, Oracle VirtualBox (latest version to this date).

I'll try it this weekend. I've never used that software, so I'd say it's a learning curve as well.

Fettkeewl commented 3 years ago

Rebooting the PC and starting Bonjour Browser on logon, I capture every device. Starting Wireshark, I see the devices talk... then they drop off.

Fettkeewl commented 3 years ago

Well.. I'm at a loss for words..

Disabling IPv6 (which I do not use anyway, to my knowledge) seems to have unplugged whatever was blocking UDP traffic. It's like being in the Matrix now! Whether it's a sustainable solution I do not know, but it sure is working and not stopping. Also, all devices are still reachable. I'm one device down, I can't see my Sony TV, but I think turning it on will resolve it.

Edit: Reading here https://en.wikipedia.org/wiki/Multicast_DNS

An mDNS message is a multicast UDP packet sent using the following addressing:

IPv4 address 224.0.0.251 or IPv6 address ff02::fb

And knowing that my UDP packets stop showing up after I saw fe80... doing a multicast to ff02::fb (in the previous image), this seems like a proper solution..

https://medium.com/@JockDaRock/disabling-ipv6-on-network-adapter-windows-10-5fad010bca75

emontnemery commented 3 years ago

OK, great that it works for you but really not good if it's acting random..

You have turned quite a lot of knobs now, with changes to router settings, Windows DNS multicast setting and finally IPv6. If you revert all settings except keeping IPv6 disabled, does it still work?

I'm also still interested to know if it works with Hyper-V or if you get the same symptoms there? Hyper-V is built-in to Windows 10 and simple to setup. Here's a link to my virtual switch configuration which is working fine: https://github.com/home-assistant/core/issues/43652#issuecomment-734882407

Fettkeewl commented 3 years ago

I'll try to revert everything done except the IPv6 change and try hyper-V if my wife can spare me the time this weekend :P

Joikast commented 3 years ago

Hi, it's me again.

I just noticed that my google home was no longer detected by Home Assistant. I tried to remove the Cast integration and added it again, but it fails to discover my Google home mini.

I have not done any other changes to my network. The Google home mini and the HA installation are on the same subnet and can ping each other.

I have tried both vNIC drivers: E1000 and VMXNet3, none of them seems to be working.

I cannot tell exactly when this stopped working, but today I updated my installation so I am now running:

System Health

version	core-2021.2.3
installation_type	Home Assistant OS
dev	false
hassio	true
docker	true
virtualenv	false
python_version	3.8.7
os_name	Linux
os_version	5.4.94
arch	x86_64
timezone	Europe/Stockholm

I don't know what version it was that was the latest working one, but below is the version where I had issues. When I upgraded it, it started to work. So some version after the below version was working for me.

Host OS: vmware ESXi 6.5
Home Assistant deployed through .ova
IP HA: 192.168.137.103
IP Google home mini: 192.168.137.60

System Health

Home Assistant Core Integration
version: 0.118.3
installation_type: Home Assistant OS
dev: false
hassio: true
docker: true
virtualenv: false
python_version: 3.8.6
os_name: Linux
os_version: 5.4.77
arch: x86_64
timezone: Europe/Stockholm

Hass.io
host_os: HassOS 4.17
update_channel: stable
supervisor_version: 2020.11.0
docker_version: 19.03.12
disk_total: 43.6 GB
disk_used: 7.5 GB
healthy: true
supported: true
board: ova
supervisor_api: ok
version_api: ok

Any suggestions?

Regards

Fettkeewl commented 3 years ago

I haven't had any time to try anything with this. I've narrowed down my culprit to non-working mDNS. After every reboot of my PC I must toggle IPv6 on my WiFi NIC, then mDNS starts working again. Strange issue indeed.

Running hassio on oracle vm on a windows 10 pc.

Joikast commented 3 years ago

I can kind of confirm that the issue is related to Vmware.

I Created a new server in Virtualbox running on my Windows machine, using the following Versions:

It directly discovered the google cast, as well as my Sony TV through HomeKit:

So, something with vmware and the mDNS must be broken. This also rules out my wifi, since the TV and the google home are connected to different access points (however on the same subnet).

I have tried changing the NIC driver on the VM (E1000 vs. VMXNET3) and enabling/disabling promiscuous mode on the networking in VMware, but all without success.

It seems also to be working in some HA versions, because I had this working for some months until recently.

Any suggestions?

MarkHofmann11 commented 3 years ago

I'm running HA on an ESXi VM running Windows. Based on what you mentioned, did you try to disable the Windows mDNS, which is likely intercepting the mDNS traffic and not allowing it through to your VMware Workstation setup? There is information on a few ways to do it here: https://www.blackhillsinfosec.com/how-to-disable-llmnr-why-you-want-to/

You can disable it via GPO or registry.

Joikast commented 3 years ago

Thank you, but this is my non-working setup: https://github.com/home-assistant/core/issues/43652#issuecomment-734696424

So, no Windows involved.

MarkHofmann11 commented 3 years ago

Correct me if I didn't read this correctly, but you mentioned the physical computer itself is running Windows. If that is the case, it could be not allowing mDNS to be passed through to the VMware workstation setup without the registry or GPO changes to disable Microsoft's mDNS.

I disabled it on my VM just to be safe, even though in my case it seemed to be more related to the vNIC being used for ESXi. I had issues before too, but I changed the vNIC to E1000 (under ESXi) and also did the registry change in Windows to disable mDNS. There is also an option in ESXi >6.5 where you can enable multicast on the virtual distributed switch. I did that as well just to be safe.

Joikast commented 3 years ago

Sorry, my post might be a bit confusing.

On my main HA installation, where I have my issues with google cast, I am running on ESXi 6.5 in a linux environment.

The windows environment was just a test that I did with virtualbox, and that worked directly with google cast and mDNS.

emontnemery commented 3 years ago

Since it seems impossible to have a reliable mDNS setup in some cases, I think the only solution is to add the possibility to specify known IP-addresses of chromecasts.

This PR of pychromecast https://github.com/home-assistant-libs/pychromecast/pull/469 together with this hack, https://github.com/emontnemery/home-assistant/tree/cast_known_hosts_tmp, makes the following possible:

- Maintain a list of IP-addresses which will be repeatedly scanned.
  - Hardcoded IP-addresses can be added
  - IP-addresses of casts discovered through mDNS will automatically be added to the list of known casts
- mDNS and known-host based discovery will work in parallel
- If devices are down when HA starts, they will not block startup and will appear once reachable
- Audio groups will also be discovered

This obviously means that it's necessary to assign static IPs to the cast devices

Sample configuration:

cast:
  known_hosts:
    - 192.168.0.10
    - 192.168.0.11
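To illustrate the idea of combining hardcoded and mDNS-discovered addresses into one list that is scanned repeatedly, here is a small Python sketch. It is not the pychromecast implementation from the PR. The helper names are invented, and the reachability check simply tries a TCP connection to port 8009, the port cast devices use for control connections.

```python
import socket

CAST_PORT = 8009  # TCP port used for cast control connections


def is_reachable(host: str, port: int = CAST_PORT, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def merge_hosts(known_hosts, mdns_hosts) -> list:
    """Combine hardcoded and mDNS-discovered hosts, keeping order, dropping duplicates."""
    return list(dict.fromkeys([*known_hosts, *mdns_hosts]))


def scan(hosts, probe=is_reachable) -> dict:
    """Probe every host once; unreachable hosts are reported, not treated as errors."""
    return {host: probe(host) for host in hosts}
```

Calling `scan(merge_hosts(...))` on a timer would let devices that were down at startup show up later without blocking anything, which is the behaviour described in the list above.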

Diff of the HA changes: https://github.com/home-assistant/core/compare/dev...emontnemery:cast_known_hosts_tmp

Would someone be able to give this a try?

Joikast commented 3 years ago

I got tired of this, so I tried another thing. I did a passthrough of one of the NICs in my physical server, and added it directly to the HA VM as a pci adapter.

I then configured it as a normal access port on my switch in the other end.

It worked directly after doing this. So I think I have really narrowed it down to the vSwitch or the virtual NICs in VMware. I am running ESXi 6.5, by the way.

Don't know if this helps anyone, since the root cause is still not solved. However, I can have this as a permanent fix.

MarkHofmann11 commented 3 years ago

You can only do this on ESXi 6.5 and greater, but did you try this: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.networking.doc/GUID-50AF72D4-F766-4D68-8330-BA15250E4E99.html

Enabling Multicast filtering mode: IGMP/MLD snooping on the vDS. I did this on my setup, which also seemed to help stability.

@emontnemery - I do like having the ability to use static setups for the cast devices if that isn't too much trouble.

One other note that I ran into recently. I am using the integration where it auto discovers the cast devices, but I also still had this in my configuration.yaml file:

#cast:
#  media_player:
## Master Bedroom Mini
#    - uuid: "45a49e87-576d-05f6-a607-5b298df0beb3"
## Mark's Office Hub
#    - uuid: "4b38c2cd-2875-83c7-9276-5020d9f885ba"
## Family Room Mini
#    - uuid: "7176096b-3677-602b-9511-09cab03b50f6"
## Alex's Room Mini
#    - uuid: "90c4999a-6d4f-c44d-01dd-6c486f1f86f3"
## Foyer Mini
#    - uuid: "74c3b4c2-1831-492a-3178-662bd79e628f"
## Family Room Mini
#    - uuid: "7176096b-3677-602b-9511-09cab03b50f6"
## Basement Mini
#    - uuid: "98bd955b-7357-36ad-3ed4-a8c217e8d5f8"
## Shield-Alex-Room-TV
#    - uuid: "50c04653-7aca-be14-a8c6-f5ea1e089728"
## Shield-Living-Room-TV
#    - uuid: "64821d72-aea7-0bac-88ae-6ed23c43e30d"
## Shield-Family-Room-TV
#    - uuid: "13416b3c-72d4-94ee-f7a4-875c0a77f0bb"
## Shield-Bedroom-TV
#    - uuid: "283e8a7a-c0bf-48db-a7f7-5965835f7b9a"

This was causing issues with the Lovelace UI locking up - only while casting. The entire Lovelace interface would hang - but it only happened while casting was active. I figured it was due to being in two places (integration and configuration.yaml), so I removed it from the configuration.yaml and have not seen the issue since.

I'd be willing to test, but I'm not sure how to pick up your modifications. If there is a ZIP archive I can extract and run as a custom component for testing, I'd be happy to do it.

emontnemery commented 3 years ago

I do like having the ability to use static setups for the cast devices if that isn't too much trouble

Sure, it should be possible to do something like:

cast:
  known_hosts:
    - 192.168.0.10
    - 192.168.0.11
  media_player:
    - uuid: "45a49e87-576d-05f6-a607-5b298df0beb3"
    - uuid: "4b38c2cd-2875-83c7-9276-5020d9f885ba"

The reason why the known_hosts don't point out individual media_player entities is that one host can have the device itself + audio groups, which are passed around between the devices depending on which device is online.

The entire Lovelace interface would hang - but it only happened while casting was active. I figured it was due to being in two places (integration and configuration.yaml), so I removed it from the configuration.yaml and have not seen the issue since.

The purpose of uuid is to make it possible to declare a list of allowed cast devices even when the auto-discovered integration is there. I'll do some testing on my side to try to reproduce it. When you say it only happened while casting was active, does that mean it would lock up whenever one of the devices was active, or only when you initiated the casting from Home Assistant?

MarkHofmann11 commented 3 years ago

If there is a possibility to use the "known_hosts" - that would solve the issues that some are having (and I used to have) where their particular setup wasn't working properly with mDNS. I only added the uuid stuff as a work-around originally when I was having mDNS issues, too. Since that is no longer the case, I recently commented the uuid information out.

Would someone need to use both known_hosts and uuid if they didn't have a working mDNS setup or could you pick one or the other? Based on what you mentioned, it sounds like both would be needed if you don't have a working mDNS setup for whatever reason.

To answer your question - yes, the Lovelace hanging in random spots would only happen while one of my Google Minis was actively casting. I have an automation that plays a local whitenoise MP3 file at night, and while that was playing via a local URL from HA, Lovelace was hanging and freezing.

I got around that by commenting out the UUID information from configuration.yaml, since it really isn't needed in my setup anymore now that mDNS is working.

N5A commented 3 years ago

And... it went south again. I'll have to look at it this coming week to see what Wireshark can see on the Win10 host's untagged and tagged network links.
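For anyone who wants a quick check before setting up a capture, here is a minimal Python sketch that sends a single mDNS PTR query for _googlecast._tcp.local and lists which addresses answer. It is only a diagnostic idea, not something taken from the integration: the function names build_ptr_query and probe are made up for this example, and it assumes the cast devices reply by unicast to a query sent from an ephemeral port, which may not hold on every network. On a multi-homed host the OS routing table decides which interface the query leaves on, so it may need to be run from the VM itself rather than the Win10 host.

```python
import socket
import struct
import time

MDNS_GROUP = "224.0.0.251"   # standard IPv4 mDNS multicast address
MDNS_PORT = 5353             # standard mDNS port
CAST_SERVICE = "_googlecast._tcp.local"


def build_ptr_query(service: str = CAST_SERVICE) -> bytes:
    """Build a minimal DNS question: one PTR query for `service`."""
    # Header: id=0, flags=0, 1 question, 0 answer/authority/additional records.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in service.strip(".").split(".")
    ) + b"\x00"
    # QTYPE 12 = PTR, QCLASS 1 = IN.
    return header + qname + struct.pack("!2H", 12, 1)


def probe(timeout: float = 3.0) -> set:
    """Send one query and collect the source IPs of any replies (hypothetical helper)."""
    responders = set()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(0.5)
        sock.sendto(build_ptr_query(), (MDNS_GROUP, MDNS_PORT))
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            try:
                _data, (addr, _port) = sock.recvfrom(9000)
            except socket.timeout:
                continue
            responders.add(addr)
    return responders


if __name__ == "__main__":
    found = probe()
    print("Responders:", sorted(found) if found else "none")
```

If this prints "none" on the HA side but lists the casts from a machine on the same VLAN, that would point at multicast not reaching the VM rather than at the integration. An empty result is not conclusive on its own, given the unicast-reply assumption above.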

Casts yet again unavailable. Can Google stop messing up your integration?

N5A commented 3 years ago

I read back through the posts since my last... IPv6 indeed seems to get in the way. I disabled it on all of the Win10 PC's network connections: Wi-Fi, virtual LAN cards, the real LAN card, and the VBox virtual adapters. The casts immediately returned to operation.

emontnemery commented 3 years ago

Would someone need to use both known_hosts and uuid if they didn't have a working mDNS setup or could you pick one or the other? Based on what you mentioned, it sounds like both would be needed if you don't have a working mDNS setup for whatever reason.

No, it's not necessary to use both; the options are completely independent:

known_hosts defines a list of hosts / IP-addresses which will be polled for cast devices + audio groups
uuid allows you to define an optional list of allowed cast devices or audio groups, identified by UUID. Casts discovered through mDNS or the known_hosts polling will only be added if they're listed. If there is no UUID list, all casts found will be added.
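To illustrate how the two options combine, here is a small Python sketch of the filtering rule described above. It is not the integration's actual code: DiscoveredCast and filter_casts are hypothetical names invented for this example, and the source field is only there to show that the origin of a discovery does not affect the filter.

```python
from typing import Iterable, List, NamedTuple, Optional


class DiscoveredCast(NamedTuple):
    """Hypothetical record for a cast device or audio group that was found."""
    uuid: str
    name: str
    source: str  # "mdns" or "known_hosts"; illustrative only


def filter_casts(
    found: Iterable[DiscoveredCast],
    allowed_uuids: Optional[Iterable[str]] = None,
) -> List[DiscoveredCast]:
    """Return the casts that would be added under the rule described above."""
    found = list(found)
    # No UUID list configured (missing or empty): everything found is added.
    if not allowed_uuids:
        return found
    allowed = set(allowed_uuids)
    # With a UUID list, only listed casts are added, whether they were
    # discovered through mDNS or through known_hosts polling.
    return [cast for cast in found if cast.uuid in allowed]


if __name__ == "__main__":
    casts = [
        DiscoveredCast("45a49e87-576d-05f6-a607-5b298df0beb3", "Mini", "mdns"),
        DiscoveredCast("4b38c2cd-2875-83c7-9276-5020d9f885ba", "Hub", "known_hosts"),
    ]
    for cast in filter_casts(casts, ["4b38c2cd-2875-83c7-9276-5020d9f885ba"]):
        print(cast.name, cast.source)
```

So with a broken mDNS setup, known_hosts alone should be enough to find the devices, and uuid only matters if you want to restrict which of the found casts get added.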

To answer your question - yes, the Lovelace hanging in random spots would only happen while one of my Google Minis was actively casting. I have an automation that plays a local whitenoise MP3 file at night, and while that was playing via a local URL from HA, Lovelace was hanging and freezing.

Unfortunately I can't reproduce the problem, would you mind trying it again?

MarkHofmann11 commented 3 years ago

That is good news - I would rather just define them as known_hosts via static IPs. I use DHCP reservations for all my smart home devices, so that would be easy.

This is the automation that, while running (with the static UUID parameters defined plus the integration set up), causes Lovelace to misbehave. All I did to fix it was comment out the UUID info, which I really didn't need since mDNS works here now:

- id: Whitenoise_Alex_turn_on_at_night_930pm
  alias: Whitenoise Alex turn on at night 9:30pm
  initial_state: 'on'
  trigger:
    platform: time
    at: '21:30:00'
  mode: single
  action:
  - service: media_player.volume_set
    data_template:
      entity_id:
      - media_player.alexs_room_mini
      volume_level: '0.01'
  - service: media_player.turn_off
    data_template:
      entity_id:
      - media_player.alexs_room_mini
  - delay:
      seconds: 1
  - service: media_player.turn_on
    data_template:
      entity_id:
      - media_player.alexs_room_mini
  - delay:
      seconds: 1
  - service: media_player.play_media
    data:
      entity_id:
      - media_player.alexs_room_mini
    data_template:
      media_content_id: http://192.168.0.14:8123/local/White_Noise_10hrs.mp3
      media_content_type: music
  - service: media_player.volume_set
    data_template:
      entity_id:
      - media_player.alexs_room_mini
      volume_level: '0.40'

Now I do have a template switch to show the status of the whitenoise here in configuration.yaml:

  - platform: template
    switches:
      whitenoise_alex_room:
        friendly_name: "Whitenoise - Alex Room"
        value_template: "{{ is_state_attr('media_player.alexs_room_mini', 'media_content_id', 'http://192.168.0.14:8123/local/White_Noise_10hrs.mp3') and is_state('media_player.alexs_room_mini', 'playing') }}"
        turn_on:
          service: input_boolean.turn_on
          data:
            entity_id: input_boolean.whitenoise_alex_room
        turn_off:
          service: input_boolean.turn_off
          data:
            entity_id: input_boolean.whitenoise_alex_room
        icon_template: >-
          {% if is_state_attr('media_player.alexs_room_mini', 'media_content_id', 'http://192.168.0.14:8123/local/White_Noise_10hrs.mp3') and is_state('media_player.alexs_room_mini', 'playing')  %}
            mdi:bell-sleep
          {% else %}
            mdi:bell-sleep
          {% endif %}
