Multicast container causing feedback loops

jesserockz commented 4 years ago

Since I got up this morning, devices on my network have not been able to get an IP address. I noticed on my server that hassio_supervisor and hassio_multicast were created 8 hours ago.

Looking at my router there was a constant TX rate of about 8Mbps to my IoT and Guest vlans coming from my lan vlan.

No device on my network could get a DHCP lease unless I stopped the hassio_multicast container, which also halted the extra traffic between the VLANs.

Of course hassio_supervisor just starts the multicast container back up again and the issue is back.

I have had to tag a local dummy empty docker image with homeassistant/amd64-hassio-multicast:2 to resolve this for now.

pvizeli commented 4 years ago

I can't reproduce that issue. Maybe you also have another mDNS repeater running and end up in a loop?

You can use ha multicast logs to track where the traffic is coming from. Our CoreDNS plugin just queries the multicast address for devices every 2 minutes with a handful of packets.

frenck commented 4 years ago

I've checked this as well and cannot see this in my network (using multiple VLANs and additional repeaters between those as well).

jesserockz commented 4 years ago

From ha multicast logs:

Repeating over and over again

data from=10.5.5.1 size=38
repeating data to enp5s0
repeating data to hassio
data from=10.2.2.1 size=38
repeating data to enp5s0
repeating data to hassio
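
With output scrolling this quickly, a tally of the from= addresses shows at a glance which hosts feed the loop. The following is a minimal sketch that assumes log lines in the format shown above; count_mdns_sources is a hypothetical helper name, not part of the ha CLI.

```shell
# Tally source addresses in mdns-repeater style log lines read from stdin.
# Assumes lines of the form "data from=10.5.5.1 size=38"; other lines are ignored.
# count_mdns_sources is a hypothetical helper, not part of any Home Assistant tool.
count_mdns_sources() {
  awk 'match($0, /from=[0-9.]+/) {
         # Skip the 5-character "from=" prefix and keep only the address.
         seen[substr($0, RSTART + 5, RLENGTH - 5)]++
       }
       END { for (ip in seen) print seen[ip], ip }' | sort -rn
}

# Usage, on a host where the ha CLI is available:
#   ha multicast logs | count_mdns_sources
```

Each output line is a count followed by an address, busiest source first. An address that dominates the list and belongs to a router interface would be consistent with a second repeater feeding packets back.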

jesserockz commented 4 years ago

Looks also like supervisor recreated the container with the original image this morning. I found all of my IoT devices offline because they could not get an IP address lease again.

pvizeli commented 4 years ago

You should check this IP address. This plugin does nothing more than copy the mDNS multicast traffic over to the hassio network. Also, DHCP works over broadcast, not multicast. That is a pretty common process that is used on many firewalls/gateways.

You need to fix your network issue or just run Home Assistant Core. The Supervisor will still try to fix the plugin every time. It could also be a kernel issue on the system on which you run the Supervisor. Give us more details about your host.

jesserockz commented 4 years ago

Those IP addresses are the adapters on the router, 10.5.5.1 for IoT network, 10.2.2.1 for Guest. Should it be repeating to enp5s0 then? Which is my host ethernet port.

Router is Ubiquiti Edgerouter X

Host info:

$ uname -a
Linux feijoa 4.19.0-8-amd64 #1 SMP Debian 4.19.98-1 (2020-01-26) x86_64 GNU/Linux

$ ip r
default via 10.1.1.1 dev enp5s0
10.1.1.0/24 dev enp5s0 proto kernel scope link src 10.1.1.44
10.2.2.0/24 dev enp5s0.2 proto kernel scope link src 10.2.2.44
10.5.5.0/24 dev enp5s0.5 proto kernel scope link src 10.5.5.44
10.100.100.0/24 dev zt0 proto kernel scope link src 10.100.100.44
169.254.0.0/16 dev enp5s0 scope link metric 1000
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
172.18.0.0/16 dev br-1a6738faed7e proto kernel scope link src 172.18.0.1
172.30.32.0/23 dev hassio proto kernel scope link src 172.30.32.1

Let me know if you want any specific details. Thanks for helping.

jesserockz commented 4 years ago

OK, so I just checked my router, and it also has an mDNS repeater turned on for my VLANs. If I turn it off I can start up the multicast container and everything is fine. But this is not a solution: I will now not have mDNS across my VLANs.

I think it has definitely gotten into a loop with this mdns-repeater. As you mentioned, this one should only be repeating the data into the hassio network, but it is clearly sending it back out to the VLAN.

mattburchett commented 4 years ago

Multicast is also getting stuck in a loop against mdns-repeater in my network (separated LAN, IOT network, and guest VLAN), causing issues with my network as a whole.

data from=10.10.1.1 size=75
repeating data to eth0
repeating data to hassio
data from=10.10.1.1 size=90
repeating data to eth0
repeating data to hassio
data from=10.10.1.1 size=75
repeating data to eth0
repeating data to hassio
data from=10.10.1.1 size=75
repeating data to eth0
repeating data to hassio
data from=10.10.1.1 size=75
repeating data to eth0
repeating data to hassio
data from=10.10.1.1 size=75
repeating data to eth0
repeating data to hassio
data from=10.10.1.1 size=75
repeating data to eth0
repeating data to hassio

This docker log is also running my server out of space by how fast it's flowing.

If relevant, mdns-repeater is running on a Ubiquiti EdgeRouter X.

jesserockz commented 4 years ago

So exact same situation and setup as I have. I disabled the repeater on the router for now as we don't have control of which services the supervisor runs.

kernehed commented 4 years ago

I had the same issue as you. I did a dirty downgrade. But now when i'm trying to update HA it gives me errors:

20-04-22 10:42:03 INFO (MainThread) [supervisor.plugins.multicast] Start Multicast plugin
20-04-22 10:42:03 ERROR (SyncWorker_13) [supervisor.docker] Can't create container from hassio_multicast: 404 Client Error: Not Found ("No such image: homeassistant/amd64-hassio-multicast:2")
20-04-22 10:42:03 ERROR (MainThread) [supervisor.plugins.multicast] Can't start Multicast plugi
20-04-22 10:42:19 ERROR (SyncWorker_10) [supervisor.docker.interface] Can't install homeassistant/qemux86-64-homeassistant:0.108.7 -> 500 Server Error: Internal Server Error ("Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)").
20-04-22 10:42:19 WARNING (MainThread) [supervisor.homeassistant] Update Home Assistant image fails

Edit: Managed to fix the error of multicast plugin with a reinstall of supervisor. Now my network is slow again and the plugin eats up my diskspace.

shexbeer commented 4 years ago

Same issue as you guys. I'm deleting this log every second day, but my 32 GB SD card is full again after 24 hours. I'm running a similar network configuration. Very annoying currently. It's also eating up a lot of resources on the machine: avahi-daemon and mdns-repeater are going nuts.

If multicast would be casting not only from wlan0/eth0 to hassio

repeating data to wlan0
repeating data to hassio

but instead did multicast between eth0, wlan0 and hassio, that would be awesome. Then I wouldn't need my own solution and therefore wouldn't have the issues I currently have with the hassio multicast setup. Or let me disable your mdns-repeater ;)

cyr-ius commented 4 years ago

Same issue.

kernehed commented 4 years ago

When I downgrade to multicast_hassio:1 my network works as usual. Is it somehow possible to keep it downgraded without the watchdog going bananas?

After the downgrade I can't update HA or Supervisor.

This is the log. I have Mikrotik router and Unifi APs. Installed in docker on Ubuntu 18.04.

data from=192.168.2.200 size=167
repeating data to ens160
repeating data to hassio
data from=192.168.2.200 size=110
repeating data to ens160
repeating data to hassio

jslove commented 4 years ago

Same issue for me, using unifi gateway and switches with multiple vlans

Edit: I’m not seeing a DHCP issue, my problem is the massive logfiles filling up my drive. I’m getting 500M/hour in the log. I have had to soft link the container’s log to /dev/null.
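
To put a number on the log growth, the size of the container's log file can be read directly. This is a rough sketch that assumes Docker writes the container log to a file; container_log_bytes is a hypothetical helper name, and the LogPath field it relies on is empty when the logging driver does not write to a file (journald, for example).

```shell
# Print the size in bytes of a container's log file.
# Relies on the LogPath field from "docker inspect"; it is empty when the
# logging driver does not write to a file (for example journald).
# container_log_bytes is a hypothetical helper name.
container_log_bytes() {
  log_path=$(docker inspect --format '{{.LogPath}}' "$1") || return 1
  [ -n "$log_path" ] || return 1
  wc -c < "$log_path" | tr -d ' '
}

# Usage (usually needs root to read the log file):
#   container_log_bytes hassio_multicast
```

Running it twice some minutes apart and comparing the two numbers gives the growth rate.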

jesserockz commented 4 years ago

I have been looking into the mdns-repeater that you are using. For configuration you specify 2 interfaces, in my case enp5s0 and hassio. But the server listens on every interface on the host machine, and repeats everything coming in on any interface out to the specified interfaces.

Options maybe:

Allow the users to disable this plugin
Allow configuration of the -b blacklist parameters to allow the users to tell mdns-repeater to ignore the other vlan packets if they are already being repeated to the main interface (below)
Update mdns-repeater to only listen on the specified interfaces
Find/write a new repeater

I turned on my router mdns-repeater and ran the multicast container with a bash entrypoint, then ran:

mdns-repeater -f -b 10.2.2.0/24 -b 10.5.5.0/24 enp5s0 hassio

This successfully repeats the correct packets and there are no loops.

data from=10.1.1.164 size=38
repeating data to hassio
skipping packet from=10.2.2.1
skipping packet from=10.5.5.1
data from=10.1.1.134 size=38
repeating data to hassio
skipping packet from=10.2.2.1
skipping packet from=10.5.5.1
data from=10.1.1.164 size=38
repeating data to hassio
skipping packet from=10.2.2.1
skipping packet from=10.5.5.1

I think the users who have this issue are the ones with vlans set up and have more knowledge about the networks so having the -b 10.2.2.0/24 -b 10.5.5.0/24 as an optional configuration option somewhere could be the quickest solution?
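
If such an option existed, it could boil down to turning a list of subnets into repeated -b flags. Here is a minimal sketch that assumes only the -f and -b flags used in the command above; build_repeater_cmd is a hypothetical helper, and it only prints the command rather than running it.

```shell
# Print an mdns-repeater command line that blacklists the given subnets.
# $1 = host interface, $2 = Docker-side interface, remaining args = subnets to skip.
# Hypothetical helper: it only prints the command, it does not run it.
build_repeater_cmd() {
  host_if=$1
  docker_if=$2
  shift 2
  cmd="mdns-repeater -f"
  for subnet in "$@"; do
    cmd="$cmd -b $subnet"
  done
  printf '%s %s %s\n' "$cmd" "$host_if" "$docker_if"
}

# Usage:
#   build_repeater_cmd enp5s0 hassio 10.2.2.0/24 10.5.5.0/24
```

With no subnets given, the result is the plain two-interface command, so the default behaviour would stay unchanged for single-network setups.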

Thoughts? @pvizeli @frenck

mbrackjr commented 4 years ago

While awaiting a structural solution (which is really needed, as mdns-repeater should NOT be listening on all ports since that results in unwanted mDNS announcements across VLANs), I've added a static route for 8.8.8.8/32 towards my loopback interface. That way the hassio-mcast container forwards mDNS announcements from all hassio host interfaces to the Docker network (which is fine with me) and to the host's loopback (which is wrong, but doesn't hurt too much at the moment).

chisale commented 4 years ago

I'm glad I found this post. I'm also having the same issue. Initially, I started noticing my hard drive writing logs non-stop and experienced Chromecast problems. I also run UniFi equipment with multiple VLANs and mDNS enabled in the controller. At least I found a workaround for the time being. Thanks.

mbrackjr commented 4 years ago

@frenck @pvizeli Did you have a chance to look further into this issue? Jesserockz identified the root cause and a potential solution, so I would be very grateful if this could be structurally resolved. Thanks for your great work.

mkarnebeek commented 4 years ago

My issue was caused by having another reflector (avahi) on the network across the same two interfaces as hassio. Since I had that reflector now properly configured, there was no longer a need for hassio to have two interfaces: I removed hassio's second interface and configured my router to allow traffic from hassio to the other network. The router and my own set up reflector are now all that is needed for hassio to reach the second network (dedicated iot network for poorly secured devices, no internet access for example, and a few hosts, like hassio allowed to access that network).

I no longer have issues, so that's maybe worth something. For me it feels more properly set up this way.

jesserockz commented 4 years ago

That will not work for all cases though. Simple example: my 10.2.2.X network is a guest Wi-Fi. Guests have access to HA to control certain devices, but if I then remove the HA machine from that network they will no longer have access. I just don't think HA should be the architect of my network layout.

mbrackjr commented 4 years ago

Agree with Jesserockz; I absolutely love HA, but the forced push of this multicast repeater without any way to either disable or customize it is a wrong move imho. I'm the first to admit my situation is atypical and actually unwanted, but at the moment I'm forced to have a second network exposed within my HA-running machine.

I understand, and fully respect, that the maintainers can't and don't want to support every single possible scenario HA is run in/on, but every forced functionality should be accompanied by a means to interact with or influence it, even if it's just disabling it (and accepting the corresponding consequences).

I hope @frenck or @pvizeli can share their thoughts soon. I'm not a programmer (network guy by profession), so I don't feel comfortable writing a pull request by myself. However, if someone can point me in the right direction on how to define either a user-facing option in HA or an admin-facing option on the CLI, I'm willing to make an attempt.

frenck commented 4 years ago

@mbrackjr If you like more control, please use Home Assistant Container instead.

mbrackjr commented 4 years ago

@frenck Thanks for responding. I was somehow afraid to get this type of feedback based on similar comments when the manual hassio install method was originally pulled.

I don't want to end up in a discussion of right or wrong; you're one of the primary maintainers of this wonderful project. So if this is the formal statement, I have no option than to 'live with it'.

I do believe however I'm representative of a small percentage of still very happy users of hass.io that is trying to be positively critical in making the product even better. Your own 'about me' seems to me you're self-critical as well, so I sincerely hope you can value my feedback here, rather than dismissing it.

I'm perfectly fine for supervisor to manage my HA stuff and it hasn't given me any problems so far. This multicast-container has, simply because its intended use (get multicast network announcements into HA in support of service-discovery) is not the provided use (repeating all inbound multicast to all other interfaces). The fact it works 'for most', is simply due to the fact that 'most' installations of "Home Assistant Supervised" are using a single network interface. For the outliers/exceptions the solution is not fit-for-purpose and that should not be considered 'good enough' for anyone serious about providing qualitative software, which I believe you are (given your extremely well documented add-ons of the past).

Again, I hope this comes across with its intent; to point out a flaw in the intended use of the provided solution, hoping you're able to help out (either by programming or providing some guidance). Not as negative or critical to your intentions.

Thanks for listening/reading. Manfred

frenck commented 4 years ago

@mbrackjr We are always open for PRs.

I'm sorry that you don't like the response. However, starting with "but the forced push of this multicast repeater without any form of either disabling" is fine, but I'm just saying that if you want manual control, this isn't the right system for you.

Samantha-uk commented 4 years ago

I've been seeing the same. Looking at the various source files, it seems that the underlying mDNS process (Which is https://github.com/kennylevinsen/mdns-repeater/releases/tag/1.11) is started by Home Assistant in 'foreground' mode. " -f runs in foreground for debugging"

I rebuilt the docker image and injected it into my Home Assistant setup. (A torturous process lol) and ...

Logging output is much reduced
Home assistant seems to carry on working without logging any warnings/errors

I was about to make a PR but I see that @pvizeli has made some changes whilst I was 'messing around' lol.

Samantha-uk commented 4 years ago

Thanks @pvizeli for those changes. I pulled them down and 'injected' them into my Home assistant setup and the noisy log is no more! :)

pierre2113 commented 4 years ago

I had this issue about 1-2 weeks ago. On the Raspberry Pi, the armv7-hassio-multicast docker container was consuming most of the CPU, and it caused the avahi-daemon service on my ASUS router to consume 100% CPU; essentially it brought my whole network down. I got quick relief by running "service stop_mdns" to shut down avahi-daemon on the ASUS router. This left my other access points non-functional, but at least I got the internet up and running to do research.

I accidentally solved the problem by updating Raspbian Buster. I have supervised Home Assistant installed: essentially a flavor of Debian, Raspbian (Buster), as the operating system on the Raspberry Pi 3B+, with hassio running on top. I have the same Home Assistant benefits as with HassOS, but I can still log in to the underlying operating system and install other utilities.

All I had to do was run:

apt-get update
apt-get upgrade

then reboot the Raspberry Pi 3B+.

I had neglected updating Raspbian Buster since I installed it, but I've been upgrading Home Assistant all the way to 0.114. My guess is that the latest Home Assistant somehow required the latest versions of packages in Raspbian as well.

I still had some HA freeze issues, that turned out to be HACS Samsung TV custom integration, I just uninstalled it, HA no longer just froze. I've been running 1 week now with 0 issues, actually cpu usage on the raspberry pi is down. I must have had other minor issues consuming cpu on the raspberry pi that were also related to not upgrading Raspbian Buster, but everything still worked so I didn't notice too much.

fervox commented 4 years ago

I've been struggling for a few months with hassio-multicast causing 100% CPU usage with mdns-repeater, running hassio-supervisor under Raspberry Pi OS. This also takes down my network (UniFi), which is incredibly frustrating. If the whole network doesn't get taken out, DNS resolution slows to a crawl as pihole (running on the same Pi, but not under hassio-supervisor) can't resolve addresses due to CPU resource constraints.

My temporary fix was to find the container and image and remove them:

sudo docker image ls | grep multicast -- gives image ID
sudo docker rmi -f [image ID] -- this gives an error with the container ID that you need to remove
sudo docker rm -f [container ID]
sudo docker rmi -f [image ID]

I'm sure there's a better way of doing that but it fixes it for a day or so. It might be helpful to someone with the same problem.

Unfortunately, hassio-supervisor works out that hassio-multicast container is offline and reanimates it -- within minutes if you only kill the container but it takes hours - days to recover if you remove the image. I think it must redownload the image after it was force removed. Whilst hassio-multicast is not available hassio seems to work fine, so I'd really like to know how I can neuter it for good. I can't go on manually killing this container indefinitely whenever my local network goes down.

I recently tried another approach which seemed promising: sudo docker pull homeassistant/armv7-hassio-multicast:1 -- this seems to pull down an old version which doesn't work

Hassio supervisor logs show the following after doing this:

20-09-19 02:26:18 ERROR (MainThread) [supervisor.misc.tasks] Watchdog Multicast reanimation failed!
20-09-19 02:27:18 WARNING (MainThread) [supervisor.misc.tasks] Watchdog found a problem with Multicast plugin!
20-09-19 02:27:18 INFO (MainThread) [supervisor.plugins.multicast] Start Multicast plugin
20-09-19 02:27:18 ERROR (SyncWorker_2) [supervisor.docker] Image homeassistant/armv7-hassio-multicast not exists for hassio_multicast
20-09-19 02:27:18 ERROR (MainThread) [supervisor.plugins.multicast] Can't start Multicast plugin
20-09-19 02:27:18 ERROR (MainThread) [supervisor.misc.tasks] Watchdog Multicast reanimation failed!

20-09-19 02:28:18 WARNING (MainThread) [supervisor.misc.tasks] Watchdog found a problem with Multicast plugin! [repeats every minute]

Unfortunately, however, my network was down this morning and the very persistent supervisor seems to have re-downloaded the newer version which was consuming 100% CPU. Very frustrating.

Does anyone have any other ideas how to workaround hassio-multicast reanimation? I've tried to find a way to block it downloading any version but without success. Perhaps I need a script that automatically kills the container and removes the image every minute?

My impression from this thread is that there must be a large number of users with similar RPi / network configs. Also, I saw that the supervised install on Raspberry Pi OS will get ongoing support due to an unexpected number of users. Wouldn't it be worthwhile working out a proper fix for this issue? I don't have enough experience with Docker to work out how to get Home Assistant Core installed in Docker (without supervisor) and then set up the 6 add-ons I have running on my network.

pierre2113 commented 4 years ago

Just out of curiosity, what version of Docker do you have installed on Raspberry Pi OS? When I ran the upgrade command, it also upgraded my docker-ce and docker-ce-cli to version 5:19.03.12~3-0~raspbian-buster. I'm starting to wonder if this could be a Docker bug; just a hunch, no proof. Docker does a lot of network routing on containers.

mbrackjr commented 4 years ago

For those running multiple interfaces and another device (usually your central router) running an mDNS repeater of its own, the hassio multicast container wrongfully forwards any mDNS announcements received on any non-standard interface to the primary interface (in my case inbound mDNS on vlan90 gets repeated outbound on interface enp1s0, which causes an mDNS announcement loop as my central router repeats mDNS between those subnets as well).

I've worked around this bug by blackholing 8.8.8.8/32 into a loopback-interface, so the multicast container only forwards (repeats) mdns announcements from outside into the docker-environment (where you'd want this for things like discovery), and this bogus loopback interface. Again, this is a workaround, so your mileage may vary. If you rely on Google DNS as your primary DNS, you might have to change your DNS resolver config, as this will obviously conflict then.

I'm running my hassio deployment on a straightforward Ubuntu Server install, so i added below to my "/etc/netplan/01-netcfg.yaml" to get this static route in place. For other Linux distributions, you might have to apply a different method, but the logic is the same.

network:
  version: 2
  renderer: networkd
  ethernets:
    lo:
      match:
        name: lo
      addresses: [127.0.0.2/32]
      routes:
        - to: 8.8.8.8/32
          via: 127.0.0.2
    enp1s0:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.0.xxx/24]
      gateway4: 192.168.0.xxx
      nameservers:
        addresses: [192.168.0.xxx]
  vlans:
    vlan90:
      id: 90
      link: enp1s0
      dhcp4: no
      addresses: [192.168.90.xxx/24]

jesserockz commented 4 years ago

I am actually wondering why this plugin is required in the first place when the homeassistant container attaches to the host network directly, therefore giving it access to the mDNS packets anyway.

pierre2113 commented 4 years ago

After I ran package upgrades on my Raspbian, I was no longer getting network-down issues, but my HA does freeze about once a week with zeroconf errors.

After researching the second (zeroconf) error, I found some forums suggesting I add this config to my configuration.yaml. It's too early to say it has fixed it; I have to wait another 2 weeks to be sure.

zeroconf:
  default_interface: true

The explanation of what it does is below. Could the mDNS issue also have something to do with this setting? So far zeroconf still works with no noticeable side effects, but the jury is still out on my weekly HA freezes.

https://www.home-assistant.io/integrations/zeroconf/

default_interface boolean(optional, default: false)

By default, zeroconf will attempt to bind to all interfaces. For systems running using network isolation or similar, this may result in zeroconf being unavailable. Change this option to true if zeroconf does not function.

fervox commented 4 years ago

I did upgrade to the latest version of Raspberry Pi OS (sudo apt-get update, sudo apt-get dist-upgrade) but it didn't help me. I'm running Docker version 19.03.13.

pierre2113 commented 4 years ago

Try this in your configuration.yaml:

zeroconf:
  default_interface: true

If mbrackjr is right that the "hassio multicast container wrongfully forwards any mdns announcements received on any non-standard interface to the primary interface", then maybe this configuration change, which sounds like it makes Home Assistant bind to a single interface, will also solve the problem.

I don't know how to make the changes mbrackjr mentioned, that file doesn't exist in my raspbian buster.

fervox commented 4 years ago

Just wanted to update that since 9 days ago I haven't had any further issues with multicast reanimating. I believe killing the container, removing the image and replacing it with an old version (sudo docker pull homeassistant/armv7-hassio-multicast:1) is enough to block supervisor from relaunching it. I don't know why it came back within 24 hours for me initially but I suspect it might have been because of a supervisor upgrade? Since I did it again my problems are solved. If I only have to do this every time supervisor updates I can probably live with this showing up in the logs.

20-09-28 04:40:57 WARNING (MainThread) [supervisor.misc.tasks] Watchdog found a problem with Multicast plugin!
20-09-28 04:40:57 INFO (MainThread) [supervisor.plugins.multicast] Start Multicast plugin
20-09-28 04:40:57 ERROR (SyncWorker_6) [supervisor.docker] Image homeassistant/armv7-hassio-multicast not exists for hassio_multicast
20-09-28 04:40:57 ERROR (MainThread) [supervisor.plugins.multicast] Can't start Multicast plugin
20-09-28 04:40:57 ERROR (MainThread) [supervisor.misc.tasks] Watchdog Multicast reanimation failed!

It is actually strangely satisfying to read that supervisor is doggedly trying to beat my workaround with such drama (!), and failing to do so. This probably speaks to the grief that this problem has caused me :)

Thanks @pierre2113 I also didn't know how to translate mbrackjr's network config to Raspberry Pi OS (formerly Raspbian) but it looked like an interesting avenue - perhaps someone else knows how to do it?

I will try the zeroconf setting next time it happens to see if that helps. Has anyone else tried changing the zeroconf config or blackholing 8.8.8.8?

pessorrusso commented 3 years ago

The zeroconf setting does not fix the issue.

zeroconf:
  default_interface: true

I am still trying to figure out how to stop HA from acting as an mDNS repeater for my network.

pessorrusso commented 3 years ago

I was able to stop repeated messages from reaching my network by adding a DROP rule to the iptables OUTPUT chain for mDNS packets (iptables -I OUTPUT -p udp --dport 5353 -j DROP). The challenge was that the OS file system is read-only, so you cannot simply add this command to a systemd service. The workaround is to create a Docker container with root access that adds this rule on the host.

Dockerfile

FROM ubuntu
RUN apt-get update && apt-get install -y iptables
CMD iptables -I OUTPUT -p udp --dport 5353 -j DROP && docker stop iptablefix

Build

docker build -t myimage/iptablefix .

Run

docker run --name iptablefix --restart always --privileged --net=host --pid=host --ipc=host -v /var/run/docker.sock:/var/run/docker.sock -v /usr/bin/docker:/usr/bin/docker myimage/iptablefix

Note that the restart policy is always, but the container stops itself, so it will run just once during boot.
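
If the container ends up being started more than once between reboots, -I would add the same rule again each time. A minimal sketch of an idempotent variant follows, using the iptables -C (check) option before inserting; ensure_mdns_output_drop is a hypothetical name. Like the rule above, it drops every outgoing UDP packet to port 5353 from the host, so the host's own mDNS announcements are presumably affected as well.

```shell
# Insert the mDNS DROP rule only when an identical rule is not already present.
# "iptables -C" exits non-zero when the rule is missing, which triggers the insert.
# ensure_mdns_output_drop is a hypothetical helper name.
ensure_mdns_output_drop() {
  iptables -C OUTPUT -p udp --dport 5353 -j DROP 2>/dev/null ||
    iptables -I OUTPUT -p udp --dport 5353 -j DROP
}

# Usage (as root, for example as the first step of the container's CMD):
#   ensure_mdns_output_drop
```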

Hope it helps. Cheers Sam

fervox commented 3 years ago

Thanks Sam. I had modest success with my approach of replacing the image with an older one that caused an error and therefore failed to launch the multicast docker image. I improved it a bit by making it a one step command:

sudo docker container rm -f $(sudo docker ps -aqf "name=hassio_multicast") && sudo docker image rm -f 0fbd536e8547 && sudo docker pull homeassistant/armv7-hassio-multicast:1

Downside was that every fortnight or so supervisor managed to install the homeassistant/armv7-hassio-multicast:3 image and it would start working again, breaking my network. You knew as soon as the internet slowed to a crawl and devices reported name collisions that this had happened again.

However, I've just tried your approach and it seems to be working. Great. A few extra notes re setting it up for my context, in case it helps anyone else.

Presumably because I am running raspbian on a RPi4 I got an error during docker build which was fixed by installing a newer version of libseccomp2:

wget http://ftp.us.debian.org/debian/pool/main/libs/libseccomp/libseccomp2_2.5.1-1_armhf.deb
sudo dpkg -i ./libseccomp2_2.5.1-1_armhf.deb

I wonder if it'd be easy to make a docker image that makes this a two step pull and run process for anyone else having this issue?

intermittech commented 3 years ago

Hey people, I've been battling this "multicast storm" issue for a few years now, especially when there were two mDNS daemons running on my network in different subnets, such as a production and a test Home Assistant installation.

Yesterday I finally found the issue: Mikrotik has a feature on their routers/switches/etc. in the bridge configuration called "Unknown Multicast Flood". I turned that off and was able to run Home Assistant (with mDNS daemons) on 3 different subnets without any issues anymore!

I wrote a little article about it here: https://blog.quindorian.org/2021/01/home-assistant-mikrotik-multicast-storm.html/

Let me know if it helps in your cases too!

borski commented 3 years ago

I’ve also been having this issue for a few months and it’s infuriating. What is the best current workaround for this? (On unifi equipment)

pessorrusso commented 3 years ago

I am also on Unifi (and EdgeRouter), my workaround fixed the issue for me, see my previous comment (Jan 25).

wittekind commented 3 years ago

I've approached this in a different way. As long as HA is still able to find or address devices in other subnets / V-LANs it does not need to be multihomed.

I'll describe my specific situation. While it might not help everyone, it might help some:

I have two major subnets, both on different VLANs created on UniFi equipment. One is my "home" network, the other is for IoT devices. My control devices (smartphones, computers etc.) live on the home network; Chromecast Audios and other renderers live on the IoT network. With HA having interfaces on both networks I saw the flooding everyone has issues with. The USG does not take kindly to that. But with HA present in only one of the networks, I need the USG's mDNS repeater to make the Chromecasts available in the home network, which is unreliable in itself. But that's a USG issue, or rather a Google issue. So the idea of having a reliable mDNS repeater is good; it's just that HA is not that.

Solution: I moved HA to the home network and found a different mDNS repeater: https://github.com/scyto/multicast-relay It's deployed as Docker container on a Raspberry Pi 4. My network interfaces on that machine are setup a little more complicated, but it works well: There's a macvlan adapter set up on the host for each VLAN. I've created a Docker network with macvlan driver for each VLAN as well. The mDNS repeater is attached to each of the VLANs. If you want to do the same the container needs the names of the interfaces for which you want the repeater to be active. For the attached Docker networks it just counts eth0 upwards, so I've set INTERFACES=eth0 eth1 for my two networks.
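
The Docker side of that setup can be sketched with docker network create and the macvlan driver, once per VLAN. The network name, parent interface, subnet and gateway in the usage line are placeholders rather than values from the setup described above, and create_vlan_network is a hypothetical wrapper name.

```shell
# Create a macvlan Docker network bound to one VLAN interface on the host.
# $1 = network name, $2 = parent interface, $3 = subnet, $4 = gateway.
# create_vlan_network is a hypothetical wrapper around "docker network create".
create_vlan_network() {
  docker network create -d macvlan \
    --subnet "$3" --gateway "$4" \
    -o parent="$2" "$1"
}

# Usage with placeholder values:
#   create_vlan_network iot eth0.20 192.168.20.0/24 192.168.20.1
```

The repeater container is then attached to each of these networks, which is where the eth0, eth1 interface names inside the container come from.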

I hope this helps someone.

Edit: This solves one problem with Chromecast Audio in particular: groups. Those usually fall apart across subnets when only using UniFi's repeater. I can now use them across subnets.

blhoward2 commented 3 years ago

I experienced this same issue with openwrt and disabling ipv6 on my router fixed it.

pierre2113 commented 3 years ago

I already have IPv6 turned off, but I also have a different home network setup. My 2 Raspberry Pis running supervised Home Assistant were connected to an access point. For a short time I had to turn off mDNS on all the routers to prevent a network outage. After a few changes and HA upgrades I turned mDNS back on on all the routers without my network going down. However, I started having video streaming issues: video resolution would drop periodically on the FireTV Cube, but only if the FireTV Cube was plugged in to the same access point router that the 2 Raspberry Pis running HA were connected to. My internet plan was 1 Gbit, so it couldn't have been slow internet. I ended up creating a VLAN on the access point router and using ebtables to isolate the FireTV Cube LAN connection from the 2 HA LAN connections. Just to be cautious, I also blocked multicast/mDNS coming from HA to the FireTV with ebtables. I didn't know how to implement mbrackjr's suggested solution on my Raspbian OS, due to lack of knowledge.

spudje commented 2 years ago

Having same issue. Restarting my USG after restarting Home Assistant seems to take the load away. However I do hope we'll see a structural solution soon.

OK I dived a bit more into this issue. My setup is a 100% Unifi based network, with USG configured to repeat mDNS. I have a Home Assistant Blue device, on which I also run the Unifi controller. However I have it connected to the LAN over 2 VLANs. The IoT VLAN for Home Assistant and the untagged/managed VLAN for the Unifi Controller. I have a third "main" VLAN for PCs, NAS, Mobile phones. This VLAN is not tagged on the network port where the Home Assistant Blue is connected. Still all Chromecast stuff works fine, so I take it my USG does a proper job in itself with mDNS repeating between IoT and main VLAN.

Since the untagged VLAN is not needed for any of the Home Assistant stuff, can I somehow drop everything that the Home Assistant Blue copies onto it when it's coming towards my USG (a LAN-out drop rule?), and will that stop my USG from having high CPU load?

Or should I simply remove the untagged VLAN from the interface/network port with the Home Assistant Blue? Will the UniFi controller still see my network devices then?

spudje commented 2 years ago

Well, limiting my Hass Blue to only 1 VLAN does not fix the issue. So probably I don't understand the issue.

joshuaspence commented 2 years ago

Would the team be open to a pull request to allow this container to be configured and/or disabled?

joshuaspence commented 2 years ago

@jesserockz did you ever find a solution/workaround for your problem?

jesserockz commented 2 years ago

@joshuaspence no. I have since moved my install to a Blue, so it does not apply to me personally anymore.

But basically you cannot use multiple network interfaces (or vlans) on the host running supervised HA because of this issue.

joshuaspence commented 2 years ago

Ok, fair enough. If the maintainers (@frenck / @pvizeli) would be open to a pull request to fix/improve this behaviour then I am willing to work on it.

home-assistant / plugin-multicast

Multicast container causing feedback loops #1