home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
72.73k stars 30.46k forks source link

WARNING (MainThread) [scapy.runtime] Invalid prefix value: 57 (0x39) #127121

Open richard-c-howlett opened 3 weeks ago

richard-c-howlett commented 3 weeks ago

The problem

I am getting scapy.runtime errors in my home-assistant.log file almost every day. The block of 3 lines is repeated every 5 seconds and HA becomes inaccessible. Below is 2 of the blocks of error lines.

2024-09-30 13:01:35.679 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 57 (0x39) 2024-09-30 13:01:35.693 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 57 (0x39) 2024-09-30 13:01:35.693 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 57 (0x39) 2024-09-30 13:01:40.692 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 57 (0x39) 2024-09-30 13:01:40.693 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 57 (0x39) 2024-09-30 13:01:40.693 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 57 (0x39) ...... repeated thousands of times

The only way to stop it is to stop and restart the HA container. I do not know what is causing it and I can't find anything on the web similar. I cannot pin down when these started happening I'm afraid. There are no seemingly related messages before these. They just randomly start getting logged.

What version of Home Assistant Core has the issue?

2024.9.2

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant Container

Integration causing the issue

Unknown

Link to integration documentation on our website

No response

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

2024-09-30 13:01:35.679 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 57 (0x39)
2024-09-30 13:01:35.693 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 57 (0x39)
2024-09-30 13:01:35.693 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 57 (0x39)
2024-09-30 13:01:40.692 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 57 (0x39)
2024-09-30 13:01:40.693 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 57 (0x39)
2024-09-30 13:01:40.693 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 57 (0x39)

Additional information

No response

mib1185 commented 3 weeks ago

do you have any custom components installed?

richard-c-howlett commented 3 weeks ago

Hi mib1185,

Yes I do. Here's a screengrab of them all. image

mib1185 commented 3 weeks ago

OK, to ensure, the reported error is not caused by any of these custom components, please restart HA in safe mode and check if the issue still occur, thx :+1:

richard-c-howlett commented 3 weeks ago

Hi, I have restarted in Safe mode. As this only occurs randomly it could be days/weeks before it happens again.

At the start of June I wrote a bash script that runs via cron every 15 minutes 'grep'ing for the regex "WARNING.scapy.runtime.Invalid prefix value" in home-assistant.log and if it finds it restarts the HA docker container after saving the log file with a timestamp so I get to retain the log. You can see from this list of saved logs that it went almost a month without occurring but recently has been happening almost every day.

-rw-r--r-- 1 root root    103610 Jun  3 17:20 home-assistant.log.20240603-172040
-rw-r--r-- 1 root root    103610 Jun  3 17:20 home-assistant.log.20240603-172056
-rw-r--r-- 1 root root    158417 Jun  5 02:00 home-assistant.log.20240605-020001
-rw-r--r-- 1 root root    149868 Jun  7 04:45 home-assistant.log.20240607-044501
-rw-r--r-- 1 root root    685544 Jun 19 15:45 home-assistant.log.20240619-154501
-rw-r--r-- 1 root root    130156 Jun 23 16:45 home-assistant.log.20240623-164501
-rw-r--r-- 1 root root    334402 Jun 29 20:45 home-assistant.log.20240629-204502
-rw-r--r-- 1 root root    466649 Jul  4 22:00 home-assistant.log.20240704-220001
-rw-r--r-- 1 root root  17843140 Jul  5 20:00 home-assistant.log.20240705-200001
-rw-r--r-- 1 root root    184761 Jul  8 00:15 home-assistant.log.20240708-001501
-rw-r--r-- 1 root root    383794 Jul 16 17:00 home-assistant.log.20240716-170001
-rw-r--r-- 1 root root    171425 Jul 17 17:45 home-assistant.log.20240717-174502
-rw-r--r-- 1 root root     94814 Jul 18 05:45 home-assistant.log.20240718-054501
-rw-r--r-- 1 root root     93236 Jul 19 20:00 home-assistant.log.20240719-200001
-rw-r--r-- 1 root root    412612 Sep 20 01:30 home-assistant.log.20240920-013001
-rw-r--r-- 1 root root     45559 Sep 20 13:30 home-assistant.log.20240920-133001
-rw-r--r-- 1 root root    137872 Sep 26 14:30 home-assistant.log.20240926-143001
-rw-r--r-- 1 root root    337906 Sep 28 12:30 home-assistant.log.20240928-123001
-rw-r--r-- 1 root root    199384 Sep 29 15:15 home-assistant.log.20240929-151501
-rw-r--r-- 1 root root    446916 Sep 30 13:04 home-assistant.log.20240930-130455

I have looked in these logs to see if there's anything linked to this scapy issue (ie something happening before the scapy warning is logged) but I am not knowledgeable enough to be able to make sense of it. Maybe someone who knows the inner workings might be able to interpret something out of one of them. These look pretty safe as in there's no personal information in them. Want me to attached a log file?

mib1185 commented 3 weeks ago

scapy is also used by the DHCP discovery integration, so it may also be sourced by this, but the messages seems to be related to some invalid dhcp packets 🤔

mib1185 commented 3 weeks ago

how exactly do you run the home assistant container?

mib1185 commented 3 weeks ago

Want me to attached a log file?

yes, i think it is sufficient to attache one of these, thx 👍

richard-c-howlett commented 3 weeks ago

I run it via docker compose. Here's my compose file:


name: homeassistant
services:
  homeassistant:
    container_name: homeassistant
    image: "ghcr.io/home-assistant/home-assistant:latest"
    volumes:
      - /home/homeassistant/live:/config
      - /etc/localtime:/etc/localtime:ro
    restart: unless-stopped
    privileged: true
    network_mode: host```
richard-c-howlett commented 3 weeks ago

The issue has just happened, well at 16:45, whilst in Safe mode. I am attaching the log for this. Thank you for taking the time to look into it by the way. home-assistant.log.20241001-164501.log

PS The comment at the top of your github page made me smile. "..... so before I do a task several times, I automate it". I am exactly the same to the extent that they call me "The Script Guy" where I work!! If it can be scripted I script it!!

richard-c-howlett commented 3 weeks ago

As it does not seem to be custom component related I am going to leave it running in normal mode now

home-assistant[bot] commented 3 weeks ago

Hey there @bdraco, mind taking a look at this issue as it has been labeled with an integration (dhcp) you are listed as a code owner for? Thanks!

Code owner commands Code owners of `dhcp` can trigger bot actions by commenting: - `@home-assistant close` Closes the issue. - `@home-assistant rename Awesome new title` Renames the issue. - `@home-assistant reopen` Reopen the issue. - `@home-assistant unassign dhcp` Removes the current integration label and assignees on the issue, add the integration domain after the command. - `@home-assistant add-label needs-more-information` Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue. - `@home-assistant remove-label needs-more-information` Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)


dhcp documentation dhcp source (message by IssueLinks)

bdraco commented 3 weeks ago

Its likely something on the network sending invalid dhcp packets

mib1185 commented 3 weeks ago

something sends a dhcp package with option 121 (rfc3442), but with an invalid static route definition

richard-c-howlett commented 3 weeks ago

Found these 2 lines in /var/log/messages that match the exact time the 1st scapy message appears in HA log: Oct 1 16:41:35 giles dhcpd: DHCPREQUEST for 192.168.4.148 from 38:81:d7:3c:11:38 via enp3s0 Oct 1 16:41:35 giles dhcpd: DHCPACK on 192.168.4.148 to 38:81:d7:3c:11:38 via enp3s0

192.168.4.148 is a Ring doorbell. I have checked the previous time the scapy error was logged in HA and there's a corresponding pair of lines in /var/log/messages. So it looks like my Ring doorbell is the culprit.

Could a bad packet be handled by the code any better possibly rather than it getting into a loop....?

mib1185 commented 3 weeks ago

The message is already a warning, so decreasing log level would not help. further it seems you are running a dhcpd on the same host, as the ha container? maybe you want to split them to different hosts or check the dhcpd configuration for a possible miss configuration related to dhcp option 121 - but this is out of scope of the HA project :wink:

richard-c-howlett commented 2 weeks ago

Hi mib1185,

Thanks for looking into this. I wasn't suggesting the log level be reduced. But it looks like the code gets stuck in a loop and logs the same group of 3 messages every 5 seconds. Could it log it once and then stop rather than logging it continuously every 5 seconds until I restart HA.

It's worth reiterating that HA is completely unresponsive whilst this is happening which is why I have to restart it to recover.

Here is just 1 minute's worth. As you can see it is logging the same 3 lines every 5 seconds:

2024-10-01 16:41:35.064 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:35.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:35.068 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:40.067 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:40.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:40.068 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:45.067 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:45.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:45.068 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:50.067 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:50.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:50.068 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:55.067 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:55.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:41:55.069 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:00.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:00.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:00.069 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:05.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:05.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:05.069 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:10.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:10.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:10.069 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:15.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:15.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:15.069 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:20.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:20.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:20.069 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:25.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:25.069 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:25.069 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:30.068 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:30.069 WARNING (MainThread) [scapy.runtime] Invalid prefix value: 106 (0x6a)
2024-10-01 16:42:30.069 WARNING (MainThread) [scapy.runtime] more Invalid prefix value: 106 (0x6a)
mib1185 commented 2 weeks ago

Could it log it once and then stop rather than logging it continuously every 5 seconds until I restart HA.

because these logs are generated by the scapy lib, we can't change its behavior from ha core side.

It's worth reiterating that HA is completely unresponsive whilst this is happening which is why I have to restart it to recover.

sorry. i've overseen this in your initial post. do you see any abnormal system resource consumption when this happens? are there any other of the services on the host, where you run the HA container, effected by this issue?

richard-c-howlett commented 2 weeks ago

Could it log it once and then stop rather than logging it continuously every 5 seconds until I restart HA.

because these logs are generated by the scapy lib, we can't change its behavior from ha core side.

So this is a problem with the scapy library then? Continuous logging of an event would seem to be an issue.

It's worth reiterating that HA is completely unresponsive whilst this is happening which is why I have to restart it to recover.

sorry. i've overseen this in your initial post. do you see any abnormal system resource consumption when this happens? are there any other of the services on the host, where you run the HA container, effected by this issue?

The Linux host is fine, the container running HA is maxxed out (100% cpu) and HA cannot be accessed via mobile app or web browser

mib1185 commented 2 weeks ago

It's most likely not the logging itself which causes the 100% cpu, but something happen inside the lib which exhaust the system/container. As the first step I would recommend to enable debug logging for the dhcp integration and all its dependencies - to do so add the following to your configuration.yaml and restart HA.

logger:
  default: info
  logs:
    homeassistant.components.dhcp: debug
    aiodiscover: debug
    aiodhcpwatcher: debug
    dnspython: debug
    pyroute2: debug
    scapy: debug

As soon as the next event happens and your automation restarts the container, please provide the last logfile, which should now contain debug log messages.

issue-triage-workflows[bot] commented 4 days ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

richard-c-howlett commented 4 days ago

Adding comment to keep issue open. I have made the changes to enable debug logging and am waiting for the issue to occur again. It could happen within a day or it could be several weeks but at some point I will have a log which may be helpful