travisghansen / hass-pfsense

pfSense integration with Home Assistant
193 stars 27 forks source link

XMLRPC calls makes pfSense php-fpm loop crash in 2.7.0 #174

Open JeromeJMI opened 1 year ago

JeromeJMI commented 1 year ago

Hi, Since I migrated to pfSense 2.7.0-RELEASE (amd64) I am facing the issue of php-fsm on pfSense when I activate the integration (Everythings worked fine in pfSense 2.5.6)

Logs of the crash in Status -> System Logs

Time Process PID Message
Sep 27 18:13:58 kernel   pid 52679 (php-fpm), jid 0, uid 0: exited on signal 11 (core dumped)
Sep 27 18:13:58 nginx   2023/09/27 18:13:58 [error] 66335#100273: *162007 upstream prematurely closed connection while reading response header from upstream, client: , server: , request: "POST /xmlrpc.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm.socket:", host: ""
Sep 27 18:13:28 kernel   pid 52601 (php-fpm), jid 0, uid 0: exited on signal 11 (core dumped)
Sep 27 18:13:28 nginx   2023/09/27 18:13:28 [error] 66335#100273: *162004 upstream prematurely closed connection while reading response header from upstream, client: , server: , request: "POST /xmlrpc.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm.socket:", host: ""
Sep 27 18:13:10 kernel   pid 19917 (php-fpm), jid 0, uid 0: exited on signal 11 (core dumped)
Sep 27 18:13:10 nginx   2023/09/27 18:13:10 [error] 66335#100273: *162002 upstream prematurely closed connection while reading response header from upstream, client: , server: , request: "POST /xmlrpc.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm.socket:", host: ""
Sep 27 18:12:58 kernel   pid 19577 (php-fpm), jid 0, uid 0: exited on signal 11 (core dumped)
Error in Home Assistant about firmware refresh

Logger: homeassistant Source: custom_components/pfsense/pypfsense/__init__.py:141 Integration: pfSense (documentation, issues) First occurred: 16:31:29 (1 occurrences) Last logged: 16:31:29 Error doing job: Future exception was never retrieved Traceback (most recent call last): File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run result = self.fn(*self.args, **self.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/__init__.py", line 229, in inner response = func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/__init__.py", line 252, in _refresh_firmware_update_info raise err File "/config/custom_components/pfsense/__init__.py", line 245, in _refresh_firmware_update_info self._firmware_update_info = self._client.get_firmware_update_info() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/pypfsense/__init__.py", line 111, in inner raise err File "/config/custom_components/pfsense/pypfsense/__init__.py", line 108, in inner return func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/pypfsense/__init__.py", line 213, in get_firmware_update_info response = self._exec_php(script) ^^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/pypfsense/__init__.py", line 98, in inner response = func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/pypfsense/__init__.py", line 141, in _exec_php response = self._get_proxy().pfsense.exec_php(script) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/xmlrpc/client.py", line 1122, in __call__ return self.__send(self.__name, args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/xmlrpc/client.py", line 1464, in __request response = self.__transport.request( ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/xmlrpc/client.py", line 1166, in request return self.single_request(host, handler, request_body, verbose) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/xmlrpc/client.py", line 1196, in single_request raise ProtocolError( xmlrpc.client.ProtocolError: /xmlrpc.php: 502 Bad Gateway>

Error repeated each actualization

Logger: custom_components.pfsense.pypfsense Source: custom_components/pfsense/pypfsense/__init__.py:110 Integration: pfSense (documentation, issues) First occurred: 16:31:29 (275 occurrences) Last logged: 18:20:58 Unexpected get_firmware_update_info error err=/xmlrpc.php: 502 Bad Gateway>, type(err)= Unexpected get_config error err=/xmlrpc.php: 502 Bad Gateway>, type(err)= Unexpected get_system_info error err=/xmlrpc.php: 502 Bad Gateway>, type(err)=

Last issue (May due to php-fsm crashing)

Logger: custom_components.pfsense Source: custom_components/pfsense/pypfsense/__init__.py:141 Integration: pfSense (documentation, issues) First occurred: 16:31:29 (274 occurrences) Last logged: 18:20:58 Unexpected error fetching Helium pfSense state data: /xmlrpc.php: 502 Bad Gateway> Unexpected error fetching Helium pfSense device tracker state data: /xmlrpc.php: 502 Bad Gateway> Traceback (most recent call last): File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 290, in _async_refresh self.data = await self._async_update_data() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 246, in _async_update_data return await self.update_method() ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/__init__.py", line 96, in async_update_data await hass.async_add_executor_job(lambda: data.update()) File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run result = self.fn(*self.args, **self.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/__init__.py", line 96, in await hass.async_add_executor_job(lambda: data.update()) ^^^^^^^^^^^^^ File "/config/custom_components/pfsense/__init__.py", line 531, in update raise err File "/config/custom_components/pfsense/__init__.py", line 337, in update new_state["config"] = self._get_config() ^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/__init__.py", line 229, in inner response = func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/__init__.py", line 268, in _get_config return self._client.get_config() ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/pypfsense/__init__.py", line 111, in inner raise err File "/config/custom_components/pfsense/pypfsense/__init__.py", line 108, in inner return func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/pypfsense/__init__.py", line 310, in get_config response = self._exec_php(script) ^^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/pypfsense/__init__.py", line 98, in inner response = func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/pfsense/pypfsense/__init__.py", line 141, in _exec_php response = self._get_proxy().pfsense.exec_php(script) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/xmlrpc/client.py", line 1122, in __call__ return self.__send(self.__name, args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/xmlrpc/client.py", line 1464, in __request response = self.__transport.request( ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/xmlrpc/client.py", line 1166, in request return self.single_request(host, handler, request_body, verbose) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/xmlrpc/client.py", line 1196, in single_request raise ProtocolError( xmlrpc.client.ProtocolError: /xmlrpc.php: 502 Bad Gateway>

If I check error are continous. I don't recode any info from pfsense apart from the WAN interface status & some devices trackers.

synoniem commented 1 year ago

Probably related because it started with 2.7.0 and after switching to 2023-5-1 it continues:

13:12:12 PHP ERROR: Type: 1, File: /etc/inc/util.inc, Line: 2479, Message: Uncaught TypeError: Unsupported operand types: string / int in /etc/inc/util.inc:2479 Stack trace:

0 /etc/inc/pfsense-utils.inc(2013): get_memory()

1 /usr/local/www/includes/functions.inc.php(104): pfsense_default_state_size()

2 /usr/local/www/includes/functions.inc.php(46): get_pfstate(true)

3 /usr/local/www/getstats.php(40): get_stats(Array)

4 {main}

thrown

When I remove hass-pfsense these messages disappear. It worked flawlessly in 2.6.0.

astergiou commented 1 year ago

I think I also have this proble, I have latest pfsense 2.7 and with this integration pfSense vm has crashed 3 times and needs hard reboot to work again!

some logs from pfSense Oct 16 12:42:23 radiusd 1148 (107) Login OK: [nagiostest] (from client NagiosXI port 0)
Oct 16 12:40:00 sshguard 36531 Now monitoring attacks.
Oct 16 12:40:00 sshguard 53820 Exiting on signal.
Oct 16 12:39:45 check_reload_status 405 Reloading filter
Oct 16 12:39:45 php-fpm 59501 /rc.ipsec: IPSEC: One or more IPsec tunnel gateways have changed. Refreshing.
Oct 16 12:39:30 php-fpm 75419 /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use INTERNET_PPPOE.
Oct 16 12:39:30 php-fpm 75419 /rc.openvpn: Gateway, NONE AVAILABLE
Oct 16 12:39:30 php-fpm 75419 /rc.openvpn: Gateway, NONE AVAILABLE
Oct 16 12:39:29 check_reload_status 405 Reloading filter
Oct 16 12:39:29 check_reload_status 405 Restarting OpenVPN tunnels/interfaces
Oct 16 12:39:29 check_reload_status 405 Restarting IPsec tunnels
Oct 16 12:39:29 check_reload_status 405 updating dyndns INTERNET_PPPOE
Oct 16 12:39:29 rc.gateway_alarm 81418 >>> Gateway alarm: INTERNET_PPPOE (Addr:62.169.255.59 Alarm:0 RTT:34.144ms RTTsd:27.325ms Loss:20%)
Oct 16 12:38:54 check_reload_status 405 Reloading filter
Oct 16 12:38:54 php-fpm 368 /rc.ipsec: IPSEC: One or more IPsec tunnel gateways have changed. Refreshing.
Oct 16 12:38:39 php-fpm 60814 /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use INTERNET_PPPOE.
Oct 16 12:38:39 php-fpm 60814 /rc.openvpn: Gateway, NONE AVAILABLE
Oct 16 12:38:39 php-fpm 60814 /rc.openvpn: Gateway, none 'available' for inet, use the first one configured. 'INTERNET_PPPOE'
Oct 16 12:38:38 check_reload_status 405 Reloading filter
Oct 16 12:38:38 check_reload_status 405 Restarting OpenVPN tunnels/interfaces
Oct 16 12:38:38 check_reload_status 405 Restarting IPsec tunnels
Oct 16 12:38:38 check_reload_status 405 updating dyndns INTERNET_PPPOE
Oct 16 12:38:38 rc.gateway_alarm 89621 >>> Gateway alarm: INTERNET_PPPOE (Addr:62.169.255.59 Alarm:1 RTT:31.122ms RTTsd:24.406ms Loss:21%)
Oct 16 12:38:06 check_reload_status 405 Reloading filter
Oct 16 12:38:06 check_reload_status 405 Syncing firewall
Oct 16 12:38:06 php-fpm 75419 /firewall_shaper_vinterface.php: Configuration Change: astergiou@192.168.249.52 (Local Database): Traffic Shaper: Changes applied
Oct 16 12:37:59 check_reload_status 405 Syncing firewall
Oct 16 12:37:59 php-fpm 367 /firewall_shaper_vinterface.php: Configuration Change: astergiou@192.168.249.52 (Local Database): Traffic Shaper: New pipe added
Oct 16 12:37:27 check_reload_status 405 Reloading filter
Oct 16 12:37:27 php-fpm 368 /rc.ipsec: IPSEC: One or more IPsec tunnel gateways have changed. Refreshing.
Oct 16 12:37:23 radiusd 1148 (106) Login OK: [nagiostest] (from client NagiosXI port 0)
Oct 16 12:37:12 php-fpm 86902 /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use INTERNET_PPPOE.
Oct 16 12:37:12 php-fpm 86902 /rc.openvpn: Gateway, NONE AVAILABLE
Oct 16 12:37:12 php-fpm 86902 /rc.openvpn: Gateway, NONE AVAILABLE
Oct 16 12:37:11 check_reload_status 405 Reloading filter
Oct 16 12:37:11 check_reload_status 405 Restarting OpenVPN tunnels/interfaces
Oct 16 12:37:11 check_reload_status 405 Restarting IPsec tunnels
Oct 16 12:37:11 check_reload_status 405 updating dyndns INTERNET_PPPOE
Oct 16 12:37:11 rc.gateway_alarm 76766 >>> Gateway alarm: INTERNET_PPPOE (Addr:62.169.255.59 Alarm:0 RTT:33.876ms RTTsd:27.323ms Loss:17%)
Oct 16 12:37:00 sshguard 53820 Now monitoring attacks.
JeromeJMI commented 1 year ago

Same issues I think, the XMLRPC which is apprently handled by php-fpm is loop crashing. This leads to some issues on pfSense side which needed to be rebooted at some point, in fact it crashes every time the call from hass-pfsense is called. I tried to follow the calls in pfSense but I'm not a php expert.

I also found a new byte of code which is precisely in the getMemory part which have been changed in the 2.7 version, so it can be the issue.

Will try to open a ticket on pfSense part, because I don't know if the integration can do anything.

JeromeJMI commented 1 year ago

Probably related because it started with 2.7.0 and after switching to 2023-5-1 it continues:

13:12:12 PHP ERROR: Type: 1, File: /etc/inc/util.inc, Line: 2479, Message: Uncaught TypeError: Unsupported operand types: string / int in /etc/inc/util.inc:2479 Stack trace: #0 /etc/inc/pfsense-utils.inc(2013): get_memory() #1 /usr/local/www/includes/functions.inc.php(104): pfsense_default_state_size() #2 /usr/local/www/includes/functions.inc.php(46): get_pfstate(true) #3 /usr/local/www/getstats.php(40): get_stats(Array) #4 {main} thrown

When I remove hass-pfsense these messages disappear. It worked flawlessly in 2.6.0.

I checked for the pfSense code and it appears there was a PR around safety-belts around the faulty code the XMLRPC use: https://github.com/pfsense/pfsense/commit/054c25418f28bd0afeb1e4a3f07075db76f8f61b

synoniem commented 1 year ago

On 20-10-2023 09:48, Jay wrote:

Probably related because it started with 2.7.0 and after switching
to 2023-5-1 it continues:

13:12:12 PHP ERROR: Type: 1, File: /etc/inc/util.inc, Line: 2479,
Message: Uncaught TypeError: Unsupported operand types: string /
int in /etc/inc/util.inc:2479 Stack trace: #0
/etc/inc/pfsense-utils.inc(2013): get_memory() #1
<https://github.com/travisghansen/hass-pfsense/issues/1>
/usr/local/www/includes/functions.inc.php(104):
pfsense_default_state_size() #2
<https://github.com/travisghansen/hass-pfsense/issues/2>
/usr/local/www/includes/functions.inc.php(46): get_pfstate(true)
#3 <https://github.com/travisghansen/hass-pfsense/issues/3>
/usr/local/www/getstats.php(40): get_stats(Array) #4
<https://github.com/travisghansen/hass-pfsense/discussions/4>
{main} thrown

When I remove hass-pfsense these messages disappear. It worked
flawlessly in 2.6.0.

I checked for the pfSense code and it appears there was a PR around safety-belts around the faulty code the XMLRPC use: @.*** https://github.com/pfsense/pfsense/commit/054c25418f28bd0afeb1e4a3f07075db76f8f61b

Message ID: @.***>

If I read the code correctly the PR has to do with memory allocation. That could make sense because I changed hardware where pfSense got 8 GB instead of 2 GB. And with 2GB it does pass this safety belt different AFAICS.

jkaberg commented 1 year ago

So the only solution regarding 2.7.0 for now is to wait for an (pfsense) patch (ref issue) or disable the integration?

JeromeJMI commented 1 year ago

I didn't got much time to investigate, so my solution was to disable totally the integration (Which I was only used for presence detection) and migrate to HomeKit stuff. The only thing if you used this is to wait for them to do a fix ... But with pfSense releasing versions every x month I think it will not be out for like 6-12 month

travisghansen commented 1 year ago

Sorry team just getting up to date on this (my time to maintain is limited currently). So this is caused by a bug in pfsense that could happen outside the integration? Anything I can do to mitigate the issue?

synoniem commented 1 year ago

On 13-11-2023 13:14, Travis Glenn Hansen wrote:

Sorry team just getting up to date on this (my time to maintain is limited currently). So this is caused by a bug in pfsense that could happen outside the integration? Anything I can do to mitigate the issue?

— Reply to this email directly, view it on GitHub https://github.com/travisghansen/hass-pfsense/issues/174#issuecomment-1808047909, or unsubscribe https://github.com/notifications/unsubscribe-auth/APID6ZTP6VBKZ7BIYO7C2STYEIFKDAVCNFSM6AAAAAA5JUKLUSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMBYGA2DOOJQHE. You are receiving this because you commented.Message ID: @.***>

With the last update of pfSense 23.09 or 2.7.01 the error has disappeared. But I cannot put my finger on which part of the update resolved it.

--------------3mLWIc0RpFwwZIKmAHXQaWN0 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit

<!DOCTYPE html>

On 13-11-2023 13:14, Travis Glenn Hansen wrote:

Sorry team just getting up to date on this (my time to maintain is limited currently). So this is caused by a bug in pfsense that could happen outside the integration? Anything I can do to mitigate the issue?


Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you commented.Message ID: <travisghansen/hass-pfsense/issues/174/1808047909@github.com>

With the last update of pfSense 23.09 or 2.7.01 the error has disappeared. But I cannot put my finger on which part of the update resolved it.

--------------3mLWIc0RpFwwZIKmAHXQaWN0--