opnsense / plugins

OPNsense plugin collection
https://opnsense.org/
BSD 2-Clause "Simplified" License

Randomly Occurring Squid / C-ICAP Protocol Error after Version 24.1.4 #3875

Status: Closed (dblanque closed this 2 months ago)

dblanque commented 7 months ago

Describe the bug

This issue may share a cause with issue #3827.

Squid returns an ICAP protocol error when, presumably, C-ICAP or ClamAV restarts or reloads at random (perhaps because of freshclam signature updates?) and fails to come back up because of a segmentation fault.

Unlike the segmentation fault (#3827), this only started after updating to version 24.1.4 and was not present before that update.

I haven't found anything useful in the logs.

Tip: to validate your setup was working with the previous version, use opnsense-revert (https://docs.opnsense.org/manual/opnsense_tools.html#opnsense-revert). → Already tried this.

To Reproduce

Steps to manually reproduce the behavior:

  1. Set up Squid Web-Proxy with ClamAV/C-ICAP
  2. Check that it loads properly (e.g.: go to Google)
  3. Restart the C-ICAP Service.
  4. Get the Protocol Error

Expected behavior

The service should not randomly kill itself or die.

Screenshots

(image)

Relevant log files

Couldn't find any that would be useful or related to the problem.

Environment

OPNsense 24.1.4-amd64
Virtualized on Proxmox VE Server 8.0.5 Cluster with HA
x86-64-AES-v2 vCPU Type (Physical CPU on each node: AMD Ryzen 5 4600G)
4 vCPUs
8192MB RAM

AndyX90 commented 7 months ago

I had the same problem. It looks like opnsense-revert -r 24.1.3 squid solved the problem temporarily.

dblanque commented 7 months ago

I had already tried this, but I will try it again! Thank you Andy, I will update with the results.

dblanque commented 7 months ago

The revert has not fixed my problem at all. The problem does not appear in any logs, which is highly annoying... The only entry I could find in the logs at the time of the error is:

kid1| essential ICAP service is up: icap://[::1]:1344/avscan [up]

Edit: The Revert DID fix the problem after doing a restart! Thank you Andy, at least we have this temporarily working again without runtime hiccups.
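
One way to check, independently of Squid, whether the ICAP service named in the log line above still answers after a C-ICAP restart is to send it a bare ICAP OPTIONS request. The sketch below only builds the request text. The function name icap_options_request is made up for illustration, the host, port and service name come from the log line (icap://[::1]:1344/avscan), and a POSIX shell is assumed.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal ICAP OPTIONS request (RFC 3507 style).
# Arguments: host (bracketed if it is an IPv6 literal), port, service name.
icap_options_request() {
    host=$1
    port=$2
    service=$3
    # ICAP header lines end with CRLF; an empty line closes the header block.
    printf 'OPTIONS icap://%s:%s/%s ICAP/1.0\r\n' "$host" "$port" "$service"
    printf 'Host: %s\r\n' "$host"
    printf '\r\n'
}
```

Piping the output to a TCP client, for example icap_options_request '[::1]' 1344 avscan | nc -w 5 ::1 1344 (assuming nc is available on the firewall), should print a reply whose first line is an ICAP status line such as ICAP/1.0 200 OK when the service is healthy. No reply or a refused connection would point at C-ICAP rather than Squid.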

dblanque commented 4 months ago

@fichtner this seems to be occurring again after updating to 24.1.8. I'll probably be performing the revert as before but wanted to report this continues to happen.

AndyX90 commented 4 months ago

Sadly, this also happens on my side. After every update I have to revert squid to 24.1.3. But it's not in a VM, it is on an HA system (2 x DEC-4040).

fichtner commented 4 months ago

You will probably have more luck upstream. We have no community support capacity for this.

Cheers, Franco

dblanque commented 4 months ago

Yeah, that makes sense. Is there anyone in particular from Squid or ClamAV that we could tag here to associate the issue somehow?

Thanks, Dylan

AndyX90 commented 2 months ago

Hi @dblanque! Can you test the pull request with opnsense-patch -c plugins c02f7a7 and try both values? It seems that setting this option suppresses the error on my side.

EDIT: It is not the solution. Sorry for the noise.

dblanque commented 2 months ago

Hi @AndyX90, no worries. I ended up working around the issue with the following solution:

$~: nano /usr/local/etc/squid/pre-auth/debug.conf

# Add this onto the file #
# Uncomment line below to enable ICAP Related Debug
# debug_options ALL,1 93,7

# Disable/Increase ICAP Service Failure Limit
# If ICAP fails N times in 5 seconds, suspend service
# Disabled on Negative Value
icap_service_failure_limit -1

This disables Squid's ICAP failure limit, so the service suspension trigger no longer causes problems.
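
For anyone who prefers to script the workaround instead of editing the file by hand, a minimal sketch follows. It appends the directive only if that exact line is not already in the file, so running it after every update does not create duplicates. The helper name ensure_line is hypothetical; the path and directive are the ones given above.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a config file unless that exact
# line is already present.
ensure_line() {
    file=$1
    line=$2
    # Create the file if it does not exist yet, so grep has something to read.
    [ -f "$file" ] || : > "$file"
    # -x matches the whole line, -F treats the pattern as a fixed string,
    # -q suppresses output; "--" guards against patterns starting with "-".
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

Usage would look like ensure_line /usr/local/etc/squid/pre-auth/debug.conf 'icap_service_failure_limit -1', followed by a Squid reload. With stock Squid, squid -k parse checks the configuration and squid -k reconfigure reloads it; whether that or a service restart from the OPNsense GUI is the better choice here is not covered in this thread.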

Sources and valuable info:

AndyX90 commented 2 months ago

@dblanque Thank You, will test that approach!

AndyX90 commented 2 months ago

After a couple of days of testing, this setting really solved the error! A colleague also contacted me about this error; he had many similar cases (>10 OPNsense installations). @fichtner: As this setting also seems to be used by other commercial solutions, maybe we should include it by default? For example:

fichtner commented 2 months ago

Which one now? icap_persistent_connections on or icap_service_failure_limit -1?

Don't mind stuffing either in the default configuration for when ICAP is enabled, but it should be the one that does the trick.

AndyX90 commented 2 months ago

icap_service_failure_limit -1 is the key.

fichtner commented 2 months ago

Ok, can you raise a new PR and put it right under icap_enable on?

Thanks, Franco
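
To illustrate the placement requested above, the sketch below prints a squid.conf-style file with a directive inserted directly after the icap_enable on line. It is an illustration only: the actual change would go into the plugin's configuration template, which is not shown in this thread, and the function name add_after_icap_enable is hypothetical. The function writes to standard output and does not modify the input file.

```shell
#!/bin/sh
# Hypothetical helper: print FILE with DIRECTIVE added on the line right
# after "icap_enable on". The input file itself is left unchanged.
add_after_icap_enable() {
    file=$1
    directive=$2
    awk -v d="$directive" '
        # Copy every input line through unchanged.
        { print }
        # After the icap_enable line (whitespace-tolerant), emit the directive.
        /^[[:space:]]*icap_enable[[:space:]]+on[[:space:]]*$/ { print d }
    ' "$file"
}
```

Called as add_after_icap_enable /usr/local/etc/squid/squid.conf 'icap_service_failure_limit -1', this would show the resulting order without touching the generated file, which the plugin is assumed to overwrite whenever it regenerates the configuration.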