google-code-export / lusca-cache

Automatically exported from code.google.com/p/lusca-cache

external_acl_type - queue constantly overloading #120

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
- What steps will reproduce the problem?

1. Add a configuration like this to Lusca's squid.conf:

----------------------
external_acl_type checa_dominio children=8 ttl=14400 negative_ttl=14400 %URI /usr/local/bin/url_regex /usr/local/etc/squid/regexlist.txt

acl webex external checa_dominio

cache deny webex
cache_peer 127.0.0.1 parent 65080 0 proxy-only no-digest no-query
dead_peer_timeout 2 seconds
cache_peer_access 127.0.0.1 allow webex
cache_peer_access 127.0.0.1 deny all
forwarded_for on
----------------------

2. Create /usr/local/bin/url_regex as a shell script with:

#!/bin/sh
# The regex list is passed as the first argument (see squid.conf above).
FILE="$1"
while read -r URI ; do
 if printf '%s\n' "${URI}" | grep -E -q -f "${FILE}" ; then
  echo OK
 else
  echo ERR
 fi
done

Or as a compiled C++ program:

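#include <fstream>
#include <iostream>
#include <regex>
#include <string>

using namespace std;

int main(int argc, char *argv[]) {
  if (argc < 2) {
    cerr << "usage: url_regex <regex-file>\n";
    return 1;
  }
  const char *arquivo = argv[1]; // path to the regex list

  string url, urlf;
  // Read one URL per line from STDIN until the pipe is closed
  while (getline(cin, url)) {
    bool found = false;

    ifstream myfile(arquivo);
    if (!myfile.is_open()) {
      // Report on stderr so the message is not read as a helper reply
      cerr << "Error opening the file.\n";
      return 1;
    }

    // Read each pattern from the regex list
    while (getline(myfile, urlf)) {
      if (urlf.empty())
        continue; // an empty pattern would match every URL
      // std::regex (C++11) stands in for the original regex_match() helper,
      // whose definition was not posted. Unanchored search, like egrep -f.
      if (regex_search(url, regex(urlf, regex::extended))) {
        found = true;
        break; // stop at the first matching pattern
      }
    }
    myfile.close(); // close the list at the end of each pass

    // endl flushes, so the reply is delivered immediately
    if (found)
      cout << "OK" << endl;
    else
      cout << "ERR" << endl;
  }
  return 0;
}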

3. Create /usr/local/etc/squid/regexlist.txt with any set of regexes. I am 
running some simple ones like:

^http.*df\.gov\.br*
^http.*sp\.gov\.br*
^http.*rj\.gov\.br*
^http.*iti\.gov\.com.*
^http.*youtube\.com.*

4. Generate a medium load on the Lusca process and check cache.log; you will 
see a number of queue overload messages.

- What is the expected output? What do you see instead?

No output, only the ACL returning true or false. Instead I get a number of:

aclMatchExternal: 'checa_dominio' queue overload.

Some samples:

2010/08/06 16:43:56| aclMatchExternal: 'checa_dominio' queue overload. Request rejected 'http://ad-g.doubleclick.net/activity;src=1901600;met=1;v=1;pid=18708550;aid=227080611;ko=0;cid=37119822;rid=37137700;rv=1;&timestamp=1281113570328;eid1=2;ecn1=0;etm1=44;'.
2010/08/06 16:43:56| aclMatchExternal: 'checa_dominio' queue overload. Request rejected 'http://ad-g.doubleclick.net/activity;src=1901600;met=1;v=1;pid=18708550;aid=227080611;ko=0;cid=37119822;rid=37137700;rv=1;&timestamp=1281113570328;eid1=2;ecn1=0;etm1=44;'.
2010/08/06 16:43:56| aclMatchExternal: 'checa_dominio' queue overload. Request rejected 'http://ad-g.doubleclick.net/activity;src=1901600;met=1;v=1;pid=18708550;aid=227080611;ko=0;cid=37119822;rid=37137700;rv=1;&timestamp=1281113570328;eid1=2;ecn1=0;etm1=44;'.
2010/08/06 16:43:56| aclMatchExternal: 'checa_dominio' queue overload. Request rejected 'http://urchin-tracker.bigpoint.net/utm.gif?utmwv=6.1&utmn=1455240777&utmsr=1024x768&utmsc=32-bit&utmul=pt-br&utmje=1&utmjv=-&utmfl=10.1%2520r53&utmcr=1&utmdt=Jogos%2520online%253A%2520passe%2520umas%2520f%25E9rias%2520na%2520fazenda%2520Farmerama&utmhn=www.farmerama.bigpoint.com&utmp=/%253FareaID%253Dexternal.home&xc=aid_fallback%253D1333%253B%2520aid_fallback_info%253D935%

- What version of the product are you using? On what operating system?

Squid Cache: Version LUSCA_HEAD-r14535
configure options:  '--bindir=/usr/local/sbin' '--sbindir=/usr/local/sbin' 
'--datadir=/usr/local/etc/squid' '--libexecdir=/usr/local/libexec/squid' 
'--localstatedir=/usr/local/squid' '--sysconfdir=/usr/local/etc/squid' 
'--enable-removal-policies=lru heap' '--disable-linux-netfilter' 
'--disable-linux-tproxy' '--disable-epoll' '--with-pthreads' 
'--enable-storeio=null aufs coss' '--enable-snmp' '--enable-htcp' 
'--disable-wccp' '--enable-err-languages=English Portuguese' 
'--enable-default-err-language=English' '--prefix=/usr/local' 
'--mandir=/usr/local/man' '--infodir=/usr/local/info/' 
'--build=amd64-portbld-freebsd8.0' 'build_alias=amd64-portbld-freebsd8.0' 
'CC=cc' 'CFLAGS=-O2 -pipe  -fno-strict-aliasing' 'LDFLAGS=' 'CPPFLAGS=' 
'--with-large-files' '--enable-large-cache-files' '--enable-freebsd-tproxy' 
'--enable-follow-x-forwarded-for' '--with-aufs-threads=768' 
'--with-maxfd=65536' '--with-aio' '--disable-ident-lookups'

# uname -a
FreeBSD gateway.farmpop48 8.1-STABLE FreeBSD 8.1-STABLE #0: Wed Jul 28 23:02:17 
UTC 2010     meyer@:/usr/obj/usr/src/sys/CACHE  amd64

- Please provide any additional information below.

Some mgr:info statistics:

Average HTTP requests per minute since start:   32926.4
Select loop called: 12979652 times, 0.151 ms avg

client_http.requests = 597.268927/sec
client_http.hits = 100.306486/sec
client_http.errors = 0.000000/sec
client_http.kbytes_in = 644.418842/sec
client_http.kbytes_out = 8999.370498/sec

server.all.requests = 506.542423/sec
server.all.errors = 0.000000/sec
server.all.kbytes_in = 7060.863981/sec
server.all.kbytes_out = 632.422197/sec

aborted_requests = 6.266655/sec
cpu_time = 116.442433 seconds
wall_time = 300.000539 seconds
cpu_usage = 38.814075%

Cache information for squid:
        Request Hit Ratios:     5min: 15.9%, 60min: 15.7%
        Byte Hit Ratios:        5min: 21.8%, 60min: 21.1%
        Request Memory Hit Ratios:      5min: 30.0%, 60min: 33.2%
        Request Disk Hit Ratios:        5min: 39.3%, 60min: 38.0%
        Storage Swap size:      69273860 KB
        Storage Mem size:       132148 KB
        Mean Object Size:       67.67 KB
        Requests given to unlinkd:      0
Median Service Times (seconds)  5 min    60 min:
        HTTP Requests (All):   0.16775  0.16775
        Cache Misses:          0.19742  0.19742
        Cache Hits:            0.00000  0.00000
        Near Hits:             0.08265  0.08265
Resource usage for squid:
        UP Time:        1957.524 seconds
        CPU Time:       681.994 seconds
        CPU Usage:      34.84%
        CPU Usage, 5 minute avg:        38.27%
        CPU Usage, 60 minute avg:       34.77%
        Process Data Segment Size via sbrk(): 0 KB
        Maximum Resident Size: 596024 KB
        Page faults with physical i/o: 0
Memory accounted for:
        Total accounted:       359223 KB
        memPoolAlloc calls: 154933370
        memPoolFree calls: 151310294
File descriptor usage for squid:
        Maximum number of file descriptors:   29527
        Largest file desc currently in use:   21452
        Number of file desc currently in use: 20791
Available number of file descriptors: 8736
        Reserved number of file descriptors:   100
        Store Disk files open:                 376
        IO loop method:                     kqueue
Internal Data Structures:
        1035404 StoreEntries
         16134 StoreEntries with MemObjects
         14790 Hot Object Cache Items
        1023665 on-disk objects

Original issue reported on code.google.com by dudu.me...@gmail.com on 6 Aug 2010 at 10:01

GoogleCodeExporter commented 9 years ago
Very same issue here with a similar helper, and also on another, quite 
different setup with an ldap-policy external ACL helper I just downloaded from 
somewhere. The queue always overloads and the system has to be restarted.

Original comment by eks...@gmail.com on 8 Aug 2010 at 4:03

GoogleCodeExporter commented 9 years ago
Me too \0/ (starred).

I'm in the same trouble; it makes external ACLs a forbidden feature for me. I 
use this approach to send selective traffic to a Thunder Cache parent proxy. I 
have raised ttl and negative_ttl to 1 week (604800) and it helped a lot for a 
while. I was confident it had solved the problem, but then the first HTTP virus 
reached the network and the queue overloaded again.

I have also tried a helper that allows simultaneous checks. It failed worse, 
with the queue overloading too.

Original comment by florzinh...@gtempaccount.com on 9 Aug 2010 at 1:29

GoogleCodeExporter commented 9 years ago
Ok. The unfortunate problem? The external ACL helper code sets "queue too long" 
as being "more pending requests than number of processes configured."

This likely isn't very good. :-)

I'll investigate it a bit more, but it does look like there's plenty of space 
to improve.

You can try fixing this by bumping up the number of pending helper operations. 
Edit src/external_acl.c ; find externalAclOverload() (around line 499.) Then 
modify:

    return def->helper->stats.queue_size > def->helper->n_running;

To something like:

    return def->helper->stats.queue_size > (def->helper->n_running + 100);

That way it'll only return overload if there's more than 100 pending operations 
+ number of helpers running. I bet you can bump it up a bit more (say to 500? :)
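
As an illustration only, the two comparisons quoted above can be sketched in isolation. This is not the Lusca source: `HelperStats`, `overloadedOriginal`, and `overloadedRelaxed` are hypothetical names, and only the comparison itself mirrors the lines from externalAclOverload().

```cpp
#include <cstddef>

// Hypothetical stand-in for the helper statistics read by externalAclOverload().
struct HelperStats {
    std::size_t queue_size; // requests waiting for a free helper
    std::size_t n_running;  // helper processes currently running
};

// Original rule: overloaded as soon as the queue exceeds the helper count.
bool overloadedOriginal(const HelperStats &h) {
    return h.queue_size > h.n_running;
}

// Relaxed rule: tolerate `extra` more pending requests before rejecting.
bool overloadedRelaxed(const HelperStats &h, std::size_t extra) {
    return h.queue_size > h.n_running + extra;
}
```

With 8 helpers, the original rule reports overload once 9 requests are queued, while the relaxed rule with an allowance of 100 tolerates up to 108 queued requests.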

Original comment by adrian.c...@gmail.com on 19 Oct 2010 at 3:38

GoogleCodeExporter commented 9 years ago
Hiya,

Please try updating to r14810 and configuring "external_acl_maxqueue" in 
squid.conf to something. Say, configure 16 ACL helpers and set the maxqueue to 
256.

Original comment by adrian.c...@gmail.com on 21 Oct 2010 at 2:02

GoogleCodeExporter commented 9 years ago
I have set 8 helpers (2*hw.ncpu) and the new external_acl_maxqueue option to 
100 as per the earlier comment, and it overloaded again very early. I have 
raised it to 120, 260, and now 300. It hasn't overloaded yet, but things are 
getting slow: requests seem to be taking too long to get processed. I will 
raise the helpers to 32, which is 8*hw.ncpu; expect load averages to rise.

Feedback by tomorrow after the "internet rush hour" in GMT-3.

Any suggestions to improve the tests will be welcome.

I am using a helper similar to the one Meyer posted, but I have hard-coded the 
matches into the code to avoid unnecessary loops.
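
A possible alternative to hard-coding, sketched under assumptions: join the patterns into one alternation and compile it once at startup, so each lookup is a single search with no per-request file read or loop. `UrlMatcher` and `joinPatterns` are hypothetical names, and std::regex (C++11, default ECMAScript grammar) is assumed to be available; its syntax differs slightly from egrep's.

```cpp
#include <regex>
#include <string>
#include <vector>

// Join the patterns into one alternation, e.g. "(?:p1)|(?:p2)".
std::string joinPatterns(const std::vector<std::string> &patterns) {
    std::string combined;
    for (const std::string &p : patterns) {
        if (p.empty())
            continue; // an empty pattern would match every URL
        if (!combined.empty())
            combined += '|';
        combined += "(?:" + p + ")";
    }
    return combined;
}

// Compiles the combined pattern once; matches() is then one regex_search call.
class UrlMatcher {
public:
    explicit UrlMatcher(const std::vector<std::string> &patterns) {
        const std::string combined = joinPatterns(patterns);
        has_patterns_ = !combined.empty();
        if (has_patterns_)
            re_.assign(combined, std::regex::ECMAScript | std::regex::optimize);
    }

    // True if any of the original patterns is found in the URL.
    bool matches(const std::string &url) const {
        return has_patterns_ && std::regex_search(url, re_);
    }

private:
    bool has_patterns_ = false;
    std::regex re_;
};
```

The helper would build one `UrlMatcher` from the list file before entering its read loop and call `matches()` for every URL it receives.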

It's a 34Mbit/s link with only 12 expressions to be matched.

Original comment by eks...@gmail.com on 21 Oct 2010 at 6:35

GoogleCodeExporter commented 9 years ago
Hm. The helper code looks sensible enough. It does one read and then dequeues 
as many responses as it can from the helper. I'll go see if it's actually 
enqueueing multiple simultaneous requests to a helper.

Original comment by adrian.c...@gmail.com on 22 Oct 2010 at 12:54

GoogleCodeExporter commented 9 years ago
.. but it looks like it'll only enqueue multiple requests at once if the 
concurrency option is enabled. Perhaps that is it?

You'll have to edit your helper to work with that. I'll go do some local tests 
and get back to you.
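
For orientation, a minimal sketch of what a concurrency-aware helper might look like. It assumes Lusca follows the Squid 2.x helper protocol, where setting concurrency=N on the external_acl_type line makes each request arrive as "<channel-id> <URI>" and requires the reply to start with the same channel ID. The names `answer` and `serve` are hypothetical.

```cpp
#include <iostream>
#include <regex>
#include <string>
#include <vector>

// Build the reply for one request line of the form "<channel-id> <URI>":
// "<channel-id> OK" if any pattern is found in the URI, else "<channel-id> ERR".
std::string answer(const std::string &line,
                   const std::vector<std::regex> &patterns) {
    const std::string::size_type space = line.find(' ');
    const std::string channel = line.substr(0, space);
    const std::string uri =
        (space == std::string::npos) ? "" : line.substr(space + 1);
    for (const std::regex &re : patterns) {
        if (std::regex_search(uri, re))
            return channel + " OK";
    }
    return channel + " ERR";
}

// Read requests until the input closes, writing one reply per line.
void serve(std::istream &in, std::ostream &out,
           const std::vector<std::regex> &patterns) {
    std::string line;
    while (std::getline(in, line))
        out << answer(line, patterns) << std::endl; // endl flushes each reply
}
```

A real helper would call `serve(std::cin, std::cout, patterns)` from `main` after compiling its patterns. This sketch still answers in order; the channel ID is what would let a threaded helper reply out of order.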

Original comment by adrian.c...@gmail.com on 22 Oct 2010 at 12:58

GoogleCodeExporter commented 9 years ago
Yes, concurrency is not enabled, only multiple helpers. I will look for a 
concurrency-ready helper written in any language so I can understand what 
differs and rewrite my helper.

Original comment by dudu.me...@gmail.com on 22 Oct 2010 at 11:58

GoogleCodeExporter commented 9 years ago
BTW, eksffa is helping on my own server. One of 'em at least. So we are talking 
about the same environment.

Original comment by dudu.me...@gmail.com on 22 Oct 2010 at 12:02

GoogleCodeExporter commented 9 years ago
Good day, sir.

How can regex.c be converted to build on Linux (gcc)?

I tried to compile it on Fedora Linux and it failed:

cc regex.c -o regex
regex.c: In function 'main':
regex.c:219: error: 'struct stat' has no member named 'st_mtimespec'
regex.c:231: error: 'struct stat' has no member named 'st_mtimespec'
regex.c:232: error: 'struct stat' has no member named 'st_mtimespec'

Is regex.c only for FreeBSD (cc)?

Original comment by fahmi...@gmail.com on 23 Oct 2010 at 7:56