FreeRADIUS / freeradius-server

FreeRADIUS - A multi-protocol policy server.
http://freeradius.org
GNU General Public License v2.0
2.09k stars 1.07k forks source link

random segfaults when originating a coa #2304

Closed linuxchips closed 5 years ago

linuxchips commented 6 years ago

Issue type

I have been getting random segfaults when originating a coa inside accounting section (tried putting it in preacct, same issue). I was running v3.0.13 from centos repo (i usually rebuild the srpm to add a couple of modules and a custom module), I tried the 3.0.17, and the latest 3.0.x (a few hours ago) with the same results.

Defect

How to reproduce the issue

not sure if this would be reproducible by a trivial config, but my current config is simple. it goes something like this:

accounting {
...
...
    if(&Acct-Status-Type != Interim-Update) {
       update control {
           Proxy-To-Realm := "some_realm"
       }
    }
    else {
...
        update coa {
            ....
        }
...
    }
}

omitted lines mainly contain queries to database and some conditions to determine what goes in the attributes of the coa. i am only proxying non interim updates to my old freeradius v2.

the back trace on the core dump shows this

(gdb) bt
#0  coa_retransmit (request=0x7f24b004d490) at src/main/process.c:4351
#1  0x0000559e293bc723 in coa_separate (request=<optimized out>) at src/main/process.c:4533
#2  0x0000559e293bf4b1 in request_running (request=0x559e2cb829b0, action=<optimized out>) at src/main/process.c:1550
#3  0x00007f24e9c64494 in fr_event_run (el=el@entry=0x559e2bb96190, when=when@entry=0x7ffd8bbb04c0) at src/lib/event.c:309
#4  0x00007f24e9c64aa1 in fr_event_loop (el=0x559e2bb96190) at src/lib/event.c:632
#5  0x0000559e293c0b21 in radius_event_process () at src/main/process.c:5737
#6  0x0000559e29399a1f in main (argc=<optimized out>, argv=<optimized out>) at src/main/radiusd.c:587
(gdb) p *request
$1 = {number = 1731, timestamp = 1537097668, data = 0x0, listener = 0x559e2c589bb0, client = 0x559e2bebb390, packet = 0x7f24b008f080, username = 0x0, password = 0x0, reply = 0x7f24b0046420, config = 0x0,
  state_ctx = 0x7f24b004b0c0, state = 0x0, proxy_listener = 0x0, proxy = 0x7f24b0046520, proxy_reply = 0x0, home_server = 0x0, home_pool = 0x0, process = 0x0, response_delay = {tv_sec = 0, tv_usec = 0},
  timer_action = FR_ACTION_INVALID, ev = 0x0, handle = 0x0, rcode = RLM_MODULE_REJECT, module = 0x7f24e9e9f9ed "", component = 0x7f24e9ea3cf6 "<core>", delay = 0, master_state = REQUEST_ACTIVE,
  child_state = REQUEST_RUNNING, child_pid = 139793417443072, root = 0x559e295f6100 <main_config>, simul_max = 0, simul_count = 0, simul_mpp = 0, priority = RAD_LISTEN_NONE, in_request_hash = false, in_proxy_hash = false,
  num_proxied_requests = 0, num_proxied_responses = 0, server = 0x559e2b6aa840 "ELProxy", parent = 0x0, log = {func = 0x7f24e9e8d160 <vradlog_request>, lvl = L_DBG_LVL_2, indent = 0 '\000'}, options = 1, coa = 0x0,
  num_coa_requests = 0}
(gdb)

this is from the latest 3.0.x, the other versions have the same results (with line numbers different :))

i can see the problem, home_server is null. but i can not figure out why, there is no warnings and no errors on my debug output! the only thing i noticed on the debug is that the log is not complete. i.e. the log is in the middle of the "accounting" section (after update coa by a few commands) and then it crashes.

the server was run by radiusd -xxfl stdout to produce the core dumps and logs.

this happens only under load, not heavy load though, i.e. i can do this from my pc terminal (the file have a packet that would produce a coa, i.e. an interim update) $ while true; do radclient -x -f dbgRadCrash $server:$port acct $secret -r 1; done and after a few seconds the server crashes!

if I comment the "update coa" section it becomes stable. if i rebuild with --enable-developer i get assert fail in coa_separate in line 4525 on 3.0.x builds rad_assert(request->proxy_reply || request->proxy_listener);

please let me know if you need additional info, and what I can do more to debug this issue. I am willing to patch it and/or test a patch if i could. and/or any help i can give.

ps: i am still using the rpmbuild to make the packages, as currently I can only test on one of our production servers, while preparing for test server. would that affect the use of --enable-developer flag? as my core dumps still showing <optimized out> on some places when using it.

alandekok commented 6 years ago

The gdb backtrace helped enormously. I think that the patch should fix it. If so, please close the issue. The fix will be in 3.0.18.

linuxchips commented 6 years ago

hmmm, still crashing. but in here

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000564b7af890d4 in coa_no_reply (request=0x7f5cec01d000, action=<optimized out>) at src/main/process.c:4584
#2  0x0000564b7af8514c in request_handler_thread (arg=0x564b7e142310) at src/main/threads.c:826
#3  0x00007f5d45ffde25 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f5d458aabad in clone () from /lib64/libc.so.6
(gdb) frame 1
#1  0x0000564b7af890d4 in coa_no_reply (request=0x7f5cec01d000, action=<optimized out>) at src/main/process.c:4584
4584                            request->handle(request);
(gdb) p *request
$1 = {number = 735, timestamp = 1537141181, data = 0x0, listener = 0x564b7e039a20, client = 0x564b7e0e43a0, packet = 0x7f5cec01e720, username = 0x0, password = 0x0, reply = 0x7f5cec01e820, config = 0x564b7e2ae3c0, 
  state_ctx = 0x7f5cec001a80, state = 0x0, proxy_listener = 0x0, proxy = 0x7f5cec01e920, proxy_reply = 0x0, home_server = 0x0, home_pool = 0x0, process = 0x564b7af88fc0 <coa_no_reply>, response_delay = {tv_sec = 0, 
    tv_usec = 0}, timer_action = FR_ACTION_TIMER, ev = 0x564b7e13d340, handle = 0x0, rcode = RLM_MODULE_REJECT, module = 0x564b7afa65fe "", component = 0x564b7afa366c "<core>", delay = 499999, master_state = REQUEST_ACTIVE, 
  child_state = REQUEST_RUNNING, child_pid = 140035135014656, root = 0x564b7b1c3100 <main_config>, simul_max = 0, simul_count = 0, simul_mpp = 0, priority = RAD_LISTEN_NONE, in_request_hash = false, in_proxy_hash = false, 
  num_proxied_requests = 0, num_proxied_responses = 0, server = 0x564b7d15b840 "ELProxy", parent = 0x0, log = {func = 0x7f5d477ed160 <vradlog_request>, lvl = L_DBG_LVL_2, indent = 0 '\000'}, options = 1, coa = 0x0, 
  num_coa_requests = 0}
(gdb) 
linuxchips commented 6 years ago

ok, i just got the idea of capturing the traffic. sometimes the server is retransmitting an already answered coa packet!! not coinciding with the crashes though, but i thought that might indicate that something is wrong with the retransmit mechanism. or not?

linuxchips commented 6 years ago

well, i added this

diff -ruN a/src/main/process.c b/src/main/process.c
--- a/src/main/process.c        2018-09-17 03:52:12.150203910 +0300
+++ b/src/main/process.c        2018-09-17 03:52:50.757202487 +0300
@@ -4580,7 +4580,7 @@
                break;

        case FR_ACTION_RUN:
-               if (process_proxy_reply(request, NULL)) {
+               if (process_proxy_reply(request, NULL) && request->handle) {
                        request->handle(request);
                }
                request_done(request, FR_ACTION_DONE);

seams stable. been running for 10 min, which is a record tbh. not sure what handle would do if it was to be called!?

alandekok commented 6 years ago

The issue is that request->handle should never be NULL. If it is, it means that something else is going wrong.

If the server is retransmitting an already received CoA packet, then that may be the underlying cause. I'll take a look at pushing a fix.

alandekok commented 6 years ago

I've pushed some more fixes which should help with both the crash and the retransmission issue

linuxchips commented 6 years ago

thanks, i ll test them in a few minutes.

linuxchips commented 6 years ago

same thing

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x000055eda6e700e4 in coa_no_reply (request=0x7f2d380280d0, action=<optimized out>) at src/main/process.c:4582
#2  0x000055eda6e6c14c in request_handler_thread (arg=0x55eda93e0850) at src/main/threads.c:826
#3  0x00007f2dbb531e25 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f2dbaddebad in clone () from /lib64/libc.so.6
(gdb) frame 1
#1  0x000055eda6e700e4 in coa_no_reply (request=0x7f2d380280d0, action=<optimized out>) at src/main/process.c:4582
4582                            request->handle(request);
(gdb) p *request
$1 = {number = 1379, timestamp = 1537192905, data = 0x0, listener = 0x55eda926cc70, client = 0x55eda92bb320, packet = 0x7f2d3801cb40, username = 0x0
  state_ctx = 0x7f2d3800c2a0, state = 0x0, proxy_listener = 0x0, proxy = 0x7f2d38003770, proxy_reply = 0x0, home_server = 0x0, home_pool = 0x0, proc
    tv_usec = 0}, timer_action = FR_ACTION_TIMER, ev = 0x55eda976b7f0, handle = 0x0, rcode = RLM_MODULE_REJECT, module = 0x55eda6e8d63e "", componen
  child_state = REQUEST_RUNNING, child_pid = 139831443089152, root = 0x55eda70aa100 <main_config>, simul_max = 0, simul_count = 0, simul_mpp = 0, pr
  num_proxied_requests = 0, num_proxied_responses = 0, server = 0x55eda8392840 "ELProxy", parent = 0x0, log = {func = 0x7f2dbcd21160 <vradlog_reques
  num_coa_requests = 0}
(gdb) 
alandekok commented 6 years ago

Hmm... if you could do some more tests, that would help a lot:

#define DEBUG_STATE_MACHINE (1)

That will let me know what's going on inside of the server, so I know why it's doing the wrong thing.

linuxchips commented 6 years ago

hmmm... well, i added the line, the server is not crashing! but seams stuck!

linuxchips commented 6 years ago

i rerun it, did not get stuck, but still double transmitting the coa

alandekok commented 6 years ago

And the debug log shows what? I can't fix a problem when the description is "stuff goes wrong".

Please re-build and run as described in my previous comment. Nothing else will help me fix the problem.

linuxchips commented 6 years ago

i agree, i rebuilt and run as you said. its just I am not sure what to do next as it stopped crashing, and as far as the log is concerned, nothing is wrong... but the server is not seeing all the requests that reaches it, i.e. lots of packets i can see in tcpdump, but not in the logs. the logs are huge, few hundreds MBs... should i attach them here? and i also need to remove the passwords and other stuff from it before submitting it.

alandekok commented 6 years ago

If packets are being dropped, it's because the server is too slow to process them. The back-end database may be slow. Or, you may need to increase the size of the receive buffer.

See sites-available/default in the latest git version, and look for recv_buffer. Uncomment it, and change it to a large value. That may help.

linuxchips commented 6 years ago

it was the -s in the -X that caused the slow response. i just used -xxfl somefile. the backtrace is the same

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x000055603282d4af in coa_no_reply (request=0x7f7ccc0c54e0, action=<optimized out>) at src/main/process.c:4582
#2  0x000055603282924c in request_handler_thread (arg=0x55603546e740) at src/main/threads.c:826
#3  0x00007f7d27dc0e25 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f7d2766dbad in clone () from /lib64/libc.so.6
(gdb) frame 1
#1  0x000055603282d4af in coa_no_reply (request=0x7f7ccc0c54e0, action=<optimized out>) at src/main/process.c:4582
4582                            request->handle(request);
(gdb) p *request
$1 = {number = 10524, timestamp = 1537209807, data = 0x0, listener = 0x55603502b910, client = 0x556035275690, packet = 0x7f7ccc0c5780, username = 0x0, password = 0x0, reply = 0x7f7ccc0c5880, config = 0x556036b073e0, 
  state_ctx = 0x7f7ccc0c5690, state = 0x0, proxy_listener = 0x0, proxy = 0x7f7ccc0c5980, proxy_reply = 0x0, home_server = 0x0, home_pool = 0x0, process = 0x55603282d390 <coa_no_reply>, response_delay = {tv_sec = 0, 
    tv_usec = 0}, timer_action = FR_ACTION_TIMER, ev = 0x556036dfb8d0, handle = 0x0, rcode = RLM_MODULE_REJECT, module = 0x55603284b5be "", component = 0x55603284862c "<core>", delay = 499999, master_state = REQUEST_ACTIVE, 
  child_state = REQUEST_RUNNING, child_pid = 140172549052160, root = 0x556032a68100 <main_config>, simul_max = 0, simul_count = 0, simul_mpp = 0, priority = RAD_LISTEN_NONE, in_request_hash = false, in_proxy_hash = false, 
  num_proxied_requests = 0, num_proxied_responses = 0, server = 0x556034150a90 "ELProxy", parent = 0x0, log = {func = 0x7f7d295b0160 <vradlog_request>, lvl = L_DBG_LVL_2, indent = 0 '\000'}, options = 1, coa = 0x0, 
  num_coa_requests = 0}
(gdb) 

i have the log file, its 124M and github would not allow that big of an attachment. here is a link if that may help https://drive.google.com/file/d/1D269gv11rI6pf54LvGpnf7-sNK2Ib_el/view?usp=sharing

alandekok commented 6 years ago

I've pushed something which may help. At this point, it's difficult to tell what's going on.

The debug output helps, but it doesn't make it clear where the error is.

linuxchips commented 6 years ago

ok, the server is stable. i can not see any more double transmit. but i still see failed coa without actual packet being transmitted. i.e. i get "Post-Proxy-Type Fail-CoA" in the log without prior "Sent CoA-Request" message, and no packet in tcpdump. now, how can i be more of a help? mean while, i ll try to insert some more debug messages in the code, if you have an idea of where i would start it would be good. another thing, any particular system resources running out may cause this?

alandekok commented 6 years ago

That helps, thanks. I'll take a look at the Post-Proxy-Type Fail-CoA error tomorrow. Hopefully it's a simple logic bug which can be fixed.

alandekok commented 6 years ago

The issue seems to be that the accounting packet reaches max_request_time and is cancelled. When that happens, it runs Post-Proxy-Type Fail-CoA. The solution is to just cancel the CoA packet, as it was never sent anywhere. I've pushed a fix.

linuxchips commented 6 years ago

i do not think that this is what is happening here. the fail message is in the same second the packet is received. and after the fail message the processing of the packet continues. see the following packet for an example

(1651522) Tue Sep 18 16:23:11 2018: Debug: Received Accounting-Request Id 97 from 10.237.226.214:1646 to 192.168.248.255:1913 length 601
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Acct-Session-Id = "0/0/3/246.411_0A01FFD3012B4FA2"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Cisco-Account-Info = "Vinfo[userindex:14181357,accounttype:55,expiration:1538416560,]"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Cisco-Policy-Up = "_Light_Plus_IN"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Cisco-Policy-Down = "_Light_Plus_OUT"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Framed-Protocol = PPP
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Framed-IP-Address = 100.120.81.201
(1651522) Tue Sep 18 16:23:11 2018: Debug:   User-Name = "for18@nstr51"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Cisco-AVPair = "connect-progress=LAN Ses Up"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Cisco-AVPair = "nas-tx-speed=1000000000"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Cisco-AVPair = "nas-tx-speed-kbps=1000000"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Cisco-AVPair = "nas-rx-speed=1000000000"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Cisco-AVPair = "nas-rx-speed-kbps=1000000"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Acct-Session-Time = 3838
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Acct-Input-Octets = 8725831
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Acct-Output-Octets = 85695307
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Acct-Input-Packets = 42055
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Acct-Output-Packets = 69636
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Acct-Authentic = RADIUS
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Acct-Status-Type = Interim-Update
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Cisco-Service-Info = "NOPNGDNS"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Cisco-Service-Info = "NRDRCTDNS"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Calling-Station-Id = "00-27-22-ea-d2-c9"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   NAS-Port-Type = PPPoEoQinQ
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Cisco-NAS-Port = "0/0/3/246.411"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   NAS-Port = 4294922146
(1651522) Tue Sep 18 16:23:11 2018: Debug:   NAS-Port-Id = "0/0/3/246.411"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Cisco-AVPair = "client-mac-address=0027.22ea.d2c9"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Class = 0x656c3a3265303830643638656664306134323161383239626337656230303636376561
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Service-Type = Framed-User
(1651522) Tue Sep 18 16:23:11 2018: Debug:   NAS-IP-Address = 10.237.226.214
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Event-Timestamp = "Sep 18 2018 16:22:57 +03"
(1651522) Tue Sep 18 16:23:11 2018: Debug:   Acct-Delay-Time = 10
(1651522) Tue Sep 18 16:23:11 2018: Debug: # Executing section preacct from file /etc/raddb/sites-enabled/ELProxy
(1651522) Tue Sep 18 16:23:11 2018: Debug:   preacct {
(1651522) Tue Sep 18 16:23:11 2018: Debug:     if ( !(User-Name =~ /(.*)@EL(?i)team$/) && &Acct-Status-Type != Interim-Update) {
(1651522) Tue Sep 18 16:23:11 2018: Debug:     if ( !(User-Name =~ /(.*)@EL(?i)team$/) && &Acct-Status-Type != Interim-Update)  -> FALSE
(1651522) Tue Sep 18 16:23:11 2018: Debug:     else {
(1651522) Tue Sep 18 16:23:11 2018: Debug:       if ( "%{NAS-Port-Id}" =~ /([0-9]+)\/([0-9]+)\/([0-9]+)\/([0-9]*\.?[0-9]*)/ ) {
(1651522) Tue Sep 18 16:23:11 2018: Debug:       EXPAND %{NAS-Port-Id}
(1651522) Tue Sep 18 16:23:11 2018: Debug:          --> 0/0/3/246.411
(1651522) Tue Sep 18 16:23:11 2018: Debug:       if ( "%{NAS-Port-Id}" =~ /([0-9]+)\/([0-9]+)\/([0-9]+)\/([0-9]*\.?[0-9]*)/ )  -> TRUE
(1651522) Tue Sep 18 16:23:11 2018: Debug:       if ( "%{NAS-Port-Id}" =~ /([0-9]+)\/([0-9]+)\/([0-9]+)\/([0-9]*\.?[0-9]*)/ )  {
(1651522) Tue Sep 18 16:23:11 2018: Debug:         update request {
(1651522) Tue Sep 18 16:23:11 2018: Debug:           EXPAND %{1}%{2}%{3}%{4}
(1651522) Tue Sep 18 16:23:11 2018: Debug:              --> 003246.411
(1651522) Tue Sep 18 16:23:11 2018: Debug:         } # update request = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:       } # if ( "%{NAS-Port-Id}" =~ /([0-9]+)\/([0-9]+)\/([0-9]+)\/([0-9]*\.?[0-9]*)/ )  = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:       if (!(&Framed-IP-Address < 10.2.0.0/15)) {
(1651522) Tue Sep 18 16:23:11 2018: Debug:       if (!(&Framed-IP-Address < 10.2.0.0/15))  -> TRUE
(1651522) Tue Sep 18 16:23:11 2018: Debug:       if (!(&Framed-IP-Address < 10.2.0.0/15))  {
(1651522) Tue Sep 18 16:23:11 2018: Debug:         if ( "%{Cisco-Policy-Down}" =~ /([_A-Za-z]*)_OUT(_[0-9]+){0,1}/ ) {
(1651522) Tue Sep 18 16:23:11 2018: Debug:         EXPAND %{Cisco-Policy-Down}
(1651522) Tue Sep 18 16:23:11 2018: Debug:            --> _Light_Plus_OUT
(1651522) Tue Sep 18 16:23:11 2018: Debug:         if ( "%{Cisco-Policy-Down}" =~ /([_A-Za-z]*)_OUT(_[0-9]+){0,1}/ )  -> TRUE
(1651522) Tue Sep 18 16:23:11 2018: Debug:         if ( "%{Cisco-Policy-Down}" =~ /([_A-Za-z]*)_OUT(_[0-9]+){0,1}/ )  {
(1651522) Tue Sep 18 16:23:11 2018: Debug:           update control {
(1651522) Tue Sep 18 16:23:11 2018: Debug:             EXPAND %{1}
(1651522) Tue Sep 18 16:23:11 2018: Debug:                --> _Light_Plus
(1651522) Tue Sep 18 16:23:11 2018: Debug:             EXPAND %{2}
(1651522) Tue Sep 18 16:23:11 2018: Debug:                --> 
(1651522) Tue Sep 18 16:23:11 2018: Debug:             EXPAND %{User-Name}
(1651522) Tue Sep 18 16:23:11 2018: Debug:                --> for18@nstr51
(1651522) Tue Sep 18 16:23:11 2018: Debug:             SQL-User-Name set to 'for18@nstr51'
(1651522) Tue Sep 18 16:23:11 2018: Debug:             Executing select query: select QueueIdentifier from QueueVlans where VlanId='003246.411' limit 1 
(1651522) Tue Sep 18 16:23:11 2018: Debug:             EXPAND %{sql_Qid:select QueueIdentifier from QueueVlans where VlanId='%{Tmp-String-4}' limit 1 }
(1651522) Tue Sep 18 16:23:11 2018: Debug:                --> 21
(1651522) Tue Sep 18 16:23:11 2018: Debug:           } # update control = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:           if ( &control:Tmp-String-2 != "" && !( (&control:Tmp-String-2 == "0") && &control:Tmp-String-1 == "") && (&control:Tmp-String-1 != "_%{control:Tmp-String-2}") ) {
(1651522) Tue Sep 18 16:23:11 2018: Debug:           EXPAND _%{control:Tmp-String-2}
(1651522) Tue Sep 18 16:23:11 2018: Debug:              --> _21
(1651522) Tue Sep 18 16:23:11 2018: Debug:           if ( &control:Tmp-String-2 != "" && !( (&control:Tmp-String-2 == "0") && &control:Tmp-String-1 == "") && (&control:Tmp-String-1 != "_%{control:Tmp-String-2}") )  -> TRUE
(1651522) Tue Sep 18 16:23:11 2018: Debug:           if ( &control:Tmp-String-2 != "" && !( (&control:Tmp-String-2 == "0") && &control:Tmp-String-1 == "") && (&control:Tmp-String-1 != "_%{control:Tmp-String-2}") )  {
(1651522) Tue Sep 18 16:23:11 2018: Debug:             if ( &control:Tmp-String-2 == "0" ) {
(1651522) Tue Sep 18 16:23:11 2018: Debug:             if ( &control:Tmp-String-2 == "0" )  -> FALSE
(1651522) Tue Sep 18 16:23:11 2018: Debug:             else {
(1651522) Tue Sep 18 16:23:11 2018: Debug:               update control {
(1651522) Tue Sep 18 16:23:11 2018: Debug:                 EXPAND _%{control:Tmp-String-2}
(1651522) Tue Sep 18 16:23:11 2018: Debug:                    --> _21
(1651522) Tue Sep 18 16:23:11 2018: Debug:               } # update control = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:             } # else = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:             update coa {
(1651522) Tue Sep 18 16:23:11 2018: Debug:               EXPAND S%{Framed-IP-Address}
(1651522) Tue Sep 18 16:23:11 2018: Debug:                  --> S100.120.81.201
(1651522) Tue Sep 18 16:23:11 2018: Debug:               EXPAND %{NAS-IP-Address}
(1651522) Tue Sep 18 16:23:11 2018: Debug:                  --> 10.237.226.214
(1651522) Tue Sep 18 16:23:11 2018: Debug:               EXPAND subscriber:sub-qos-policy-in=%{control:Tmp-String-0}_IN%{control:Tmp-String-3}
(1651522) Tue Sep 18 16:23:11 2018: Debug:                  --> subscriber:sub-qos-policy-in=_Light_Plus_IN_21
(1651522) Tue Sep 18 16:23:11 2018: Debug:               EXPAND subscriber:sub-qos-policy-out=%{control:Tmp-String-0}_OUT%{control:Tmp-String-3}
(1651522) Tue Sep 18 16:23:11 2018: Debug:                  --> subscriber:sub-qos-policy-out=_Light_Plus_OUT_21
(1651522) Tue Sep 18 16:23:11 2018: Debug:             } # update coa = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:           } # if ( &control:Tmp-String-2 != "" && !( (&control:Tmp-String-2 == "0") && &control:Tmp-String-1 == "") && (&control:Tmp-String-1 != "_%{control:Tmp-String-2}") )  = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:         } # if ( "%{Cisco-Policy-Down}" =~ /([_A-Za-z]*)_OUT(_[0-9]+){0,1}/ )  = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:       } # if (!(&Framed-IP-Address < 10.2.0.0/15))  = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:     } # else = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:     policy acct_counters64.preacct {
(1651522) Tue Sep 18 16:23:11 2018: Debug:       update request {
(1651522) Tue Sep 18 16:23:11 2018: WARNING:         Can't find &Acct-Input-Gigawords.  Using 0 as operand value
(1651522) Tue Sep 18 16:23:11 2018: Debug:         EXPAND %{expr:(&Acct-Input-Gigawords << 32) | &Acct-Input-Octets}
(1651522) Tue Sep 18 16:23:11 2018: Debug:            --> 8725831
(1651522) Tue Sep 18 16:23:11 2018: WARNING:         Can't find &Acct-Output-Gigawords.  Using 0 as operand value
(1651522) Tue Sep 18 16:23:11 2018: Debug:         EXPAND %{expr:(&Acct-Output-Gigawords << 32) | &Acct-Output-Octets}
(1651522) Tue Sep 18 16:23:11 2018: Debug:            --> 85695307
(1651522) Tue Sep 18 16:23:11 2018: Debug:       } # update request = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:     } # policy acct_counters64.preacct = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:     policy acct_unique {
(1651522) Tue Sep 18 16:23:11 2018: Debug:       update request {
(1651522) Tue Sep 18 16:23:11 2018: Debug:       } # update request = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:       if (("%{hex:&Class}" =~ /^%{hex:&Tmp-String-9}/) &&        ("%{string:&Class}" =~ /^el:([0-9a-f]{32})/i)) {
(1651522) Tue Sep 18 16:23:11 2018: Debug:       EXPAND %{hex:&Class}
(1651522) Tue Sep 18 16:23:11 2018: Debug:          --> 656c3a3265303830643638656664306134323161383239626337656230303636376561
(1651522) Tue Sep 18 16:23:11 2018: Debug:       EXPAND ^%{hex:&Tmp-String-9}
(1651522) Tue Sep 18 16:23:11 2018: Debug:          --> ^656c3a
(1651522) Tue Sep 18 16:23:11 2018: Debug:       EXPAND %{string:&Class}
(1651522) Tue Sep 18 16:23:11 2018: Debug:          --> el:2e080d68efd0a421a829bc7eb00667ea
(1651522) Tue Sep 18 16:23:11 2018: Debug:       if (("%{hex:&Class}" =~ /^%{hex:&Tmp-String-9}/) &&        ("%{string:&Class}" =~ /^el:([0-9a-f]{32})/i))  -> TRUE
(1651522) Tue Sep 18 16:23:11 2018: Debug:       if (("%{hex:&Class}" =~ /^%{hex:&Tmp-String-9}/) &&        ("%{string:&Class}" =~ /^el:([0-9a-f]{32})/i))  {
(1651522) Tue Sep 18 16:23:11 2018: Debug:         update request {
(1651522) Tue Sep 18 16:23:11 2018: Debug:           EXPAND %{1}
(1651522) Tue Sep 18 16:23:11 2018: Debug:              --> 2e080d68efd0a421a829bc7eb00667ea
(1651522) Tue Sep 18 16:23:11 2018: Debug:         } # update request = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:       } # if (("%{hex:&Class}" =~ /^%{hex:&Tmp-String-9}/) &&            ("%{string:&Class}" =~ /^el:([0-9a-f]{32})/i))  = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:       ... skipping else: Preceding "if" was taken
(1651522) Tue Sep 18 16:23:11 2018: Debug:     } # policy acct_unique = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:   } # preacct = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug: # Executing section accounting from file /etc/raddb/sites-enabled/ELProxy
(1651522) Tue Sep 18 16:23:11 2018: Debug:   accounting {
(1651522) Tue Sep 18 16:23:11 2018: Debug:     if (&Acct-Status-Type == Accounting-On || &Acct-Status-Type == Accounting-Off) {
(1651522) Tue Sep 18 16:23:11 2018: Debug:     if (&Acct-Status-Type == Accounting-On || &Acct-Status-Type == Accounting-Off)  -> FALSE
(1651522) Tue Sep 18 16:23:11 2018: Debug:     if ( &Cisco-Account-Info =~ /Vinfo\[userindex\:([0-9]+),accounttype:([0-9]+),expiration:([0-9]+),.*\]/ ) {
(1651522) Tue Sep 18 16:23:11 2018: Debug:     if ( &Cisco-Account-Info =~ /Vinfo\[userindex\:([0-9]+),accounttype:([0-9]+),expiration:([0-9]+),.*\]/ )  -> TRUE
(1651522) Tue Sep 18 16:23:11 2018: Debug:     if ( &Cisco-Account-Info =~ /Vinfo\[userindex\:([0-9]+),accounttype:([0-9]+),expiration:([0-9]+),.*\]/ )  {
(1651522) Tue Sep 18 16:23:11 2018: Debug:       update request {
(1651522) Tue Sep 18 16:23:11 2018: Debug:         EXPAND %{2}
(1651522) Tue Sep 18 16:23:11 2018: Debug:            --> 55
(1651522) Tue Sep 18 16:23:11 2018: Debug:       } # update request = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:       update control {
(1651522) Tue Sep 18 16:23:11 2018: Debug:         EXPAND %{1}
(1651522) Tue Sep 18 16:23:11 2018: Debug:            --> 14181357
(1651522) Tue Sep 18 16:23:11 2018: Debug:         EXPAND %{3}
(1651522) Tue Sep 18 16:23:11 2018: Debug:            --> 1538416560
(1651522) Tue Sep 18 16:23:11 2018: Debug:         EXPAND %{User-Name}
(1651522) Tue Sep 18 16:23:11 2018: Debug:            --> for18@nstr51
(1651522) Tue Sep 18 16:23:11 2018: Debug:         SQL-User-Name set to 'for18@nstr51'
(1651522) Tue Sep 18 16:23:11 2018: Debug:         Executing select query:  select af.AffiliateName from UserDetails ud inner join Affiliates af on ud.AffiliateIndex = af.AffiliateIndex where ud.UserIndex=14181357
(1651522) Tue Sep 18 16:23:11 2018: Debug:         EXPAND %{%{mssql_rbs: select af.AffiliateName from UserDetails ud inner join Affiliates af on ud.AffiliateIndex = af.AffiliateIndex where ud.UserIndex=%{1}}:-none}
(1651522) Tue Sep 18 16:23:11 2018: Debug:            --> nstr16
(1651522) Tue Sep 18 16:23:11 2018: Debug:       } # update control = noop
(1651522) Tue Sep 18 16:23:11 2018: Debug:       if (&request:Tmp-String-0 == "2" || User-Name =~ /(.*)@EL(?i)team$/) {
(1651522) Tue Sep 18 16:23:11 2018: Debug:       if (&request:Tmp-String-0 == "2" || User-Name =~ /(.*)@EL(?i)team$/)  -> FALSE
(1651522) Tue Sep 18 16:23:11 2018: Debug:       else {
(1651522) Tue Sep 18 16:23:11 2018: Debug: cache_AccountName: EXPAND %{request:Tmp-String-0}
(1651522) Tue Sep 18 16:23:11 2018: Debug: cache_AccountName:    --> 55
(1651522) Tue Sep 18 16:23:11 2018: Debug: Clearing existing &reply: attributes
(1651522) Tue Sep 18 16:23:11 2018: Debug: Found Post-Proxy-Type Fail-CoA
(1651522) Tue Sep 18 16:23:11 2018: Debug: Post-Proxy-Type sub-section not found.  Ignoring.
(1651522) Tue Sep 18 16:23:11 2018: Debug: # Executing group from file /etc/raddb/sites-enabled/ELProxy
(1651522) Tue Sep 18 16:23:11 2018: Debug: cache_AccountName: Found entry for "55"
(1651522) Tue Sep 18 16:23:12 2018: Debug: cache_AccountName: Merging cache entry into request
(1651522) Tue Sep 18 16:23:12 2018: Debug: cache_AccountName:   &control:Tmp-String-0 := "Light +"
(1651522) Tue Sep 18 16:23:12 2018: Debug:         [cache_AccountName] = ok
(1651522) Tue Sep 18 16:23:12 2018: Debug:       } # else = ok
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch: Expanding URI components
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch: EXPAND http://10.1.5.70:8123
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch:    --> http://10.1.5.70:8123
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch: EXPAND /?query=INSERT%%20INTO%%20accounting.AcctPPPOE_buff%%20(NASID,AcctDate,UserId,UserIndex,AcctSessionTime,AcctInputLong,AcctOutputLong,AccountName,CallerId,CalledStationId,IsExpired,FramedIpAddress,InPolicy,OutPolicy,AcctSessionId,AcctStatusType,AcctTerminateCause,Expiration,AffilateName,StateName,EventDate)%%20VALUES%%20
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch:    --> /?query=INSERT%20INTO%20accounting.AcctPPPOE_buff%20(NASID,AcctDate,UserId,UserIndex,AcctSessionTime,AcctInputLong,AcctOutputLong,AccountName,CallerId,CalledStationId,IsExpired,FramedIpAddress,InPolicy,OutPolicy,AcctSessionId,AcctStatusType,AcctTerminateCause,Expiration,AffilateName,StateName,EventDate)%20VALUES%20
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch: Sending HTTP POST to "http://10.1.5.70:8123/?query=INSERT%20INTO%20accounting.AcctPPPOE_buff%20(NASID,AcctDate,UserId,UserIndex,AcctSessionTime,AcctInputLong,AcctOutputLong,AccountName,CallerId,CalledStationId,IsExpired,FramedIpAddress,InPolicy,OutPolicy,AcctSessionId,AcctStatusType,AcctTerminateCause,Expiration,AffilateName,StateName,EventDate)%20VALUES%20"
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch: EXPAND radius
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch:    --> radius
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch: EXPAND xxxx
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch:    --> xxxx
(1651522) Tue Sep 18 16:23:12 2018: WARNING: rest_ch: Can't find &Acct-Input-Gigawords.  Using 0 as operand value
(1651522) Tue Sep 18 16:23:12 2018: WARNING: rest_ch: Can't find &Acct-Output-Gigawords.  Using 0 as operand value
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch: EXPAND (%{integer:NAS-IP-Address},'%l','%{tolower:%{&User-Name}}',%{%{control:Tmp-Integer-0}:-0},%{Acct-Session-Time},%{expr:(&Acct-Input-Gigawords << 32) + &Acct-Input-Octets},%{expr: (&Acct-Output-Gigawords << 33) + &Acct-Output-Octets},'%{control:Tmp-String-0}','%{Calling-Station-Id}','%{NAS-Port-Id}',0,%{integer:Framed-IP-Address},'%{Cisco-Policy-Up}','%{Cisco-Policy-Down}','%{Acct-Unique-Session-Id}',%{integer:Acct-Status-Type},%{integer:Acct-Terminate-Cause},%{control:Tmp-Integer-1},'%{urlquote:%{control:Tmp-String-5}}','none','%Y-%m-%d')
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch:    --> (183362262,'1537276987','for18@nstr51',14181357,3838,8725831,85695307,'Light +','00-27-22-ea-d2-c9','0/0/3/246.411',0,1685606857,'_Light_Plus_IN','_Light_Plus_OUT','2e080d68efd0a421a829bc7eb00667ea',3,,1538416560,'nstr16','none','2018-09-18')
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch: Processing response header
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch:   Status : 200 (OK)
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch:   Type   : plain (text/plain)
(1651522) Tue Sep 18 16:23:12 2018: Debug: rest_ch: Skipping attribute processing, no valid body data received
(1651522) Tue Sep 18 16:23:12 2018: Debug:       [rest_ch] = ok
(1651522) Tue Sep 18 16:23:12 2018: Debug:     } # if ( &Cisco-Account-Info =~ /Vinfo\[userindex\:([0-9]+),accounttype:([0-9]+),expiration:([0-9]+),.*\]/ )  = ok
(1651522) Tue Sep 18 16:23:12 2018: Debug: pgsql_billing: EXPAND %{tolower:type.%{%{Acct-Status-Type}:-none}.query}
(1651522) Tue Sep 18 16:23:12 2018: Debug: pgsql_billing:    --> type.interim-update.query
(1651522) Tue Sep 18 16:23:12 2018: Debug: pgsql_billing: Using query template 'query'
(1651522) Tue Sep 18 16:23:12 2018: Debug: pgsql_billing: EXPAND %{tolower:%{User-Name}}
(1651522) Tue Sep 18 16:23:12 2018: Debug: pgsql_billing:    --> for18@nstr51
(1651522) Tue Sep 18 16:23:12 2018: Debug: pgsql_billing: SQL-User-Name set to 'for18@nstr51'
(1651522) Tue Sep 18 16:23:12 2018: Debug: pgsql_billing: EXPAND UPDATE radacct SET FramedIPAddress = NULLIF('%{Framed-IP-Address}', '')::inet, AcctSessionTime = %{%{Acct-Session-Time}:-NULL}, AcctInterval = (%l - EXTRACT(EPOCH FROM (COALESCE(AcctUpdateTime, AcctStartTime)))), AcctUpdateTime = TO_TIMESTAMP(%l), AcctInputOctets = (('%{%{Acct-Input-Gigawords}:-0}'::bigint << 32) + '%{%{Acct-Input-Octets}:-0}'::bigint), AcctOutputOctets = (('%{%{Acct-Output-Gigawords}:-0}'::bigint << 32) + '%{%{Acct-Output-Octets}:-0}'::bigint) WHERE AcctUniqueId = '%{Acct-Unique-Session-Id}' AND AcctStopTime IS NULL
(1651522) Tue Sep 18 16:23:12 2018: Debug: pgsql_billing:    --> UPDATE radacct SET FramedIPAddress = NULLIF('100.120.81.201', '')::inet, AcctSessionTime = 3838, AcctInterval = (1537276987 - EXTRACT(EPOCH FROM (COALESCE(AcctUpdateTime, AcctStartTime)))), AcctUpdateTime = TO_TIMESTAMP(1537276987), AcctInputOctets = (('0'::bigint << 32) + '8725831'::bigint), AcctOutputOctets = (('0'::bigint << 32) + '85695307'::bigint) WHERE AcctUniqueId = '2e080d68efd0a421a829bc7eb00667ea' AND AcctStopTime IS NULL
(1651522) Tue Sep 18 16:23:12 2018: Debug: pgsql_billing: Executing query: UPDATE radacct SET FramedIPAddress = NULLIF('100.120.81.201', '')::inet, AcctSessionTime = 3838, AcctInterval = (1537276987 - EXTRACT(EPOCH FROM (COALESCE(AcctUpdateTime, AcctStartTime)))), AcctUpdateTime = TO_TIMESTAMP(1537276987), AcctInputOctets = (('0'::bigint << 32) + '8725831'::bigint), AcctOutputOctets = (('0'::bigint << 32) + '85695307'::bigint) WHERE AcctUniqueId = '2e080d68efd0a421a829bc7eb00667ea' AND AcctStopTime IS NULL
(1651522) Tue Sep 18 16:23:12 2018: Debug: pgsql_billing: SQL query returned: success
(1651522) Tue Sep 18 16:23:12 2018: Debug: pgsql_billing: 1 record(s) updated
(1651522) Tue Sep 18 16:23:12 2018: Debug:     [pgsql_billing] = ok
(1651522) Tue Sep 18 16:23:12 2018: Debug: attr_filter.accounting_response: EXPAND %{User-Name}
(1651522) Tue Sep 18 16:23:12 2018: Debug: attr_filter.accounting_response:    --> for18@nstr51
(1651522) Tue Sep 18 16:23:12 2018: Debug: attr_filter.accounting_response: Matched entry DEFAULT at line 12
(1651522) Tue Sep 18 16:23:12 2018: Debug:     [attr_filter.accounting_response] = updated
(1651522) Tue Sep 18 16:23:12 2018: Debug:   } # accounting = updated
(1651522) Tue Sep 18 16:23:12 2018: Debug: Sent Accounting-Response Id 97 from 192.168.248.255:1913 to 10.237.226.214:1646 length 0
(1651522) Tue Sep 18 16:23:12 2018: Debug: Finished request
(1651522) Tue Sep 18 16:23:12 2018: Debug: Cleaning up request packet ID 66 with timestamp +2218
(1651522) Tue Sep 18 16:23:13 2018: Debug: Cleaning up request packet ID 97 with timestamp +2218

while i have the max_request_time set to 30 !

linuxchips commented 6 years ago

i ll try the fix and let you know the results

linuxchips commented 6 years ago

hi, still getting the same behavior (only with the new message now) with no apparent reason. this time the whole accounting is failing due to this, here is a snippet

(926181) Wed Sep 19 16:05:37 2018: Debug: Received Accounting-Request Id 29 from 10.1.7.138:1646 to 192.168.248.255:1913 length 606
(926181) Wed Sep 19 16:05:37 2018: Debug:   Acct-Session-Id = "0/0/0/1951.905_0A0A0A0100BAC79A"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Cisco-Account-Info = "Vinfo[userindex:13157077,accounttype:55,expiration:1538233080,]"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Cisco-Policy-Up = "_Light_Plus_IN"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Cisco-Policy-Down = "_Light_Plus_OUT"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Framed-Protocol = PPP
(926181) Wed Sep 19 16:05:37 2018: Debug:   Framed-IP-Address = 100.106.59.48
(926181) Wed Sep 19 16:05:37 2018: Debug:   User-Name = "94@gigapro"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Cisco-AVPair = "connect-progress=LAN Ses Up"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Cisco-AVPair = "nas-tx-speed=1000000000"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Cisco-AVPair = "nas-tx-speed-kbps=1000000"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Cisco-AVPair = "nas-rx-speed=1000000000"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Cisco-AVPair = "nas-rx-speed-kbps=1000000"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Acct-Session-Time = 912
(926181) Wed Sep 19 16:05:37 2018: Debug:   Acct-Input-Octets = 202707
(926181) Wed Sep 19 16:05:37 2018: Debug:   Acct-Output-Octets = 6351002
(926181) Wed Sep 19 16:05:37 2018: Debug:   Acct-Input-Packets = 3050
(926181) Wed Sep 19 16:05:37 2018: Debug:   Acct-Output-Packets = 4949
(926181) Wed Sep 19 16:05:37 2018: Debug:   Acct-Authentic = RADIUS
(926181) Wed Sep 19 16:05:37 2018: Debug:   Acct-Status-Type = Interim-Update
(926181) Wed Sep 19 16:05:37 2018: Debug:   Cisco-Service-Info = "NOPNGDNS"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Cisco-Service-Info = "NRDRCTDNS"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Calling-Station-Id = "24-a4-3c-fa-bc-4b"
(926181) Wed Sep 19 16:05:37 2018: Debug:   NAS-Port-Type = PPPoEoQinQ
(926181) Wed Sep 19 16:05:37 2018: Debug:   Cisco-NAS-Port = "0/0/0/1951.905"
(926181) Wed Sep 19 16:05:37 2018: Debug:   NAS-Port = 4294952858
(926181) Wed Sep 19 16:05:37 2018: Debug:   NAS-Port-Id = "0/0/0/1951.905"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Cisco-AVPair = "client-mac-address=24a4.3cfa.bc4b"
(926181) Wed Sep 19 16:05:37 2018: Debug:   Class = 0x656c3a3666386162303066623839653139343138646235636261613637356233383364
(926181) Wed Sep 19 16:05:37 2018: Debug:   Service-Type = Framed-User
(926181) Wed Sep 19 16:05:37 2018: Debug:   NAS-IP-Address = 10.1.7.138
(926181) Wed Sep 19 16:05:37 2018: Debug:   PMIP6-Home-HN-Prefix = 3730:4237:4435::/53
(926181) Wed Sep 19 16:05:37 2018: Debug:   Acct-Delay-Time = 20
(926181) Wed Sep 19 16:05:37 2018: Debug: # Executing section preacct from file /etc/raddb/sites-enabled/ELProxy
(926181) Wed Sep 19 16:05:37 2018: Debug:   preacct {
(926181) Wed Sep 19 16:05:37 2018: Debug:     if ( !(User-Name =~ /(.*)@EL(?i)team$/) && &Acct-Status-Type != Interim-Update) {
(926181) Wed Sep 19 16:05:37 2018: Debug:     if ( !(User-Name =~ /(.*)@EL(?i)team$/) && &Acct-Status-Type != Interim-Update)  -> FALSE
(926181) Wed Sep 19 16:05:37 2018: Debug:     else {
...
...
(926181) Wed Sep 19 16:05:37 2018: Debug:             update coa {
(926181) Wed Sep 19 16:05:37 2018: Debug:               EXPAND S%{Framed-IP-Address}
(926181) Wed Sep 19 16:05:37 2018: Debug:                  --> S100.106.59.48
(926181) Wed Sep 19 16:05:37 2018: Debug:               &Cisco-Account-Info := S100.106.59.48
(926181) Wed Sep 19 16:05:37 2018: Debug:               EXPAND %{NAS-IP-Address}
(926181) Wed Sep 19 16:05:37 2018: Debug:                  --> 10.1.7.138
(926181) Wed Sep 19 16:05:37 2018: Debug:               &NAS-IP-Address := 10.1.7.138
(926181) Wed Sep 19 16:05:37 2018: Debug:               EXPAND subscriber:sub-qos-policy-in=%{control:Tmp-String-0}_IN%{control:Tmp-String-3}
(926181) Wed Sep 19 16:05:37 2018: Debug:                  --> subscriber:sub-qos-policy-in=_Light_Plus_IN_22
(926181) Wed Sep 19 16:05:37 2018: Debug:               &Cisco-AVPair += subscriber:sub-qos-policy-in=_Light_Plus_IN_22
(926181) Wed Sep 19 16:05:37 2018: Debug:               EXPAND subscriber:sub-qos-policy-out=%{control:Tmp-String-0}_OUT%{control:Tmp-String-3}
(926181) Wed Sep 19 16:05:37 2018: Debug:                  --> subscriber:sub-qos-policy-out=_Light_Plus_OUT_22
(926181) Wed Sep 19 16:05:37 2018: Debug:               &Cisco-AVPair += subscriber:sub-qos-policy-out=_Light_Plus_OUT_22
(926181) Wed Sep 19 16:05:37 2018: Debug:             } # update coa = noop
...
...
(926181) Wed Sep 19 16:05:37 2018: Debug:   } # preacct = noop
(926181) Wed Sep 19 16:05:37 2018: Debug: # Executing section accounting from file /etc/raddb/sites-enabled/ELProxy
(926181) Wed Sep 19 16:05:37 2018: Debug:   accounting {
(926181) Wed Sep 19 16:05:37 2018: Debug:     if (&Acct-Status-Type == Accounting-On || &Acct-Status-Type == Accounting-Off) {
(926181) Wed Sep 19 16:05:37 2018: Debug:     if (&Acct-Status-Type == Accounting-On || &Acct-Status-Type == Accounting-Off)  -> FALSE
(926181) Wed Sep 19 16:05:37 2018: Debug:     if ( &Cisco-Account-Info =~ /Vinfo\[userindex\:([0-9]+),accounttype:([0-9]+),expiration:([0-9]+),.*\]/ ) {
(926181) Wed Sep 19 16:05:37 2018: Debug:     if ( &Cisco-Account-Info =~ /Vinfo\[userindex\:([0-9]+),accounttype:([0-9]+),expiration:([0-9]+),.*\]/ )  -> TRUE
(926181) Wed Sep 19 16:05:37 2018: Debug:     if ( &Cisco-Account-Info =~ /Vinfo\[userindex\:([0-9]+),accounttype:([0-9]+),expiration:([0-9]+),.*\]/ )  {
(926181) Wed Sep 19 16:05:37 2018: Debug:       update request {
(926181) Wed Sep 19 16:05:37 2018: Debug:         EXPAND %{2}
(926181) Wed Sep 19 16:05:37 2018: Debug:            --> 55
(926181) Wed Sep 19 16:05:37 2018: Debug:         &Tmp-String-0 := 55
(926181) Wed Sep 19 16:05:37 2018: Debug:       } # update request = noop
(926181) Wed Sep 19 16:05:37 2018: Debug:       update control {
(926181) Wed Sep 19 16:05:37 2018: Debug:         EXPAND %{1}
(926181) Wed Sep 19 16:05:37 2018: Debug:            --> 13157077
(926181) Wed Sep 19 16:05:37 2018: Debug:         &Tmp-Integer-0 := 13157077
(926181) Wed Sep 19 16:05:37 2018: Debug:         EXPAND %{3}
(926181) Wed Sep 19 16:05:37 2018: Debug:            --> 1538233080
(926181) Wed Sep 19 16:05:37 2018: Debug:         &Tmp-Integer-1 := 1538233080
(926181) Wed Sep 19 16:05:38 2018: Debug:         EXPAND %{User-Name}
(926181) Wed Sep 19 16:05:38 2018: Debug:            --> 94@gigapro
(926181) Wed Sep 19 16:05:38 2018: Debug:         SQL-User-Name set to '94@gigapro'
(926181) Wed Sep 19 16:05:38 2018: Debug:         Executing select query:  select af.AffiliateName from UserDetails ud inner join Affiliates af on ud.AffiliateIndex = af.AffiliateIndex where ud.UserIndex=13157077
(926181) Wed Sep 19 16:05:38 2018: Debug:         EXPAND %{%{mssql_rbs: select af.AffiliateName from UserDetails ud inner join Affiliates af on ud.AffiliateIndex = af.AffiliateIndex where ud.UserIndex=%{1}}:-none}
(926181) Wed Sep 19 16:05:38 2018: Debug:            --> gigapro
(926181) Wed Sep 19 16:05:38 2018: Debug:         &Tmp-String-5 := gigapro
(926181) Wed Sep 19 16:05:38 2018: Debug:       } # update control = noop
(926181) Wed Sep 19 16:05:38 2018: Debug:       if (&request:Tmp-String-0 == "2" || User-Name =~ /(.*)@EL(?i)team$/) {
(926181) Wed Sep 19 16:05:38 2018: Debug:       if (&request:Tmp-String-0 == "2" || User-Name =~ /(.*)@EL(?i)team$/)  -> FALSE
(926181) Wed Sep 19 16:05:38 2018: Debug:       else {
(926181) Wed Sep 19 16:05:38 2018: Debug:         modsingle[accounting]: calling cache_AccountName (rlm_cache)
(926181) Wed Sep 19 16:05:38 2018: Debug: cache_AccountName: EXPAND %{request:Tmp-String-0}
(926181) Wed Sep 19 16:05:38 2018: Debug: cache_AccountName:    --> 55
(926181) Wed Sep 19 16:05:39 2018: Debug: No home server for CoA packet.  Failing it.
(926181) Wed Sep 19 16:05:39 2018: Debug: Clearing existing &reply: attributes
(926181) Wed Sep 19 16:05:39 2018: Debug: Found Post-Proxy-Type Fail-CoA
(926181) Wed Sep 19 16:05:39 2018: Debug: Post-Proxy-Type sub-section not found.  Ignoring.
(926181) Wed Sep 19 16:05:39 2018: Debug: # Executing group from file /etc/raddb/sites-enabled/ELProxy
(926181) Wed Sep 19 16:05:39 2018: Debug: cache_AccountName: Mutex acquired
(926181) Wed Sep 19 16:05:39 2018: Debug: cache_AccountName: Found entry for "55"
(926181) Wed Sep 19 16:05:39 2018: Debug: cache_AccountName: Merging cache entry into request
(926181) Wed Sep 19 16:05:39 2018: Debug: cache_AccountName:   &control:Tmp-String-0 := "Light +"
(926181) Wed Sep 19 16:05:39 2018: Debug: cache_AccountName: Mutex released
(926181) Wed Sep 19 16:05:39 2018: WARNING: Module rlm_cache became unblocked
(926181) Wed Sep 19 16:05:39 2018: Debug:         modsingle[accounting]: returned from cache_AccountName (rlm_cache)
(926181) Wed Sep 19 16:05:39 2018: Debug:         [cache_AccountName] = ok
(926181) Wed Sep 19 16:05:39 2018: Debug:       } # else = ok
(926181) Wed Sep 19 16:05:39 2018: Debug:     } # if ( &Cisco-Account-Info =~ /Vinfo\[userindex\:([0-9]+),accounttype:([0-9]+),expiration:([0-9]+),.*\]/ )  = fail
(926181) Wed Sep 19 16:05:39 2018: Debug:   } # accounting = fail
(926181) Wed Sep 19 16:05:40 2018: Debug: Cleaning up request packet ID 229 with timestamp +1247
(926181) Wed Sep 19 16:05:59 2018: Debug: Cleaning up request packet ID 29 with timestamp +1247

before the end of the accounting there should be two more modules to be called, rlm_rest and rlm_sql, which they did not have the chance to. I think because the acounting had "failed" with no good reason.

my radius.conf looks like this:

prefix = /usr
exec_prefix = /usr
sysconfdir = /etc
localstatedir = /var
sbindir = /usr/sbin
logdir = ${localstatedir}/log/radius
raddbdir = ${sysconfdir}/raddb
radacctdir = ${logdir}/radacct
name = radiusd
confdir = ${raddbdir}
modconfdir = ${confdir}/mods-config
certdir = ${confdir}/certs
cadir   = ${confdir}/certs
run_dir = ${localstatedir}/run/${name}
db_dir = ${raddbdir}
libdir = /usr/lib64/freeradius
pidfile = ${run_dir}/${name}.pid
correct_escapes = true
max_request_time = 60
cleanup_delay = 15
max_requests = 196608
hostname_lookups = no
log {
        destination = files
        colourise = yes
        file = ${logdir}/radius.log
        syslog_facility = daemon
        stripped_names = no
        auth = no
        auth_badpass = no
        auth_goodpass = no
        msg_denied = "You are already logged in - access denied"
}
checkrad = ${sbindir}/checkrad
security {
        allow_core_dumps = yes
        max_attributes = 200
        reject_delay = 1
        status_server = yes
        allow_vulnerable_openssl = no
}
proxy_requests  = yes
$INCLUDE proxy.conf
$INCLUDE clients.conf
thread pool {
        start_servers = 32
        max_servers = 128
        min_spare_servers = 10
        max_spare_servers = 20
        max_queue_size = 1048576
        max_requests_per_server = 0
        auto_limit_acct = no
}
modules {
        $INCLUDE mods-enabled/
}
instantiate {
}
policy {
        $INCLUDE policy.d/
}
$INCLUDE sites-enabled/

should I file another bug report for this one?

alandekok commented 6 years ago

the weird thing is that the CoA packet is being sent in the middle of processing the accounting packet:

(926181) Wed Sep 19 16:05:38 2018: Debug: cache_AccountName: EXPAND %{request:Tmp-String-0}
(926181) Wed Sep 19 16:05:38 2018: Debug: cache_AccountName:    --> 55
(926181) Wed Sep 19 16:05:39 2018: Debug: No home server for CoA packet.  Failing it.
(926181) Wed Sep 19 16:05:39 2018: Debug: Clearing existing &reply: attributes

That's not supposed to happen.

alandekok commented 5 years ago

Something is catastrophically wrong with the system. It should not be running the CoA portion in the middle of running the accounting packet.

mcnewton commented 5 years ago

@linuxchips can you try latest v3.0.x HEAD please. We've been debugging this and think it may be fixed.

mcnewton commented 5 years ago

We believe this is fixed now.