Closed krizhanovsky closed 7 years ago
Just reproduce it again on clean (restarted machine):
root@debian:~/tempesta.759/tempesta_fw/t/functional# ./run_tests.py
----------------------------------------------------------------------
Running functional tests...
----------------------------------------------------------------------
test_scheduler (sched.test_http.HttpRules) ... ok
test_scheduler (sched.test_http.HttpRulesBackupServers) ... ok
test_hash_scheduler (sched.test_hash_func.HashSchedulerFailoveredTest) ...
Tests config:
# cat tests_config.ini
[Client]
ip = 127.0.0.1
hostname = localhost
ab = ab
wrk = wrk
workdir = /tmp/client
[Tempesta]
workdir = /root/tempesta.759
ip = 127.0.0.1
hostname = localhost
user = root
config = /tmp/tempesta.conf
port = 22
[Server]
workdir = /tmp/nginx
ip = 127.0.0.1
hostname = localhost
nginx = nginx
user = root
port = 22
resources = /var/www/html/
[General]
duration = 10
concurrent_connections = 10
verbose = 1
@keshonok has found the root cause of the issue: in https://github.com/tempesta-tech/tempesta/blob/master/tempesta_fw/sched/tfw_sched_hash.c#L235 we iterate over a list of servers that have already been destroyed and freed. The list itself wasn't cleared or nulled, so the loop works with already freed memory.
TfwServer knows nothing about how to free its sched_data field, so the scheduler must be unbound from the server group before the servers are destroyed.
Fixed in 825f006
I tested the latest version of #783 after a benchmark to reproduce #692, so a Tempesta FW instance was running. Nginx in proxy mode and Apache HTTPD were also running on the same host as Tempesta FW, so the configuration is
There was no traffic load when I started the test. The test got stuck, so I had to interrupt it with SIGTERM:
And the kernel module crashed with the following (the log contains all Tempesta FW start/stop messages right after the OS boot messages):
Note that the initially loaded and the tested versions of Tempesta FW are essentially different builds. The disassembled
tfw_sched_hash_del_grp()
is shown below: