Open dawnbeauty opened 5 years ago
@kcudnik Are you the best person to look into this issue with suggestions?
if there are a lot of entries in redis, it can a lot of time for apply view to dump redis DB, 8 seconds in this dump, this doesent't seem to me that much critical, but not sure if we can improve much this, and its not getting only keys, but all values as well, so basically its get keys first and then each value by key
that increase of time is probably by linux kernel slowing down high usege of cpu by sycnd, since method to dump ASIC_STATE and TEMP_STATE is exactly the same, so if those sizes of dbs are the same, then time also should be the same, take a look here: https://github.com/Azure/sonic-sairedis/blob/master/syncd/syncd_applyview.cpp#L7661
So i also performed some tests on my side, and it seems like the issues is the LENGTH of the key, i changed prefix from "TEMP_" to just "T" and there is almost no difference, i have 100k ASIC_STATE entreis and 100k TASIC_STATE entries
sonic@centaur ~/sonic-sairedis/tests $ time redis-cli -n 1 "keys" "ASIC_STATE:*"|wc -l
101475
real 0m0,137s
user 0m0,046s
sys 0m0,026s
sonic@centaur ~/sonic-sairedis/tests $ time redis-cli -n 1 "keys" "TASIC_STATE:*"|wc -l
100094
real 0m0,140s
user 0m0,051s
sys 0m0,022s
sonic@centaur ~/sonic-sairedis/tests $ time redis-cli -n 1 "keys" "TEMP_ASIC_STATE:*"|wc -l
100094
real 0m0,124s
user 0m0,030s
sys 0m0,029s
suddenly when we start using keys inside eval:
sonic@centaur ~/sonic-sairedis/tests $ time redis-cli -n 1 eval 'redis.call("keys", KEYS[1].."*")' 1 ASIC_STATE
(nil)
real 0m5,554s
user 0m0,000s
sys 0m0,002s
sonic@centaur ~/sonic-sairedis/tests $ time redis-cli -n 1 eval 'redis.call("keys", KEYS[1].."*")' 1 TASIC_STATE
(nil)
real 0m5,428s
user 0m0,001s
sys 0m0,001s
sonic@centaur ~/sonic-sairedis/tests $ time redis-cli -n 1 eval 'redis.call("keys", KEYS[1].."*")' 1 TEMP_ASIC_STATE
(nil)
real 0m36,393s
user 0m0,002s
sys 0m0,001s
we get jump from 0,13 sec to over 5 seconds !!!, i have no idea why we get such a huge jump here, but when we add also hgetall for all attributes, we gain addition 1,5 seconds:
sonic@centaur ~/sonic-sairedis/tests $ time redis-cli -n 1 --eval table_dump.lua ASIC_STATE |wc -l
1
real 0m6,460s
user 0m0,017s
sys 0m0,043s
sonic@centaur ~/sonic-sairedis/tests $ time redis-cli -n 1 --eval table_dump.lua TASIC_STATE |wc -l
1
real 0m6,295s
user 0m0,017s
sys 0m0,044s
sonic@centaur ~/sonic-sairedis/tests $ time redis-cli -n 1 --eval table_dump.lua TEMP_ASIC_STATE |wc -l
1
real 0m47,449s
user 0m0,020s
sys 0m0,041s
my suspicion is that inside lua or redis, there is a structure representation for string, which is constant size like 16 chars etc, and when size is of string is longer than that, then new memory must be allocated, im not sure yet whether this is related only to table name or to entire key length, since key length ins ASIC_STATE are pretty long
sonic@centaur ~/sonic-sairedis/tests $ time redis-cli -n 1 "keys" "ASIC_STATE:*"|perl -ne 'print length($_),"\n"'|sort|uniq -c |sort -n
1 53
1 62
1 66
2 55
28 56
34 52
64 60
199 123
256 71
448 64
640 54
3600 124
21230 125
29701 127
45270 126
here with sample data we can see that most item has length with 126 chars, but with TEMP_ prefix majority of keys are over 130 characters:
sonic@centaur ~/sonic-sairedis/tests $ time redis-cli -n 1 "keys" "TEMP_ASIC_STATE:*"|perl -ne 'print length($_),"\n"'|sort|uniq -c |sort -n
1 60
1 67
28 61
32 57
32 69
199 128
3600 129
21230 130
29701 132
45270 131
i cloned https://github.com/redis/redis repo, and grepped for 128 values and there are a lot of places where this pop up, it not obvious whether this is actual cause but that would be my bet on this:
git clone https://github.com/redis/redis.git
cd redis
sonic@centaur ~/redis $ grep -rP "\b128\b"|grep -vP "\|128"|grep -v 128-bit|less|wc -l
103
but at this point is hard to tell which of those is making impact, if at all
we can change "TEMP_" to just "T" as prefix, and this will put temporary asic view, and we can get away with this for ipv4, but when we use ipv6 we wont be able to use this trick here
i also opened issue on redis repo https://github.com/redis/redis/issues/8077, so maybe it could be explained faster
Confirmed locally, that indeed this is lua hashing problem https://github.com/redis/redis/issues/8077#issuecomment-733222358 To actually solve this properly we have 2 choices here:
After merge https://github.com/redis/redis/pull/8180 this problem will go away on it's own
Unfortunately LUA in redis will not be updated since it has some backward comparability problems with new versions, but i asked for a fix in current repo and made PR here: https://github.com/redis/redis/pull/9449
ok, that PR on redis was merged, so we just need to wait for next redis package release on ubuntu
@kcudnik which official version are you referring to? Do you know if it will be released as part of Debian 11, going to SONiC 202111?
@kcudnik which official version are you referring to? Do you know if it will be released as part of Debian 11, going to SONiC 202111?
@kcudnik - Can you please refer to this question ? (Thanks in advance)
im referring to official redis server package, please refer here: https://github.com/redis/redis/pull/9449
Description
[Topo summary]: T0 with 20k ipv4 routes, each route with 2 ecmp members.
Steps to reproduce the issue
enable warm sudo config warm_restart enable system
warm restart swss/system sudo systemctl restart warm
WARMBOOT_FINALIZER found bgp did not change state to RECONCILED NOTICE root: WARMBOOT_FINALIZER : Some components didn't finish reconcile: bgp ... NOTICE root: WARMBOOT_FINALIZER : Finalizing warmboot...
Describe the results you received:
bgp did not finish reconcile. Also, other apps(e.g. lldp-syncd) will throw an error whenever they try to access db but db is busy running 'table_dump.lua'.
Describe the results you expected:
All components should change the status to reconciled.
Additional information you deem important (e.g. issue happens only occasionally):
debug
Via the syslog, we found that fpmsyncd throw a exception due to redis is busy running a script, when fpmsyncd try to set RECONCILED to state db.
we do not clearly known why redis dump TEMP_ASIC_STATE(8.13s) 7 times slower than do the same action with ASIC_STATE(1.14s), while objects of the two table are almost the same.
test more with 22694 objects in ASIC_STATE table, 22559 objects in TEMP_ASIC_STATE table
measure time of "eval" the table_dump.lua
What table_dump.lua do is:
measure the time of "keys" with lua
measure the time of "keys" without lua
The test result show us
KEYS ASIC_STATE: the fastest method KEYS TEMP_ASIC_STATE: nearly the same as KEYS ASIC_STATE KEYS ASIC_STATE with lua:~9 times slower than KEYS ASIC_STATE KEYS TEMP_ASIC_STATE with lua: the slowest method, ~65 times slower than KEYS ASIC_STATE
It seems like we should improve the dump method of table for prevent blocking redis too long. Or split each database to individual redis instances(processes).
Workaround
Currently, we can set lua-time-limit larger enough to prevent redis from sending reply
BUSY
to other client requests.runtime:
redis-cli -n 0 config set lua-time-limit 10000
redis.conf: lua-time-limit 10000