harmony-one / harmony-ops

Harmony Ops Master Repository.
MIT License

Lift 1,000-node limitation on bootnode #113

Closed by harmony-ek 4 years ago

harmony-ek commented 5 years ago

List of bootnode processes and their NOFILE soft/hard limits (first two columns):

$ for addr in $(cat ../configs/benchmark-*.json | jq -r '[.bootnode.server, .bootnode1.server, .bootnode3.server, .bootnode4.server] | .[] | select(. != null)' | sort -u); do echo "===> ${addr}"; hssh "${addr}" - 'for pid in $(pgrep bootnode); do sudo prlimit --noheadings --output=SOFT,HARD --pid="${pid}" --nofile | tr "\\n" " "; ps lwwp"${pid}" | tail -1; done'; done
===> 100.26.90.187
1024 4096 4     0  9803  9802  20   0 744144 31840 -      Sl   ?         15:00 ./bootnode -ip 100.26.90.187 -port 9873 -key bootnode-9873.key
1024 4096 4     0 13300 13299  20   0 3053724 105040 -    Sl   ?         40:24 ./bootnode -ip 100.26.90.187 -port 9868 -key bootnode-9868.key
1024 4096 4     0 16043 16042  20   0 743036 25544 -      Sl   ?        162:12 ./bootnode -ip 100.26.90.187 -port 9880 -key bootnode-9880.key
1024 4096 4     0 17001 17000  20   0 744400 33096 -      Sl   ?         14:38 ./bootnode -ip 100.26.90.187 -port 9876 -key bootnode-9876.key -log_conn
1024 4096 4     0 18427 18426  20   0 742836 26004 -      Sl   ?         13:53 ./bootnode -ip 100.26.90.187 -port 9872 -key bootnode-9872.key
1024 4096 4     0 20247 20246  20   0 890756 29804 -      Sl   ?         24:18 ./bootnode -ip 100.26.90.187 -port 9879 -key bootnode-9879.key -log_conn
1024 4096 4     0 21030 21029  20   0 743460 44568 -      Sl   ?         20:18 ./bootnode -ip 100.26.90.187 -port 9882 -key bootnode-9882.key
1024 4096 4     0 22490 22489  20   0 2980024 113376 -    Sl   ?         28:00 ./bootnode -ip 100.26.90.187 -port 9877 -key bootnode-9877.key
1024 4096 4     0 26555 26554  20   0 2980064 118528 -    Sl   ?          2:34 ./bootnode -ip 100.26.90.187 -port 9875 -key bootnode-9875.key
1024 4096 4     0 26559 26558  20   0 886508 100152 -     Sl   ?         17:31 ./bootnode -ip 100.26.90.187 -port 9842 -key bootnode-9842.key -log_conn
1024 4096 4     0 26902 26901  20   0 2979864 116640 -    Sl   ?          0:05 ./bootnode -ip 100.26.90.187 -port 9871 -key bootnode-9871.key
1024 4096 4     0 27922 27921  20   0 818036 28704 -      Sl   ?         21:56 ./bootnode -ip 100.26.90.187 -port 9870 -key bootnode-9870.key -log_conn
4096 4096 4     0 31221 31220  20   0 1030352 190148 -    Sl   ?        1336:48 ./bootnode -ip 100.26.90.187 -port 9874 -key bootnode-9874.key -log_conn
===> 13.113.101.219
1024 4096 4     0  2119  2118  20   0 3054396 76688 -     Sl   ?         52:41 ./bootnode -ip 13.113.101.219 -port 12018 -key bootnode-12018.key -log_conn
1024 4096 4     0  2250  2249  20   0 743400 10640 -      Sl   ?          6:45 ./bootnode -ip 13.113.101.219 -port 9868 -key bootnode-9868.key -log_conn
4096 4096 4     0 10242 10241  20   0 956556 181000 -     Sl   ?        1667:31 ./bootnode -ip 13.113.101.219 -port 12019 -key bootnode-12019.key -log_conn
1024 4096 4     0 11987 11986  20   0 956160 154572 -     Sl   ?        551:37 ./bootnode -ip 13.113.101.219 -port 9867 -key bootnode-9867.key -log_conn
1024 4096 4     0 20996 20995  20   0 812776 38584 -      Sl   ?         20:18 ./bootnode -ip 13.113.101.219 -port 9842 -key bootnode-9842.key -log_conn
1024 4096 4     0 24610 24609  20   0 744400 20820 -      Sl   ?         15:49 ./bootnode -ip 13.113.101.219 -port 13019 -key bootnode-13019.key -log_conn
===> 52.40.84.2
1024 4096 4     0  7478  7477  20   0 3050276 193040 -    Sl   ?         49:51 ./bootnode -ip 52.40.84.2 -port 9889 -key bootnode-9889.key -log_conn
1024 4096 4     0 23954 23953  20   0 3192820 227468 -    Sl   ?        227:36 ./bootnode -ip 52.40.84.2 -port 9867 -key bootnode-9867.key -log_conn
===> 54.213.43.194
1024 4096 4     0   394   393  20   0 742836 25724 -      Sl   ?         13:01 ./bootnode -ip 54.213.43.194 -port 9872 -key bootnode-9872.key
1024 4096 4     0  2657  2656  20   0 2979768 108164 -    Sl   ?         26:42 ./bootnode -ip 54.213.43.194 -port 9877 -key bootnode-9877.key
1024 4096 4     0  3812  3811  20   0 744240 28420 -      Sl   ?         21:37 ./bootnode -ip 54.213.43.194 -port 9870 -key bootnode-9870.key -log_conn
1024 4096 4     0  5516  5515  20   0 886508 39244 -      Sl   ?         17:16 ./bootnode -ip 54.213.43.194 -port 9842 -key bootnode-9842.key -log_conn
4096 4096 4     0  7148  7147  20   0 1030096 186336 -    Sl   ?        1249:10 ./bootnode -ip 54.213.43.194 -port 9874 -key bootnode-9874.key -log_conn
1024 4096 4     0  8509  8508  20   0 2980064 115924 -    Sl   ?          2:18 ./bootnode -ip 54.213.43.194 -port 9875 -key bootnode-9875.key
1024 4096 4     0  8979  8978  20   0 3053852 112152 -    Sl   ?          0:04 ./bootnode -ip 54.213.43.194 -port 9871 -key bootnode-9871.key
1024 4096 4     0 18959 18958  20   0 743952 27484 -      Sl   ?         14:10 ./bootnode -ip 54.213.43.194 -port 9873 -key bootnode-9873.key
1024 4096 4     0 22620 22619  20   0 816768 24096 -      Sl   ?        187:15 ./bootnode -ip 54.213.43.194 -port 9880 -key bootnode-9880.key
1024 4096 4     0 26011 26010  20   0 744656 31548 -      Sl   ?         16:06 ./bootnode -ip 54.213.43.194 -port 9876 -key bootnode-9876.key -log_conn
1024 4096 4     0 27629 27628  20   0 2979992 105852 -    Sl   ?         36:44 ./bootnode -ip 54.213.43.194 -port 9868 -key bootnode-9868.key
1024 4096 4     0 28700 28699  20   0 817280 30628 -      Sl   ?         24:54 ./bootnode -ip 54.213.43.194 -port 9879 -key bootnode-9879.key -log_conn
1024 4096 4     0 29153 29152  20   0 2978168 25700 -     Sl   ?          5:10 ./bootnode -ip 54.213.43.194 -port 9869 -key bootnode-9869.key
1024 4096 4     0 30041 30040  20   0 743716 45928 -      Sl   ?         19:43 ./bootnode -ip 54.213.43.194 -port 9882 -key bootnode-9882.key
===> 54.86.126.90
1024 4096 4     0  7914  7913  20   0 3050276 191736 -    Sl   ?         45:13 ./bootnode -ip 54.86.126.90 -port 9889 -key bootnode-9889.key -log_conn
===> 99.81.170.167
4096 4096 4     0  3223  3222  20   0 956556 182532 -     Sl   ?        1418:14 ./bootnode -ip 99.81.170.167 -port 12019 -key bootnode-12019.key -log_conn
1024 4096 4     0  6472  6471  20   0 744400 21824 -      Sl   ?         16:48 ./bootnode -ip 99.81.170.167 -port 13019 -key bootnode-13019.key -log_conn
1024 4096 4     0 16228 16227  20   0 743400 18212 -      Sl   ?          6:44 ./bootnode -ip 99.81.170.167 -port 9868 -key bootnode-9868.key -log_conn
1024 4096 4     0 24224 24223  20   0 956160 158332 -     Sl   ?        576:58 ./bootnode -ip 99.81.170.167 -port 9867 -key bootnode-9867.key -log_conn
1024 4096 4     0 26985 26984  20   0 886508 100088 -     Sl   ?         21:37 ./bootnode -ip 99.81.170.167 -port 9842 -key bootnode-9842.key -log_conn
===> jenkins.harmony.one
1024 4096 4     0   325   324  20   0 1184824 18684 -     Sl   ?         56:34 ./bootnode -ip 54.183.5.66 -port 9872 -key bootnode-9872.key
1024 4096 4     0  4367  4366  20   0 1258568 17360 -     Sl   ?         55:16 ./bootnode -ip jenkins.harmony.one -port 9874 -key bootnode-9874.key
1024 4096 4     0  9793  9792  20   0 1258548 11608 -     Sl   ?         55:19 ./bootnode -ip jenkins.harmony.one -port 9876 -key bootnode-9876.key
1024 4096 4     0 17462 17460  20   0 1184852 20160 -     Sl   ?         58:14 ./bootnode -ip 54.183.5.66 -port 9875 -key bootnode-9875.key
1024 4096 4     0 22417 22415  20   0 1112460 20880 -     Sl   ?         56:32 ./bootnode -ip jenkins.harmony.one -port 9873 -key bootnode-9873.key

Four processes (mainnet) are already at 4096/4096; all others have a soft limit of 1024 and a hard limit of 4096. Raising the soft limit of the rest to 4096.
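To pick out only the processes that still need raising, the audit output above can be filtered. The following is a minimal sketch, assuming the line layout shown in the listing (soft limit in field 1, PID in field 5); the function name `pids_below_target` is hypothetical and not part of harmony-ops.

```shell
# Hypothetical helper: read audit lines ("SOFT HARD F UID PID ...") on stdin
# and print the PID of every process whose soft NOFILE limit is below the target.
# Host header lines ("===> addr") are skipped because field 1 is not numeric.
pids_below_target() {
  target="${1:-4096}"
  awk -v target="${target}" '$1 ~ /^[0-9]+$/ && ($1 + 0) < (target + 0) { print $5 }'
}
```

Fed the audit output for a single host, this prints one PID per line; each PID could then be passed to `sudo prlimit --pid=PID --nofile=4096` on that host.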

harmony-ek commented 5 years ago

Bumped up to 4096/4096:

$ for addr in $(cat ../configs/benchmark-*.json | jq -r '[.bootnode.server, .bootnode1.server, .bootnode3.server, .bootnode4.server] | .[] | select(. != null)' | sort -u); do echo "===> ${addr}"; hssh "${addr}" - 'for pid in $(pgrep bootnode); do sudo prlimit --noheadings --output=SOFT,HARD --pid="${pid}" --nofile=4096; done'; done
===> 100.26.90.187
===> 13.113.101.219
===> 52.40.84.2
===> 54.213.43.194
===> 54.86.126.90
===> 99.81.170.167
===> jenkins.harmony.one

Checking the new values.

harmony-ek commented 5 years ago

All bootnodes now have 4096 soft/hard caps:

$ for addr in $(cat ../configs/benchmark-*.json | jq -r '[.bootnode.server, .bootnode1.server, .bootnode3.server, .bootnode4.server] | .[] | select(. != null)' | sort -u); do echo "===> ${addr}"; hssh "${addr}" - 'for pid in $(pgrep bootnode); do sudo prlimit --noheadings --output=SOFT,HARD --pid="${pid}" --nofile | tr "\\n" " "; ps lwwp"${pid}" | tail -1; done'; done
===> 100.26.90.187
4096 4096 4     0  9803  9802  20   0 744144 31840 -      Sl   ?         15:00 ./bootnode -ip 100.26.90.187 -port 9873 -key bootnode-9873.key
4096 4096 4     0 13300 13299  20   0 3053724 105040 -    Sl   ?         40:25 ./bootnode -ip 100.26.90.187 -port 9868 -key bootnode-9868.key
4096 4096 4     0 16043 16042  20   0 743036 25544 -      Sl   ?        162:12 ./bootnode -ip 100.26.90.187 -port 9880 -key bootnode-9880.key
4096 4096 4     0 17001 17000  20   0 744400 33096 -      Sl   ?         14:38 ./bootnode -ip 100.26.90.187 -port 9876 -key bootnode-9876.key -log_conn
4096 4096 4     0 18427 18426  20   0 742836 26004 -      Sl   ?         13:54 ./bootnode -ip 100.26.90.187 -port 9872 -key bootnode-9872.key
4096 4096 4     0 20247 20246  20   0 890756 29804 -      Sl   ?         24:18 ./bootnode -ip 100.26.90.187 -port 9879 -key bootnode-9879.key -log_conn
4096 4096 4     0 21030 21029  20   0 743460 44568 -      Sl   ?         20:19 ./bootnode -ip 100.26.90.187 -port 9882 -key bootnode-9882.key
4096 4096 4     0 22490 22489  20   0 2980024 113376 -    Sl   ?         28:01 ./bootnode -ip 100.26.90.187 -port 9877 -key bootnode-9877.key
4096 4096 4     0 26555 26554  20   0 2980064 118528 -    Sl   ?          2:34 ./bootnode -ip 100.26.90.187 -port 9875 -key bootnode-9875.key
4096 4096 4     0 26559 26558  20   0 886508 100152 -     Sl   ?         17:31 ./bootnode -ip 100.26.90.187 -port 9842 -key bootnode-9842.key -log_conn
4096 4096 4     0 26902 26901  20   0 2979864 116640 -    Sl   ?          0:06 ./bootnode -ip 100.26.90.187 -port 9871 -key bootnode-9871.key
4096 4096 4     0 27922 27921  20   0 818036 28704 -      Sl   ?         21:56 ./bootnode -ip 100.26.90.187 -port 9870 -key bootnode-9870.key -log_conn
4096 4096 4     0 31221 31220  20   0 1030352 190148 -    Sl   ?        1336:53 ./bootnode -ip 100.26.90.187 -port 9874 -key bootnode-9874.key -log_conn
===> 13.113.101.219
4096 4096 4     0  2119  2118  20   0 3054396 76680 -     Sl   ?         52:42 ./bootnode -ip 13.113.101.219 -port 12018 -key bootnode-12018.key -log_conn
4096 4096 4     0  2250  2249  20   0 743400 10624 -      Sl   ?          6:45 ./bootnode -ip 13.113.101.219 -port 9868 -key bootnode-9868.key -log_conn
4096 4096 4     0 10242 10241  20   0 956556 181000 -     Sl   ?        1667:36 ./bootnode -ip 13.113.101.219 -port 12019 -key bootnode-12019.key -log_conn
4096 4096 4     0 11987 11986  20   0 956160 153940 -     Sl   ?        551:41 ./bootnode -ip 13.113.101.219 -port 9867 -key bootnode-9867.key -log_conn
4096 4096 4     0 20996 20995  20   0 812776 38584 -      Sl   ?         20:18 ./bootnode -ip 13.113.101.219 -port 9842 -key bootnode-9842.key -log_conn
4096 4096 4     0 24610 24609  20   0 744400 20820 -      Sl   ?         15:49 ./bootnode -ip 13.113.101.219 -port 13019 -key bootnode-13019.key -log_conn
===> 52.40.84.2
4096 4096 4     0  7478  7477  20   0 3050276 193304 -    Sl   ?         49:53 ./bootnode -ip 52.40.84.2 -port 9889 -key bootnode-9889.key -log_conn
4096 4096 4     0 23954 23953  20   0 3192820 227468 -    Sl   ?        227:40 ./bootnode -ip 52.40.84.2 -port 9867 -key bootnode-9867.key -log_conn
===> 54.213.43.194
4096 4096 4     0   394   393  20   0 742836 25724 -      Sl   ?         13:02 ./bootnode -ip 54.213.43.194 -port 9872 -key bootnode-9872.key
4096 4096 4     0  2657  2656  20   0 2979768 108164 -    Sl   ?         26:42 ./bootnode -ip 54.213.43.194 -port 9877 -key bootnode-9877.key
4096 4096 4     0  3812  3811  20   0 744240 28420 -      Sl   ?         21:37 ./bootnode -ip 54.213.43.194 -port 9870 -key bootnode-9870.key -log_conn
4096 4096 4     0  5516  5515  20   0 886508 39244 -      Sl   ?         17:16 ./bootnode -ip 54.213.43.194 -port 9842 -key bootnode-9842.key -log_conn
4096 4096 4     0  7148  7147  20   0 1030096 186336 -    Sl   ?        1249:15 ./bootnode -ip 54.213.43.194 -port 9874 -key bootnode-9874.key -log_conn
4096 4096 4     0  8509  8508  20   0 2980064 115924 -    Sl   ?          2:18 ./bootnode -ip 54.213.43.194 -port 9875 -key bootnode-9875.key
4096 4096 4     0  8979  8978  20   0 3053852 112152 -    Sl   ?          0:05 ./bootnode -ip 54.213.43.194 -port 9871 -key bootnode-9871.key
4096 4096 4     0 18959 18958  20   0 743952 27484 -      Sl   ?         14:10 ./bootnode -ip 54.213.43.194 -port 9873 -key bootnode-9873.key
4096 4096 4     0 22620 22619  20   0 816768 24096 -      Sl   ?        187:15 ./bootnode -ip 54.213.43.194 -port 9880 -key bootnode-9880.key
4096 4096 4     0 26011 26010  20   0 744656 31548 -      Sl   ?         16:06 ./bootnode -ip 54.213.43.194 -port 9876 -key bootnode-9876.key -log_conn
4096 4096 4     0 27629 27628  20   0 2979992 105852 -    Sl   ?         36:45 ./bootnode -ip 54.213.43.194 -port 9868 -key bootnode-9868.key
4096 4096 4     0 28700 28699  20   0 817280 30628 -      Sl   ?         24:54 ./bootnode -ip 54.213.43.194 -port 9879 -key bootnode-9879.key -log_conn
4096 4096 4     0 29153 29152  20   0 2978168 25700 -     Sl   ?          5:10 ./bootnode -ip 54.213.43.194 -port 9869 -key bootnode-9869.key
4096 4096 4     0 30041 30040  20   0 743716 45928 -      Sl   ?         19:43 ./bootnode -ip 54.213.43.194 -port 9882 -key bootnode-9882.key
===> 54.86.126.90
4096 4096 4     0  7914  7913  20   0 3050276 191736 -    Sl   ?         45:15 ./bootnode -ip 54.86.126.90 -port 9889 -key bootnode-9889.key -log_conn
===> 99.81.170.167
4096 4096 4     0  3223  3222  20   0 956556 182532 -     Sl   ?        1418:19 ./bootnode -ip 99.81.170.167 -port 12019 -key bootnode-12019.key -log_conn
4096 4096 4     0  6472  6471  20   0 744400 21824 -      Sl   ?         16:48 ./bootnode -ip 99.81.170.167 -port 13019 -key bootnode-13019.key -log_conn
4096 4096 4     0 16228 16227  20   0 743400 18212 -      Sl   ?          6:44 ./bootnode -ip 99.81.170.167 -port 9868 -key bootnode-9868.key -log_conn
4096 4096 4     0 24224 24223  20   0 956160 158332 -     Sl   ?        577:02 ./bootnode -ip 99.81.170.167 -port 9867 -key bootnode-9867.key -log_conn
4096 4096 4     0 26985 26984  20   0 886508 100088 -     Sl   ?         21:37 ./bootnode -ip 99.81.170.167 -port 9842 -key bootnode-9842.key -log_conn
===> jenkins.harmony.one
4096 4096 4     0   325   324  20   0 1184824 18684 -     Sl   ?         56:34 ./bootnode -ip 54.183.5.66 -port 9872 -key bootnode-9872.key
4096 4096 4     0  4367  4366  20   0 1258568 17360 -     Sl   ?         55:16 ./bootnode -ip jenkins.harmony.one -port 9874 -key bootnode-9874.key
4096 4096 4     0  9793  9792  20   0 1258548 11608 -     Sl   ?         55:19 ./bootnode -ip jenkins.harmony.one -port 9876 -key bootnode-9876.key
4096 4096 4     0 17462 17460  20   0 1184852 20160 -     Sl   ?         58:14 ./bootnode -ip 54.183.5.66 -port 9875 -key bootnode-9875.key
4096 4096 4     0 22417 22415  20   0 1112460 20880 -     Sl   ?         56:32 ./bootnode -ip jenkins.harmony.one -port 9873 -key bootnode-9873.key

harmony-ek commented 5 years ago

Kernel-level counts of allocated file handles (1st column of /proc/sys/fs/file-nr) are well below the system-wide maximum (3rd column), so there is no need to raise the max:

$ for addr in $(cat ../configs/benchmark-*.json | jq -r '[.bootnode.server, .bootnode1.server, .bootnode3.server, .bootnode4.server] | .[] | select(. != null)' | sort -u); do echo "===> ${addr}"; hssh "${addr}" - 'cat /proc/sys/fs/file-nr'; done
===> 100.26.90.187
4608    0       1609546
===> 13.113.101.219
4576    0       196009
===> 52.40.84.2
3232    0       374702
===> 54.213.43.194
4672    0       1609546
===> 54.86.126.90
2080    0       374702
===> 99.81.170.167
4160    0       196009
===> jenkins.harmony.one
2816    0       3128179
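The headroom implied by these numbers can be computed directly. This is a minimal sketch, assuming the three-column `/proc/sys/fs/file-nr` layout shown above (allocated handles, allocated-but-unused handles, system-wide maximum); the function name `file_nr_headroom` is hypothetical.

```shell
# Hypothetical helper: read file-nr style lines on stdin and print how many
# more file handles the kernel could still allocate (maximum minus allocated).
# Lines that do not have exactly three fields, such as host headers, are skipped.
file_nr_headroom() {
  awk 'NF == 3 { print $3 - $1 }'
}
```

For example, `echo "4608 0 1609546" | file_nr_headroom` prints `1604938`. On a live Linux host the same check would be `file_nr_headroom < /proc/sys/fs/file-nr`.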

harmony-ek commented 5 years ago

harmony-one/experiment-deploy#106 under review.

AndyBoWu commented 5 years ago

I have confirmed with @harmony-ek that this fix has been applied to the Pangaea network too.