lihongjie0209 / myblog


Linux performance monitoring: sar #231

Open lihongjie0209 opened 3 years ago

lihongjie0209 commented 3 years ago

Monitoring CPU utilization

root@VM_45_207_centos ~# sar 1 20
Linux 3.10.0-1127.8.2.el7.x86_64 (VM_45_207_centos)     11/20/20    _x86_64_    (1 CPU)

13:25:17        CPU     %user     %nice   %system   %iowait    %steal     %idle
13:25:18        all      1.00      0.00      1.00      4.00      0.00     94.00
13:25:19        all      1.00      0.00      2.00      1.00      0.00     96.00
13:25:20        all      1.01      0.00      1.01      0.00      0.00     97.98
13:25:21        all      2.00      0.00      2.00      7.00      0.00     89.00
13:25:22        all      1.00      0.00      1.00      1.00      0.00     97.00
13:25:23        all      1.00      0.00      1.00      0.00      0.00     98.00
13:25:24        all      0.00      0.00      1.01      2.02      0.00     96.97
13:25:25        all      2.00      0.00      1.00      5.00      0.00     92.00
^C
13:25:26        all      1.01      0.00      2.02      0.00      0.00     96.97
Average:        all      1.11      0.00      1.34      2.23      0.00     95.32
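
As a rough cross-check of the `Average:` row, the per-second rows can be parsed and averaged by hand. This is only a sketch: `parse_cpu_rows` and `column_means` are hypothetical helper names, and sar derives its average from the underlying counters, so a plain mean of rounded rows can differ in the last digit.

```python
# Data rows copied from the `sar 1 20` capture above (header and Average rows omitted).
SAR_CPU_ROWS = """\
13:25:18        all      1.00      0.00      1.00      4.00      0.00     94.00
13:25:19        all      1.00      0.00      2.00      1.00      0.00     96.00
13:25:20        all      1.01      0.00      1.01      0.00      0.00     97.98
13:25:21        all      2.00      0.00      2.00      7.00      0.00     89.00
13:25:22        all      1.00      0.00      1.00      1.00      0.00     97.00
13:25:23        all      1.00      0.00      1.00      0.00      0.00     98.00
13:25:24        all      0.00      0.00      1.01      2.02      0.00     96.97
13:25:25        all      2.00      0.00      1.00      5.00      0.00     92.00
13:25:26        all      1.01      0.00      2.02      0.00      0.00     96.97
"""

FIELDS = ("user", "nice", "system", "iowait", "steal", "idle")


def parse_cpu_rows(text):
    """Return one dict per data row; header and Average rows are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A data row is: timestamp, "all", then six percentages.
        if len(parts) == 8 and parts[1] == "all" and parts[0] != "Average:":
            rows.append(dict(zip(FIELDS, map(float, parts[2:]))))
    return rows


def column_means(rows):
    """Plain arithmetic mean of every column."""
    return {name: sum(r[name] for r in rows) / len(rows) for name in FIELDS}


if __name__ == "__main__":
    for name, value in column_means(parse_cpu_rows(SAR_CPU_ROWS)).items():
        print(f"%{name:<7} {value:6.2f}")
```

The means land within 0.01 of the values sar printed (for example about 95.32 for `%idle`), which is a quick way to confirm the columns were read correctly.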

Generate some load with sysbench

root@VM_45_207_centos ~# sysbench cpu --cpu-max-prime=100000000 run
sysbench 1.0.17 (using system LuaJIT 2.0.4)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Prime numbers limit: 100000000

Initializing worker threads...

Threads started!

Check CPU utilization again


root@VM_45_207_centos ~# sar 1 
Linux 3.10.0-1127.8.2.el7.x86_64 (VM_45_207_centos)     11/20/20    _x86_64_    (1 CPU)

13:26:22        CPU     %user     %nice   %system   %iowait    %steal     %idle
13:26:23        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:24        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:25        all     99.01      0.00      0.99      0.00      0.00      0.00
13:26:26        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:27        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:28        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:29        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:30        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:31        all     99.01      0.00      0.99      0.00      0.00      0.00
13:26:32        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:33        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:34        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:35        all     99.00      0.00      1.00      0.00      0.00      0.00
13:26:36        all     94.00      0.00      6.00      0.00      0.00      0.00
13:26:37        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:38        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:39        all     99.00      0.00      1.00      0.00      0.00      0.00
13:26:40        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:41        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:42        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:43        all     99.00      0.00      1.00      0.00      0.00      0.00
13:26:44        all    100.00      0.00      0.00      0.00      0.00      0.00
13:26:45        all     99.01      0.00      0.99      0.00      0.00      0.00
13:26:46        all    100.00      0.00      0.00      0.00      0.00      0.00
^C
13:26:46        all    100.00      0.00      0.00      0.00      0.00      0.00
Average:        all     99.51      0.00      0.49      0.00      0.00      0.00

%user at roughly 100% shows that sysbench makes almost no system calls and runs almost entirely in user mode.
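
The same conclusion can be put into numbers by splitting the non-idle time into user-mode and kernel-mode shares. A minimal sketch, using the `Average:` row of the capture above; `busy_split` is a hypothetical helper, and treating `%user + %nice` as user mode and `%system` as kernel mode is a simplification that ignores `%steal` and `%iowait`.

```python
def busy_split(user, nice, system):
    """Return (user_share, system_share) of busy CPU time, each in 0..1."""
    busy = user + nice + system
    if busy == 0:
        return 0.0, 0.0
    return (user + nice) / busy, system / busy


if __name__ == "__main__":
    # Average row from the capture: %user=99.51, %nice=0.00, %system=0.49
    user_share, system_share = busy_split(99.51, 0.00, 0.49)
    print(f"user mode:   {user_share:.2%}")
    print(f"kernel mode: {system_share:.2%}")
```

A CPU-bound benchmark such as this one should show a user share close to 1; a workload dominated by system calls would shift the ratio toward `%system`.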

lihongjie0209 commented 3 years ago

Monitoring disk utilization


root@VM_45_207_centos ~# sar -d 1
Linux 3.10.0-1127.8.2.el7.x86_64 (VM_45_207_centos)     11/20/20    _x86_64_    (1 CPU)

13:28:31          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
13:28:32     dev253-0     60.00    296.00    480.00     12.93      0.77     12.82      0.43      2.60
13:28:32      dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

13:28:32          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
13:28:33     dev253-0      1.00      8.00      0.00      8.00      0.01      6.00      6.00      0.60
13:28:33      dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

13:28:33          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
13:28:34     dev253-0      1.98     39.60      0.00     20.00      0.00      1.00      1.00      0.20
13:28:34      dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

13:28:34          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
13:28:35     dev253-0      1.98     15.84      0.00      8.00      0.03     13.50     13.50      2.67
13:28:35      dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
^C

13:28:35          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
13:28:36     dev253-0      2.56     20.51      0.00      8.00      0.07     26.00     26.00      6.67
13:28:36      dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Average:          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
Average:     dev253-0     14.97     83.45    108.84     12.85      0.19     12.58      1.32      1.97
Average:      dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Stress-test the disk with fio

root@VM_45_207_centos ~# fio --name TEST --eta-newline=5s --filename=temp.file --rw=read --size=2g --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting
TEST: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
fio-3.7
Starting 1 process
TEST: Laying out IO file (1 file / 2048MiB)
Jobs: 1 (f=1): [R(1)][11.7%][r=90.0MiB/s,w=0KiB/s][r=90,w=0 IOPS][eta 00m:53s]
Jobs: 1 (f=1): [R(1)][21.7%][r=91.1MiB/s,w=0KiB/s][r=91,w=0 IOPS][eta 00m:47s] 
Jobs: 1 (f=1): [R(1)][31.7%][r=90.1MiB/s,w=0KiB/s][r=90,w=0 IOPS][eta 00m:41s] 
Jobs: 1 (f=1): [R(1)][41.7%][r=95.0MiB/s,w=0KiB/s][r=95,w=0 IOPS][eta 00m:35s] 
Jobs: 1 (f=1): [R(1)][51.7%][r=92.1MiB/s,w=0KiB/s][r=92,w=0 IOPS][eta 00m:29s] 
Jobs: 1 (f=1): [R(1)][61.7%][r=86.0MiB/s,w=0KiB/s][r=86,w=0 IOPS][eta 00m:23s] 
Jobs: 1 (f=1): [R(1)][71.7%][r=89.1MiB/s,w=0KiB/s][r=89,w=0 IOPS][eta 00m:17s] 

Observe the sar output

root@VM_45_207_centos ~# sar -d 1
Linux 3.10.0-1127.8.2.el7.x86_64 (VM_45_207_centos)     11/20/20    _x86_64_    (1 CPU)

13:33:27          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
13:33:28     dev253-0    257.61    869.57 153208.70    598.11    115.51    214.27      4.16    107.07
13:33:28      dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

13:33:28          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
13:33:29     dev253-0    316.49   3694.85 219554.64    705.38    117.92    442.31      3.10     98.14
13:33:29      dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

13:33:29          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
13:33:30     dev253-0    457.29  22866.67 215775.00    521.86    114.30    388.21      2.23    101.88
13:33:30      dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

13:33:30          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
13:33:31     dev253-0    578.72  32144.68 296553.19    567.97    132.31    476.07      1.84    106.28
13:33:31      dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
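
The `rd_sec/s` and `wr_sec/s` columns are in 512-byte sectors, so they are easier to read after converting to MiB/s, and `avgrq-sz` can be re-derived from the other columns. A small sketch using the 13:33:31 row for dev253-0; the function names are hypothetical.

```python
SECTOR_BYTES = 512  # sar -d counts rd_sec/s and wr_sec/s in 512-byte sectors


def sectors_to_mib(sectors_per_s):
    """Convert a sectors-per-second rate to MiB per second."""
    return sectors_per_s * SECTOR_BYTES / (1024 * 1024)


def avg_request_sectors(tps, rd_sec, wr_sec):
    """Average request size in sectors: total sectors moved per request."""
    return (rd_sec + wr_sec) / tps if tps else 0.0


if __name__ == "__main__":
    # 13:33:31 dev253-0 row: tps, rd_sec/s, wr_sec/s
    tps, rd_sec, wr_sec = 578.72, 32144.68, 296553.19
    print(f"read  {sectors_to_mib(rd_sec):8.2f} MiB/s")
    print(f"write {sectors_to_mib(wr_sec):8.2f} MiB/s")
    print(f"avgrq-sz {avg_request_sectors(tps, rd_sec, wr_sec):.2f} sectors")
```

For this row the write rate works out to roughly 144.8 MiB/s, and the recomputed request size agrees with the 567.97 printed in the `avgrq-sz` column.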

Observe the pidstat output

root@VM_45_207_centos ~# pidstat -d 1
Linux 3.10.0-1127.8.2.el7.x86_64 (VM_45_207_centos)     11/20/20    _x86_64_    (1 CPU)

13:37:45      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
13:37:46        0      4322 120649.50      0.00      0.00  fio
13:37:46        0      4343    118.81      0.00      0.00  nslookup
13:37:46        0      7509      3.96      0.00      0.00  YDService

13:37:46      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
13:37:47        0      4322 108544.00      0.00      0.00  fio
13:37:47        0      4343    220.00      0.00      0.00  nslookup
13:37:47        0      7509      4.00      0.00      0.00  YDService

13:37:47      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
13:37:48        0      4322  82944.00      0.00      0.00  fio
13:37:48        0      4343    812.00      0.00      0.00  nslookup
13:37:48        0      7509    152.00      0.00      0.00  YDService
13:37:48        0     18443      0.00      4.00      0.00  barad_agent

This confirms that fio is the process performing the heavy I/O.
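
Picking out the heaviest reader can also be scripted when the process list is long. A sketch that parses the first `pidstat -d` sample above; `top_reader` is a hypothetical helper, and it assumes command names contain no spaces.

```python
# First sample copied from the `pidstat -d 1` capture above.
PIDSTAT_SAMPLE = """\
13:37:45      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
13:37:46        0      4322 120649.50      0.00      0.00  fio
13:37:46        0      4343    118.81      0.00      0.00  nslookup
13:37:46        0      7509      3.96      0.00      0.00  YDService
"""


def top_reader(text):
    """Return (command, kB_rd/s, pid) for the process reading the most data."""
    best = None
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and the header row (second column is "UID").
        if len(parts) != 7 or parts[1] == "UID":
            continue
        pid, kb_rd, command = int(parts[2]), float(parts[3]), parts[6]
        if best is None or kb_rd > best[1]:
            best = (command, kb_rd, pid)
    return best


if __name__ == "__main__":
    command, kb_rd, pid = top_reader(PIDSTAT_SAMPLE)
    print(f"{command} (pid {pid}) reads {kb_rd / 1024:.1f} MiB/s")
```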

lihongjie0209 commented 3 years ago

Memory monitoring

[root@VM_45_207_centos ~]# sar -r 1
Linux 3.10.0-1127.8.2.el7.x86_64 (VM_45_207_centos)     Friday 20 November 2020     _x86_64_    (1 CPU)

01:41:58  CST kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty
01:41:59  CST    284468    730352     71.97     17376    156404   1477172    145.56    507460    133396        72
01:42:00  CST    284484    730336     71.97     17376    156412   1477172    145.56    507464    133404        72
01:42:01  CST    284344    730476     71.98     17512    156436   1477176    145.56    507468    133556        76
01:42:02  CST    284344    730476     71.98     17664    156432   1477172    145.56    507488    133680        96
01:42:03  CST    284360    730460     71.98     17664    156452   1477172    145.56    507492    133704        96
01:42:04  CST    284360    730460     71.98     17664    156472   1477172    145.56    507492    133724        96
01:42:05  CST    284096    730724     72.01     17800    156480   1477172    145.56    507496    133864       104
01:42:06  CST    282844    731976     72.13     17800    156484   1479068    145.75    508468    133856       104
01:42:07  CST    282828    731992     72.13     17800    156508   1479068    145.75    508468    133876       108

Field descriptions

       -r     Report memory utilization statistics.  The following values are displayed:

              kbmemfree
                     Amount of free memory available in kilobytes.

              kbmemused
                     Amount of used memory in kilobytes. This does not take into account memory used by the kernel itself.

              %memused
                     Percentage of used memory.

              kbbuffers
                     Amount of memory used as buffers by the kernel in kilobytes.

              kbcached
                     Amount of memory used to cache data by the kernel in kilobytes.

              kbcommit
                     Amount of memory in kilobytes needed for current workload. This is an estimate of how much RAM/swap  is  needed  to  guarantee
                     that there never is out of memory.

              %commit
                     Percentage  of  memory  needed  for current workload in relation to the total amount of memory (RAM+swap).  This number may be
                     greater than 100% because the kernel usually overcommits memory.

              kbactive
                     Amount of active memory in kilobytes (memory that has been used more recently and usually not reclaimed unless absolutely
                     necessary).

              kbinact
                     Amount  of  inactive  memory  in  kilobytes (memory which has been less recently used. It is more eligible to be reclaimed for
                     other purposes).

              kbdirty
                     Amount of memory in kilobytes waiting to get written back to the disk.
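
The percentage columns can be re-derived from the kilobyte columns, which helps when reading them. A sketch using the 01:41:59 row of the capture above; the function names are hypothetical, and the formulas assume the older sysstat layout shown here, where `kbmemfree + kbmemused` equals total RAM.

```python
def pct_memused(kbmemfree, kbmemused):
    """%memused: used memory as a share of total RAM (free + used)."""
    return 100.0 * kbmemused / (kbmemfree + kbmemused)


def pct_commit(kbcommit, ram_kb, swap_kb=0):
    """%commit: committed memory as a share of RAM plus swap."""
    return 100.0 * kbcommit / (ram_kb + swap_kb)


if __name__ == "__main__":
    # 01:41:59 row: kbmemfree, kbmemused, kbcommit
    kbmemfree, kbmemused, kbcommit = 284468, 730352, 1477172
    ram_kb = kbmemfree + kbmemused
    print(f"%memused {pct_memused(kbmemfree, kbmemused):.2f}")
    # swap_kb=0 is an assumption; the capture does not show the swap size.
    print(f"%commit  {pct_commit(kbcommit, ram_kb):.2f}")
```

With swap assumed to be zero, the result matches the printed 145.56, which suggests (but does not prove) that this host has no swap configured. A `%commit` above 100 means the kernel has promised more memory than RAM plus swap can back.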

lihongjie0209 commented 3 years ago

System-wide I/O monitoring

[root@node1 backend]# sar -b 1
Linux 3.10.0-957.27.2.el7.x86_64 (node1.b)  11/21/2020  _x86_64_    (2 CPU)

11:08:00 AM       tps      rtps      wtps   bread/s   bwrtn/s
11:08:01 AM     14.00      0.00     14.00      0.00    183.00
11:08:02 AM     31.00      0.00     31.00      0.00    726.00
11:08:03 AM      6.00      0.00      6.00      0.00   1046.00
11:08:04 AM     48.00      0.00     48.00      0.00   4258.00
11:08:05 AM      8.00      0.00      8.00      0.00    153.00
^C

11:08:05 AM      7.69      0.00      7.69      0.00    246.15
Average:        20.72      0.00     20.72      0.00   1222.43

       -b     Report I/O and transfer rate statistics.  The following values are displayed:

              tps
                     Total number of transfers per second that were issued to physical devices.  A transfer is an
                     I/O  request  to  a physical device. Multiple logical requests can be combined into a single
                     I/O request to the device.  A transfer is of indeterminate size.

              rtps
                     Total number of read requests per second issued to physical devices.

              wtps
                     Total number of write requests per second issued to physical devices.

              bread/s
                     Total amount of data read from the devices in blocks per second.  Blocks are  equivalent  to
                     sectors and therefore have a size of 512 bytes.

              bwrtn/s
                     Total amount of data written to devices in blocks per second.
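
Because `bread/s` and `bwrtn/s` are in 512-byte blocks, a conversion to KiB/s makes the rows easier to compare, and `tps` should equal `rtps + wtps`. A sketch over the rows of the capture above; the helper names are hypothetical.

```python
BLOCK_BYTES = 512  # per the man page, a block here is a 512-byte sector

# (tps, rtps, wtps, bread/s, bwrtn/s) rows copied from the capture above.
ROWS = [
    (14.00, 0.00, 14.00, 0.00, 183.00),
    (31.00, 0.00, 31.00, 0.00, 726.00),
    (6.00, 0.00, 6.00, 0.00, 1046.00),
    (48.00, 0.00, 48.00, 0.00, 4258.00),
    (8.00, 0.00, 8.00, 0.00, 153.00),
]


def blocks_to_kib(blocks_per_s):
    """Convert a blocks-per-second rate to KiB per second."""
    return blocks_per_s * BLOCK_BYTES / 1024


def avg_write_kib(wtps, bwrtn):
    """Average size of one write request in KiB."""
    return blocks_to_kib(bwrtn) / wtps if wtps else 0.0


if __name__ == "__main__":
    for tps, rtps, wtps, bread, bwrtn in ROWS:
        print(f"tps={tps:5.1f} write={blocks_to_kib(bwrtn):7.1f} KiB/s "
              f"avg write={avg_write_kib(wtps, bwrtn):5.1f} KiB")
```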

lihongjie0209 commented 3 years ago

Display context switches per second (sar -w)

This reports the number of processes created per second and the number of context switches per second. The arguments “1 3” request a report every 1 second, 3 times in total.

$ sar -w 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

08:32:24 AM    proc/s   cswch/s
08:32:25 AM      3.00     53.00
08:32:26 AM      4.00     61.39
08:32:27 AM      2.00     57.00
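
These two rates come from cumulative kernel counters: on Linux, `/proc/stat` exposes `processes` (forks since boot) and `ctxt` (context switches since boot), and a rate is the difference between two snapshots divided by the interval. The sketch below shows that calculation; the snapshot texts are made-up values for illustration and the function names are hypothetical.

```python
def read_counters(stat_text):
    """Extract the cumulative `ctxt` and `processes` counters from /proc/stat text."""
    counters = {}
    for line in stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("ctxt", "processes"):
            counters[parts[0]] = int(parts[1])
    return counters


def rates(before, after, interval_s):
    """Per-second rates between two counter snapshots."""
    return {
        "proc/s": (after["processes"] - before["processes"]) / interval_s,
        "cswch/s": (after["ctxt"] - before["ctxt"]) / interval_s,
    }


if __name__ == "__main__":
    # Made-up snapshots taken one second apart (illustrative numbers only).
    first = "ctxt 1000\nbtime 1600000000\nprocesses 200\n"
    second = "ctxt 1053\nbtime 1600000000\nprocesses 203\n"
    print(rates(read_counters(first), read_counters(second), 1.0))
```

On a live system the two texts would come from reading `/proc/stat` twice with a sleep in between.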


lihongjie0209 commented 3 years ago

Report network statistics (sar -n)

This reports various network statistics, for example the number of packets received and transmitted through a network card, or packet failure statistics. In the example below, “1 1” requests a single report after a 1-second interval.

sar -n KEYWORD

KEYWORD selects which group of network statistics to report. The example below uses DEV, which reports per-interface traffic:

$ sar -n DEV 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:11:13 PM     IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
01:11:14 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:11:14 PM      eth0    342.57    342.57  93923.76 141773.27      0.00      0.00      0.00
01:11:14 PM      eth1      0.00      0.00      0.00      0.00      0.00      0.00      0.00
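
Byte rates are easier to compare with link speed after converting to Mbit/s, and dividing bytes by packets gives the average packet size. A sketch using the eth0 row above; the helper names are hypothetical, and it assumes the older column layout shown here (`rxbyt/s`, `txbyt/s`), whereas newer sysstat versions print `rxkB/s` and `txkB/s` instead.

```python
def to_mbit(bytes_per_s):
    """Convert bytes per second to megabits per second (10^6 bits)."""
    return bytes_per_s * 8 / 1_000_000


def avg_packet_bytes(bytes_per_s, packets_per_s):
    """Average packet size in bytes over the interval."""
    return bytes_per_s / packets_per_s if packets_per_s else 0.0


if __name__ == "__main__":
    # eth0 row: rxpck/s, txpck/s, rxbyt/s, txbyt/s
    rxpck, txpck, rxbyt, txbyt = 342.57, 342.57, 93923.76, 141773.27
    print(f"rx {to_mbit(rxbyt):.2f} Mbit/s, avg packet {avg_packet_bytes(rxbyt, rxpck):.0f} B")
    print(f"tx {to_mbit(txbyt):.2f} Mbit/s, avg packet {avg_packet_bytes(txbyt, txpck):.0f} B")
```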

lihongjie0209 commented 3 years ago

Report run queue and load average (sar -q)

This reports the run queue size and the load averages over the last 1, 5, and 15 minutes. “1 3” requests a report every 1 second, 3 times in total.

$ sar -q 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

06:28:53 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
06:28:54 AM         0       230      2.00      3.00      5.00         0
06:28:55 AM         2       210      2.01      3.15      5.15         0
06:28:56 AM         2       230      2.12      3.12      5.12         0
Average:            3       230      3.12      3.12      5.12         0

Note: The “blocked” column displays the number of tasks that are currently blocked and waiting for I/O operation to complete.

Field descriptions from the sar man page:

       -q     Report queue length and load averages. The following values are displayed:

              runq-sz
                     Run queue length (number of tasks waiting for run time).

              plist-sz
                     Number of tasks in the task list.

              ldavg-1
                     System load average for the last minute.  The load average is calculated as the average
                     number of runnable or running tasks (R state), and the number of tasks in uninterruptible sleep
                     (D state) over the specified interval.

              ldavg-5
                     System load average for the past 5 minutes.

              ldavg-15
                     System load average for the past 15 minutes.

              blocked
                     Number of tasks currently blocked, waiting for I/O to complete.
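
A load average only means something relative to the number of CPUs, so a common reading aid is to divide it by the CPU count and to compare the 1-minute and 15-minute values for a trend. A sketch using the first row of the `sar -q` example above (8 CPUs); the helper names and the 5% tolerance are arbitrary choices for illustration.

```python
def load_per_cpu(ldavg, cpus):
    """Load average normalized by CPU count; values near 1.0 mean fully busy."""
    return ldavg / cpus


def trend(ldavg_1, ldavg_15, tolerance=0.05):
    """Rough direction of load: compare the 1-minute and 15-minute averages."""
    if ldavg_1 > ldavg_15 * (1 + tolerance):
        return "rising"
    if ldavg_1 < ldavg_15 * (1 - tolerance):
        return "falling"
    return "steady"


if __name__ == "__main__":
    # 06:28:54 row: ldavg-1=2.00, ldavg-15=5.00 on an 8-CPU host
    print(f"per-CPU load: {load_per_cpu(2.00, 8):.2f}")
    print(f"trend: {trend(2.00, 5.00)}")
```

On that row the per-CPU load is 0.25, well under saturation, and the 1-minute value below the 15-minute value indicates the load was decreasing.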