awslabs / aws-c-s3

C99 library implementation for communicating with the S3 service, designed for maximizing throughput on high bandwidth EC2 instances.
Apache License 2.0

Disable CPU Group Pinning for Body Streaming ELG #403

Closed waahm7 closed 7 months ago

waahm7 commented 7 months ago

Description of changes: We try to pin the body streaming ELG to CPU group 1 if multiple CPU groups are available. On a c5.24xlarge EC2 instance, we have the following CPU groups:

Command: numactl --hardware
output:
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
node 0 size: 94640 MB
node 0 free: 80464 MB
node 1 cpus: 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
node 1 size: 94655 MB
node 1 free: 41849 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10

We try to pin our body streaming ELG to the CPUs of node 1, which are CPU IDs 24-47 and 72-95. For each CPU ID in that list, we attempt to create a thread pinned to it. However, some CPUs might be restricted for the process. If any CPU ID is restricted, pthread_create fails for that CPU ID with error code 22 (EINVAL, invalid argument), which our codebase treats as a fatal error.
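The mismatch between a NUMA node's CPU list and the CPUs a process may actually use can be checked ahead of time. Below is a minimal Python sketch of that check, for illustration only and not aws-c-s3 code. `parse_node_cpus`, `allowed_cpus`, and `split_by_affinity` are hypothetical helper names, and `SAMPLE` is a shortened, made-up `numactl --hardware` output. It assumes the allowed set comes from `os.sched_getaffinity(0)` on Linux.

```python
import os
import re


def parse_node_cpus(numactl_output):
    """Map NUMA node id -> list of CPU ids from `numactl --hardware` text."""
    nodes = {}
    for line in numactl_output.splitlines():
        match = re.match(r"\s*node (\d+) cpus:(.*)", line)
        if match:
            nodes[int(match.group(1))] = [int(c) for c in match.group(2).split()]
    return nodes


def allowed_cpus():
    """CPUs this process may run on (falls back to all CPUs where unsupported)."""
    if hasattr(os, "sched_getaffinity"):
        return set(os.sched_getaffinity(0))
    return set(range(os.cpu_count() or 1))


def split_by_affinity(node_cpus, allowed):
    """Split a node's CPUs into (pinnable, restricted) for the given allowed set."""
    pinnable = [c for c in node_cpus if c in allowed]
    restricted = [c for c in node_cpus if c not in allowed]
    return pinnable, restricted


# Shortened, hypothetical output; real machines list many more CPUs per node.
SAMPLE = """\
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3
node 0 size: 94640 MB
node 1 cpus: 4 5 6 7
node 1 size: 94655 MB
"""

nodes = parse_node_cpus(SAMPLE)
# Pretend the process is restricted to CPUs 0, 1, 4 and 5.
pinnable, restricted = split_by_affinity(nodes[1], {0, 1, 4, 5})
print("pinnable:", pinnable, "restricted:", restricted)
```

On the c5.24xlarge above, node 1 lists CPUs 24-47 and 72-95. Any of those missing from `allowed_cpus()` would be the ones for which a pinned thread launch can fail.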

I ran benchmarks with and without CPU group pinning on a c5.24xlarge and observed no meaningful difference in performance, so this PR disables CPU group pinning. Here are the benchmark results:

Download 30gb file with Disk IO:
          - With thread pinning:
              Run:1 Secs:48.960 Gb/s:5.3 Mb/s:5263.4 GiB/s:0.6 MiB/s:627.4
              Run:2 Secs:37.693 Gb/s:6.8 Mb/s:6836.7 GiB/s:0.8 MiB/s:815.0
              Run:3 Secs:35.690 Gb/s:7.2 Mb/s:7220.4 GiB/s:0.8 MiB/s:860.7
              Run:4 Secs:34.340 Gb/s:7.5 Mb/s:7504.2 GiB/s:0.9 MiB/s:894.6
              Run:5 Secs:30.866 Gb/s:8.3 Mb/s:8349.0 GiB/s:1.0 MiB/s:995.3
              Run:6 Secs:29.064 Gb/s:8.9 Mb/s:8866.5 GiB/s:1.0 MiB/s:1057.0
              Run:7 Secs:27.199 Gb/s:9.5 Mb/s:9474.4 GiB/s:1.1 MiB/s:1129.4
              Run:8 Secs:26.908 Gb/s:9.6 Mb/s:9576.9 GiB/s:1.1 MiB/s:1141.7
              Run:9 Secs:25.682 Gb/s:10.0 Mb/s:10034.1 GiB/s:1.2 MiB/s:1196.2
              Run:10 Secs:26.602 Gb/s:9.7 Mb/s:9687.2 GiB/s:1.1 MiB/s:1154.8
              Overall stats; Throughput Mean:7978.1 Mb/s Throughput Variance:5527.8 Mb/s Duration Mean:32.301 s Duration Variance:46.619 s Peak RSS:272.594 Mb
          - Without thread pinning:
              Run:1 Secs:43.185 Gb/s:6.0 Mb/s:5967.4 GiB/s:0.7 MiB/s:711.4
              Run:2 Secs:40.669 Gb/s:6.3 Mb/s:6336.4 GiB/s:0.7 MiB/s:755.4
              Run:3 Secs:38.032 Gb/s:6.8 Mb/s:6775.9 GiB/s:0.8 MiB/s:807.7
              Run:4 Secs:33.514 Gb/s:7.7 Mb/s:7689.3 GiB/s:0.9 MiB/s:916.6
              Run:5 Secs:32.388 Gb/s:8.0 Mb/s:7956.6 GiB/s:0.9 MiB/s:948.5
              Run:6 Secs:29.912 Gb/s:8.6 Mb/s:8615.2 GiB/s:1.0 MiB/s:1027.0
              Run:7 Secs:27.771 Gb/s:9.3 Mb/s:9279.3 GiB/s:1.1 MiB/s:1106.2
              Run:8 Secs:27.140 Gb/s:9.5 Mb/s:9495.0 GiB/s:1.1 MiB/s:1131.9
              Run:9 Secs:28.598 Gb/s:9.0 Mb/s:9011.0 GiB/s:1.0 MiB/s:1074.2
              Run:10 Secs:29.456 Gb/s:8.7 Mb/s:8748.5 GiB/s:1.0 MiB/s:1042.9
              Overall stats; Throughput Mean:7793.3 Mb/s Throughput Variance:8837.5 Mb/s Duration Mean:33.067 s Duration Variance:29.160 s Peak RSS:272.355 Mb
Download 30gb file without Disk IO:
          - With thread pinning:
              Run:1 Secs:11.322 Gb/s:22.8 Mb/s:22761.8 GiB/s:2.6 MiB/s:2713.4
              Run:2 Secs:11.222 Gb/s:23.0 Mb/s:22964.3 GiB/s:2.7 MiB/s:2737.6
              Run:3 Secs:11.037 Gb/s:23.3 Mb/s:23348.9 GiB/s:2.7 MiB/s:2783.4
              Run:4 Secs:10.971 Gb/s:23.5 Mb/s:23489.1 GiB/s:2.7 MiB/s:2800.1
              Run:5 Secs:10.933 Gb/s:23.6 Mb/s:23569.8 GiB/s:2.7 MiB/s:2809.7
              Run:6 Secs:11.077 Gb/s:23.3 Mb/s:23264.1 GiB/s:2.7 MiB/s:2773.3
              Run:7 Secs:10.978 Gb/s:23.5 Mb/s:23474.2 GiB/s:2.7 MiB/s:2798.3
              Run:8 Secs:10.944 Gb/s:23.5 Mb/s:23547.7 GiB/s:2.7 MiB/s:2807.1
              Run:9 Secs:11.184 Gb/s:23.0 Mb/s:23042.0 GiB/s:2.7 MiB/s:2746.8
              Run:10 Secs:11.307 Gb/s:22.8 Mb/s:22791.4 GiB/s:2.7 MiB/s:2717.0
              Overall stats; Throughput Mean:23221.6 Mb/s Throughput Variance:12765702.7 Mb/s Duration Mean:11.097 s Duration Variance:0.020 s Peak RSS:8115.359 Mb
          - Without thread pinning:
              Run:1 Secs:11.201 Gb/s:23.0 Mb/s:23006.7 GiB/s:2.7 MiB/s:2742.6
              Run:2 Secs:10.884 Gb/s:23.7 Mb/s:23676.2 GiB/s:2.8 MiB/s:2822.4
              Run:3 Secs:10.922 Gb/s:23.6 Mb/s:23595.5 GiB/s:2.7 MiB/s:2812.8
              Run:4 Secs:11.021 Gb/s:23.4 Mb/s:23382.4 GiB/s:2.7 MiB/s:2787.4
              Run:5 Secs:10.982 Gb/s:23.5 Mb/s:23464.8 GiB/s:2.7 MiB/s:2797.2
              Run:6 Secs:11.127 Gb/s:23.2 Mb/s:23159.3 GiB/s:2.7 MiB/s:2760.8
              Run:7 Secs:10.949 Gb/s:23.5 Mb/s:23535.3 GiB/s:2.7 MiB/s:2805.6
              Run:8 Secs:11.030 Gb/s:23.4 Mb/s:23362.9 GiB/s:2.7 MiB/s:2785.1
              Run:9 Secs:11.233 Gb/s:22.9 Mb/s:22941.7 GiB/s:2.7 MiB/s:2734.9
              Run:10 Secs:11.015 Gb/s:23.4 Mb/s:23394.3 GiB/s:2.7 MiB/s:2788.8
              Overall stats; Throughput Mean:23349.6 Mb/s Throughput Variance:21224625.7 Mb/s Duration Mean:11.037 s Duration Variance:0.012 s Peak RSS:8116.688 Mb
Upload 30gb file with parallel reads:
          - With thread pinning
              Run:1 Secs:13.535 Gb/s:19.0 Mb/s:19039.7 GiB/s:2.2 MiB/s:2269.7
              Run:2 Secs:13.521 Gb/s:19.1 Mb/s:19059.6 GiB/s:2.2 MiB/s:2272.1
              Run:3 Secs:13.976 Gb/s:18.4 Mb/s:18439.0 GiB/s:2.1 MiB/s:2198.1
              Run:4 Secs:13.487 Gb/s:19.1 Mb/s:19107.4 GiB/s:2.2 MiB/s:2277.8
              Run:5 Secs:12.784 Gb/s:20.2 Mb/s:20158.4 GiB/s:2.3 MiB/s:2403.1
              Run:6 Secs:13.114 Gb/s:19.7 Mb/s:19651.3 GiB/s:2.3 MiB/s:2342.6
              Run:7 Secs:12.607 Gb/s:20.4 Mb/s:20440.4 GiB/s:2.4 MiB/s:2436.7
              Run:8 Secs:13.412 Gb/s:19.2 Mb/s:19213.3 GiB/s:2.2 MiB/s:2290.4
              Run:9 Secs:13.374 Gb/s:19.3 Mb/s:19268.9 GiB/s:2.2 MiB/s:2297.0
              Run:10 Secs:13.263 Gb/s:19.4 Mb/s:19429.6 GiB/s:2.3 MiB/s:2316.2
              Overall stats; Throughput Mean:19365.3 Mb/s Throughput Variance:1846730.8 Mb/s Duration Mean:13.307 s Duration Variance:0.140 s Peak RSS:4071.148 Mb
          - Without thread pinning
              Run:1 Secs:13.580 Gb/s:19.0 Mb/s:18976.4 GiB/s:2.2 MiB/s:2262.2
              Run:2 Secs:13.334 Gb/s:19.3 Mb/s:19327.1 GiB/s:2.2 MiB/s:2304.0
              Run:3 Secs:12.906 Gb/s:20.0 Mb/s:19967.5 GiB/s:2.3 MiB/s:2380.3
              Run:4 Secs:13.505 Gb/s:19.1 Mb/s:19081.5 GiB/s:2.2 MiB/s:2274.7
              Run:5 Secs:12.799 Gb/s:20.1 Mb/s:20133.9 GiB/s:2.3 MiB/s:2400.1
              Run:6 Secs:13.471 Gb/s:19.1 Mb/s:19129.7 GiB/s:2.2 MiB/s:2280.4
              [ERROR] [2024-01-24T18:34:51Z] [00007f0db2bfd640] [S3MetaRequest] - id=0x28ed160 Request failed from error 14341 (Response code indicates internal server error). (request=0x7f0dac3548c0, response status=500). Try to setup a retry.
              Run:7 Secs:12.514 Gb/s:20.6 Mb/s:20592.5 GiB/s:2.4 MiB/s:2454.8
              Run:8 Secs:13.533 Gb/s:19.0 Mb/s:19041.8 GiB/s:2.2 MiB/s:2270.0
              Run:9 Secs:13.397 Gb/s:19.2 Mb/s:19236.0 GiB/s:2.2 MiB/s:2293.1
              Run:10 Secs:12.785 Gb/s:20.2 Mb/s:20156.7 GiB/s:2.3 MiB/s:2402.9
              Overall stats; Throughput Mean:19548.7 Mb/s Throughput Variance:1887365.0 Mb/s Duration Mean:13.182 s Duration Variance:0.137 s Peak RSS:4064.930 Mb
Download Caltech 256 with Disk IO:
          - With thread pinning
              Run:1 Secs:9.834 Gb/s:0.9 Mb/s:943.1 GiB/s:0.1 MiB/s:112.4
              Run:2 Secs:30.528 Gb/s:0.3 Mb/s:303.8 GiB/s:0.0 MiB/s:36.2
              Run:3 Secs:10.620 Gb/s:0.9 Mb/s:873.4 GiB/s:0.1 MiB/s:104.1
              Run:4 Secs:12.059 Gb/s:0.8 Mb/s:769.1 GiB/s:0.1 MiB/s:91.7
              Run:5 Secs:12.427 Gb/s:0.7 Mb/s:746.3 GiB/s:0.1 MiB/s:89.0
              Run:6 Secs:9.534 Gb/s:1.0 Mb/s:972.9 GiB/s:0.1 MiB/s:116.0
              Run:7 Secs:16.749 Gb/s:0.6 Mb/s:553.8 GiB/s:0.1 MiB/s:66.0
              Run:8 Secs:10.640 Gb/s:0.9 Mb/s:871.7 GiB/s:0.1 MiB/s:103.9
              Run:9 Secs:10.166 Gb/s:0.9 Mb/s:912.3 GiB/s:0.1 MiB/s:108.8
              Run:10 Secs:20.255 Gb/s:0.5 Mb/s:457.9 GiB/s:0.1 MiB/s:54.6
              Overall stats; Throughput Mean:649.4 Mb/s Throughput Variance:231.9 Mb/s Duration Mean:14.281 s Duration Variance:40.003 s Peak RSS:297.719 Mb
          - Without thread pinning
              Run:1 Secs:11.537 Gb/s:0.8 Mb/s:804.0 GiB/s:0.1 MiB/s:95.8
              Run:2 Secs:11.942 Gb/s:0.8 Mb/s:776.7 GiB/s:0.1 MiB/s:92.6
              Run:3 Secs:12.200 Gb/s:0.8 Mb/s:760.2 GiB/s:0.1 MiB/s:90.6
              Run:4 Secs:9.816 Gb/s:0.9 Mb/s:944.9 GiB/s:0.1 MiB/s:112.6
              Run:5 Secs:18.552 Gb/s:0.5 Mb/s:500.0 GiB/s:0.1 MiB/s:59.6
              Run:6 Secs:10.725 Gb/s:0.9 Mb/s:864.8 GiB/s:0.1 MiB/s:103.1
              Run:7 Secs:10.971 Gb/s:0.8 Mb/s:845.4 GiB/s:0.1 MiB/s:100.8
              Run:8 Secs:12.203 Gb/s:0.8 Mb/s:760.0 GiB/s:0.1 MiB/s:90.6
              Run:9 Secs:11.721 Gb/s:0.8 Mb/s:791.3 GiB/s:0.1 MiB/s:94.3
              Run:10 Secs:10.617 Gb/s:0.9 Mb/s:873.6 GiB/s:0.1 MiB/s:104.1
              Overall stats; Throughput Mean:771.1 Mb/s Throughput Variance:1761.4 Mb/s Duration Mean:12.028 s Duration Variance:5.266 s Peak RSS:311.340 Mb

Benchmarks on c5n.18xlarge:

Command: numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53
node 0 size: 94590 MB
node 0 free: 44981 MB
node 1 cpus: 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
node 1 size: 94710 MB
node 1 free: 79679 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
- Download 30Gb file with DiskIO
                - With thread pinning
                Run:1 Secs:45.140 Gb/s:5.7 Mb/s:5708.9 GiB/s:0.7 MiB/s:680.6
                Run:2 Secs:34.275 Gb/s:7.5 Mb/s:7518.5 GiB/s:0.9 MiB/s:896.3
                Run:3 Secs:30.125 Gb/s:8.6 Mb/s:8554.3 GiB/s:1.0 MiB/s:1019.8
                Run:4 Secs:28.200 Gb/s:9.1 Mb/s:9138.3 GiB/s:1.1 MiB/s:1089.4
                Run:5 Secs:27.908 Gb/s:9.2 Mb/s:9234.0 GiB/s:1.1 MiB/s:1100.8
                Run:6 Secs:25.750 Gb/s:10.0 Mb/s:10007.7 GiB/s:1.2 MiB/s:1193.0
                Run:7 Secs:24.960 Gb/s:10.3 Mb/s:10324.6 GiB/s:1.2 MiB/s:1230.8
                Run:8 Secs:25.346 Gb/s:10.2 Mb/s:10167.2 GiB/s:1.2 MiB/s:1212.0
                Run:9 Secs:24.575 Gb/s:10.5 Mb/s:10486.3 GiB/s:1.2 MiB/s:1250.1
                Run:10 Secs:25.754 Gb/s:10.0 Mb/s:10006.1 GiB/s:1.2 MiB/s:1192.8
                Overall stats; Throughput Mean:8824.3 Mb/s Throughput Variance:7131.2 Mb/s Duration Mean:29.203 s Duration Variance:36.137 s Peak RSS:277.336 Mb
                - Without thread pinning
                Run:1 Secs:24.900 Gb/s:10.3 Mb/s:10349.4 GiB/s:1.2 MiB/s:1233.7
                Run:2 Secs:25.861 Gb/s:10.0 Mb/s:9964.8 GiB/s:1.2 MiB/s:1187.9
                Run:3 Secs:25.940 Gb/s:9.9 Mb/s:9934.5 GiB/s:1.2 MiB/s:1184.3
                Run:4 Secs:25.466 Gb/s:10.1 Mb/s:10119.2 GiB/s:1.2 MiB/s:1206.3
                Run:5 Secs:25.104 Gb/s:10.3 Mb/s:10265.2 GiB/s:1.2 MiB/s:1223.7
                Run:6 Secs:24.922 Gb/s:10.3 Mb/s:10340.0 GiB/s:1.2 MiB/s:1232.6
                Run:7 Secs:24.549 Gb/s:10.5 Mb/s:10497.4 GiB/s:1.2 MiB/s:1251.4
                Run:8 Secs:26.653 Gb/s:9.7 Mb/s:9668.7 GiB/s:1.1 MiB/s:1152.6
                Run:9 Secs:24.876 Gb/s:10.4 Mb/s:10359.2 GiB/s:1.2 MiB/s:1234.9
                Run:10 Secs:25.613 Gb/s:10.1 Mb/s:10061.4 GiB/s:1.2 MiB/s:1199.4
                Overall stats; Throughput Mean:10150.2 Mb/s Throughput Variance:699070.4 Mb/s Duration Mean:25.388 s Duration Variance:0.369 s Peak RSS:276.000 Mb
- Download 30Gb without Disk IO:
                - With thread pinning
                Run:1 Secs:4.746 Gb/s:54.3 Mb/s:54296.1 GiB/s:6.3 MiB/s:6472.6
                Run:2 Secs:3.430 Gb/s:75.1 Mb/s:75140.5 GiB/s:8.7 MiB/s:8957.4
                Run:3 Secs:3.400 Gb/s:75.8 Mb/s:75783.1 GiB/s:8.8 MiB/s:9034.0
                Run:4 Secs:3.020 Gb/s:85.3 Mb/s:85332.2 GiB/s:9.9 MiB/s:10172.4
                Run:5 Secs:3.041 Gb/s:84.7 Mb/s:84739.8 GiB/s:9.9 MiB/s:10101.8
                Run:6 Secs:3.691 Gb/s:69.8 Mb/s:69814.8 GiB/s:8.1 MiB/s:8322.6
                Run:7 Secs:3.187 Gb/s:80.9 Mb/s:80860.2 GiB/s:9.4 MiB/s:9639.3
                Run:8 Secs:3.411 Gb/s:75.5 Mb/s:75541.9 GiB/s:8.8 MiB/s:9005.3
                Run:9 Secs:3.087 Gb/s:83.5 Mb/s:83468.8 GiB/s:9.7 MiB/s:9950.3
                Run:10 Secs:3.243 Gb/s:79.5 Mb/s:79456.7 GiB/s:9.2 MiB/s:9472.0
                Overall stats; Throughput Mean:75224.4 Mb/s Throughput Variance:1104903.1 Mb/s Duration Mean:3.426 s Duration Variance:0.233 s Peak RSS:8138.660 Mb 
                - Without thread pinning
                Run:1 Secs:7.597 Gb/s:33.9 Mb/s:33921.5 GiB/s:3.9 MiB/s:4043.8
                Run:2 Secs:3.500 Gb/s:73.6 Mb/s:73630.6 GiB/s:8.6 MiB/s:8777.5
                Run:3 Secs:3.618 Gb/s:71.2 Mb/s:71226.2 GiB/s:8.3 MiB/s:8490.8
                Run:4 Secs:3.082 Gb/s:83.6 Mb/s:83621.5 GiB/s:9.7 MiB/s:9968.5
                Run:5 Secs:3.184 Gb/s:80.9 Mb/s:80937.0 GiB/s:9.4 MiB/s:9648.4
                Run:6 Secs:3.087 Gb/s:83.5 Mb/s:83484.1 GiB/s:9.7 MiB/s:9952.1
                Run:7 Secs:3.160 Gb/s:81.5 Mb/s:81549.7 GiB/s:9.5 MiB/s:9721.5
                Run:8 Secs:3.013 Gb/s:85.5 Mb/s:85524.6 GiB/s:10.0 MiB/s:10195.3
                Run:9 Secs:3.383 Gb/s:76.2 Mb/s:76184.9 GiB/s:8.9 MiB/s:9081.9
                Run:10 Secs:3.089 Gb/s:83.4 Mb/s:83411.5 GiB/s:9.7 MiB/s:9943.4
                Overall stats; Throughput Mean:70193.7 Mb/s Throughput Variance:147375.3 Mb/s Duration Mean:3.671 s Duration Variance:1.749 s Peak RSS:8145.703 Mb
- Upload 30Gb file
                - With thread pinning
                Run:1 Secs:7.783 Gb/s:33.1 Mb/s:33109.9 GiB/s:3.9 MiB/s:3947.0
                Run:2 Secs:6.043 Gb/s:42.6 Mb/s:42640.9 GiB/s:5.0 MiB/s:5083.2
                Run:3 Secs:5.770 Gb/s:44.7 Mb/s:44659.3 GiB/s:5.2 MiB/s:5323.8
                Run:4 Secs:5.142 Gb/s:50.1 Mb/s:50120.8 GiB/s:5.8 MiB/s:5974.9
                Run:5 Secs:7.155 Gb/s:36.0 Mb/s:36014.0 GiB/s:4.2 MiB/s:4293.2
                Run:6 Secs:5.112 Gb/s:50.4 Mb/s:50410.3 GiB/s:5.9 MiB/s:6009.4
                Run:7 Secs:5.362 Gb/s:48.1 Mb/s:48059.0 GiB/s:5.6 MiB/s:5729.1
                Run:8 Secs:5.057 Gb/s:51.0 Mb/s:50963.5 GiB/s:5.9 MiB/s:6075.3
                Run:9 Secs:5.326 Gb/s:48.4 Mb/s:48386.4 GiB/s:5.6 MiB/s:5768.1
                Run:10 Secs:5.644 Gb/s:45.7 Mb/s:45660.8 GiB/s:5.3 MiB/s:5443.2
                Overall stats; Throughput Mean:44130.8 Mb/s Throughput Variance:334007.1 Mb/s Duration Mean:5.839 s Duration Variance:0.772 s Peak RSS:3898.469 Mb
                - Without thread pinning 
                Run:1 Secs:10.256 Gb/s:25.1 Mb/s:25127.3 GiB/s:2.9 MiB/s:2995.4
                Run:2 Secs:5.711 Gb/s:45.1 Mb/s:45120.9 GiB/s:5.3 MiB/s:5378.8
                Run:3 Secs:5.407 Gb/s:47.7 Mb/s:47657.5 GiB/s:5.5 MiB/s:5681.2
                Run:4 Secs:5.435 Gb/s:47.4 Mb/s:47417.0 GiB/s:5.5 MiB/s:5652.5
                Run:5 Secs:5.281 Gb/s:48.8 Mb/s:48796.5 GiB/s:5.7 MiB/s:5817.0
                Run:6 Secs:5.354 Gb/s:48.1 Mb/s:48127.8 GiB/s:5.6 MiB/s:5737.3
                Run:7 Secs:5.557 Gb/s:46.4 Mb/s:46370.9 GiB/s:5.4 MiB/s:5527.8
                Run:8 Secs:5.354 Gb/s:48.1 Mb/s:48135.4 GiB/s:5.6 MiB/s:5738.2
                Run:9 Secs:5.494 Gb/s:46.9 Mb/s:46903.8 GiB/s:5.5 MiB/s:5591.4
                Run:10 Secs:5.704 Gb/s:45.2 Mb/s:45175.2 GiB/s:5.3 MiB/s:5385.3
                Overall stats; Throughput Mean:43271.3 Mb/s Throughput Variance:124276.9 Mb/s Duration Mean:5.955 s Duration Variance:2.074 s Peak RSS:4081.176 Mb
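
To compare the two configurations more directly, the per-run lines can be parsed and summarised. This is a small Python sketch for illustration, not part of the benchmark tool. `parse_runs` and `relative_change` are hypothetical helpers, and the sample lines are copied from the c5n.18xlarge "Download 30Gb without Disk IO, with thread pinning" results above.

```python
import re
from statistics import mean

# Matches the leading fields of a benchmark line: run number, seconds, Gb/s, Mb/s.
RUN_RE = re.compile(r"Run:(\d+) Secs:([\d.]+) Gb/s:([\d.]+) Mb/s:([\d.]+)")


def parse_runs(text):
    """Return one (seconds, megabits_per_second) tuple per run line."""
    return [(float(m.group(2)), float(m.group(4))) for m in RUN_RE.finditer(text)]


def relative_change(baseline, candidate):
    """Fractional change of candidate relative to baseline (0.05 means +5%)."""
    return (candidate - baseline) / baseline


PINNED = """
Run:1 Secs:4.746 Gb/s:54.3 Mb/s:54296.1 GiB/s:6.3 MiB/s:6472.6
Run:2 Secs:3.430 Gb/s:75.1 Mb/s:75140.5 GiB/s:8.7 MiB/s:8957.4
Run:3 Secs:3.400 Gb/s:75.8 Mb/s:75783.1 GiB/s:8.8 MiB/s:9034.0
Run:4 Secs:3.020 Gb/s:85.3 Mb/s:85332.2 GiB/s:9.9 MiB/s:10172.4
Run:5 Secs:3.041 Gb/s:84.7 Mb/s:84739.8 GiB/s:9.9 MiB/s:10101.8
Run:6 Secs:3.691 Gb/s:69.8 Mb/s:69814.8 GiB/s:8.1 MiB/s:8322.6
Run:7 Secs:3.187 Gb/s:80.9 Mb/s:80860.2 GiB/s:9.4 MiB/s:9639.3
Run:8 Secs:3.411 Gb/s:75.5 Mb/s:75541.9 GiB/s:8.8 MiB/s:9005.3
Run:9 Secs:3.087 Gb/s:83.5 Mb/s:83468.8 GiB/s:9.7 MiB/s:9950.3
Run:10 Secs:3.243 Gb/s:79.5 Mb/s:79456.7 GiB/s:9.2 MiB/s:9472.0
"""

runs = parse_runs(PINNED)
mean_secs = mean(secs for secs, _ in runs)
print(f"runs={len(runs)} mean duration={mean_secs:.3f} s")

# Reported throughput means (Mb/s): pinned 75224.4, unpinned 70193.7.
print(f"unpinned vs pinned: {relative_change(75224.4, 70193.7) * 100:.1f}%")
```

Applied to the reported throughput means on c5n.18xlarge, unpinned relative to pinned comes out at roughly +15.0% (download with disk IO), -6.7% (download without disk IO), and -1.9% (upload). The changes go in both directions, and the run-to-run spread within each configuration is large.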

This PR doesn't address the use case of customers creating a pinned ELG. It only simplifies the S3 codebase until we have a better reason to integrate CPU group pinning into S3 than an assumed performance improvement. I will create a separate PR that updates the aws_thread_launch API to implement best-effort pinning to the requested CPU ID.
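
One possible shape for best-effort pinning, sketched in Python for brevity: try to pin, and if the OS rejects the CPU ID with EINVAL, carry on unpinned instead of failing. This only illustrates the idea and is not the aws_thread_launch implementation. `pin_best_effort`, its `set_affinity` parameter, and `fake_set_affinity` are hypothetical names. The default path assumes Linux, where `os.sched_setaffinity` is expected to raise `OSError` for a CPU the caller isn't permitted to use.

```python
import errno
import os


def pin_best_effort(cpu_id, set_affinity=None):
    """Try to pin the caller to cpu_id; return True if pinned, False if skipped.

    set_affinity is injectable so the fallback path can be exercised without
    touching real scheduler state. By default it calls os.sched_setaffinity,
    which is only available on Linux.
    """
    if set_affinity is None:
        def set_affinity(cpu):
            os.sched_setaffinity(0, {cpu})
    try:
        set_affinity(cpu_id)
        return True
    except OSError as err:
        if err.errno == errno.EINVAL:
            # The CPU is restricted or does not exist: run unpinned.
            return False
        raise  # Any other failure is still reported to the caller.


def fake_set_affinity(allowed):
    """Build a stand-in setter that rejects CPUs outside `allowed` (demo only)."""
    def setter(cpu):
        if cpu not in allowed:
            raise OSError(errno.EINVAL, "Invalid argument")
    return setter


# Pretend only CPUs 24 and 25 are available to the process.
setter = fake_set_affinity({24, 25})
for cpu in (24, 72):
    state = "pinned" if pin_best_effort(cpu, setter) else "left unpinned"
    print(f"cpu {cpu}: {state}")
```

The trade-off is that a caller can no longer assume a thread is pinned just because the launch succeeded, so a real API would likely need to log or otherwise report the fallback.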

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

codecov-commenter commented 7 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (237e9e1) 89.10% compared to head (ba87de4) 89.10%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #403      +/-   ##
==========================================
- Coverage   89.10%   89.10%   -0.01%
==========================================
  Files          21       21
  Lines        6169     6167       -2
==========================================
- Hits         5497     5495       -2
  Misses        672      672
```

| [Files](https://app.codecov.io/gh/awslabs/aws-c-s3/pull/403?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=awslabs) | Coverage Δ | |
|---|---|---|
| [source/s3\_client.c](https://app.codecov.io/gh/awslabs/aws-c-s3/pull/403?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=awslabs#diff-c291cmNlL3MzX2NsaWVudC5j) | `88.35% <100.00%> (+0.07%)` | :arrow_up: |

... and [1 file with indirect coverage changes](https://app.codecov.io/gh/awslabs/aws-c-s3/pull/403/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=awslabs)

jamesbornholt commented 7 months ago

could we benchmark on something like c5n.18xlarge that has NUMA and a 100Gbit NIC? that's where I would (naively) expect the pinning effect to show up, if anywhere.

waahm7 commented 7 months ago

@jamesbornholt Thanks, that's a great suggestion. I have added some benchmarks on a c5n.18xlarge and haven't observed any performance degradation.