humanmade / aws-xray

HM Platform AWS X-Ray Integration
23 stars 4 forks source link

Update sampling interval back to 5ms #88

Closed jerico closed 1 year ago

jerico commented 1 year ago

Before we introduced excimer support, we were sampling at 5ms. PR to support excimer changed it to 50ms (see: https://github.com/humanmade/aws-xray/commit/8f8dcfed5b3cec2eef2ad2c157e34272448a193f#diff-a0f318343a79085d556c9f6050c1261041075f5129f00e908fce122d89e02febL24 )

This PR reverts back sampling interval to 5ms.

There's no notable change on the performance.

Making sure local xray is running:

my-project-13-xray  | 2023-01-31T13:01:01Z [Info] Initializing AWS X-Ray daemon 3.3.3
my-project-13-xray  | 2023-01-31T13:01:01Z [Info] Using buffer memory limit of 39 MB
my-project-13-xray  | 2023-01-31T13:01:01Z [Info] 624 segment buffers allocated
my-project-13-xray  | 2023-01-31T13:01:01Z [Info] Using region: us-east-1
my-project-13-xray  | 2023-01-31T13:01:06Z [Error] Get instance id metadata failed: RequestError: send request failed
my-project-13-xray  | caused by: Get "http://169.254.169.254/latest/meta-data/instance-id": dial tcp 169.254.169.254:80: connect: connection refused
my-project-13-xray  | 2023-01-31T13:01:06Z [Info] HTTP Proxy server using X-Ray Endpoint : https://xray.us-east-1.amazonaws.com
my-project-13-xray  | 2023-01-31T13:01:06Z [Info] Starting proxy http server on 0.0.0.0:2000
my-project-13-xray  | 2023-01-31T13:04:19Z [Info] Successfully sent batch of 1 segments (1.217 seconds)
---truncated---

Before

Transactions:                500 hits
Availability:             100.00 %
Elapsed time:             435.82 secs
Data transferred:           3.68 MB
Response time:              0.87 secs
Transaction rate:           1.15 trans/sec
Throughput:             0.01 MB/sec
Concurrency:                1.00
Successful transactions:         500
Failed transactions:               0
Longest transaction:            3.54
Shortest transaction:           0.52

After

Transactions:                500 hits
Availability:             100.00 %
Elapsed time:             592.74 secs
Data transferred:           3.68 MB
Response time:              1.19 secs
Transaction rate:           0.84 trans/sec
Throughput:             0.01 MB/sec
Concurrency:                1.00
Successful transactions:         500
Failed transactions:               0
Longest transaction:            5.33
Shortest transaction:           0.58

https://github.com/humanmade/aws-xray/issues/86

rmccue commented 1 year ago

There's no notable change on the performance.

Maybe I'm missing something, but that's not how I'm reading the results; average response time went up from 0.87s to 1.19s, an increase of 0.32s/36%? That seems a pretty big difference though.

jerico commented 1 year ago

There's no notable change on the performance.

Maybe I'm missing something, but that's not how I'm reading the results; average response time went up from 0.87s to 1.19s, an increase of 0.32s/36%? That seems a pretty big difference though.

Sorry I averaged this in my head as "still around 1 second". That's where my comment came from.

Should I also do a test on how the original xhprof was performing? How should we know the threshold of acceptable range?

jerico commented 1 year ago

@rmccue I noticed that when I have another running task, it affects the response time.

I went ahead an re-ran the test. This time, more conscious of what other processes are running. For the following tests, I only have 1 instance of Altis project active, and no other apps aside from macOS system services running.

## 50ms

Transactions:                500 hits
Availability:             100.00 %
Elapsed time:             248.56 secs
Data transferred:           3.68 MB
Response time:              0.50 secs
Transaction rate:           2.01 trans/sec
Throughput:             0.01 MB/sec
Concurrency:                1.00
Successful transactions:         500
Failed transactions:               0
Longest transaction:            1.32
Shortest transaction:           0.41

## 5ms

Run #1

Transactions:                500 hits
Availability:             100.00 %
Elapsed time:             253.08 secs
Data transferred:           3.68 MB
Response time:              0.51 secs
Transaction rate:           1.98 trans/sec
Throughput:             0.01 MB/sec
Concurrency:                1.00
Successful transactions:         500
Failed transactions:               0
Longest transaction:            1.82
Shortest transaction:           0.42

Run #2

Transactions:                500 hits
Availability:             100.00 %
Elapsed time:             257.79 secs
Data transferred:           3.68 MB
Response time:              0.52 secs
Transaction rate:           1.94 trans/sec
Throughput:             0.01 MB/sec
Concurrency:                1.00
Successful transactions:         500
Failed transactions:               0
Longest transaction:            1.93
Shortest transaction:           0.42
rmccue commented 1 year ago

Cool, that looks more like what I'd expect, no real impact to response times.

jerico commented 1 year ago

Merging despite linter check failing as there's an existing PR to fix those: https://github.com/humanmade/aws-xray/pull/79