kiwicopple opened 3 years ago
Some initial results using k6 for a simple read query of the users table with just 10 records. Throughput is 324 req/second when run across 10 parallel threads for 30 seconds!
Run from my Mac locally. Need to run it on a less noisy server for more accurate numbers.
k6 seems like a good fit if we are hitting the API endpoints directly. If we want to test the functionality of supabase-js, like realtime subscriptions, k6 might not be a good fit: bundling in existing node_modules is not as straightforward as just importing them. But it can be done with some changes to the webpack config in supabase-js.
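For reference, a rough sketch of the kind of webpack config that makes node modules consumable from a k6 script (the entry path, output names, and loader setup are illustrative assumptions, not the exact changes needed in supabase-js):

```js
// webpack.config.js - illustrative sketch only; paths and loaders are assumptions
const path = require('path');

module.exports = {
  mode: 'production',
  entry: './src/test.js',            // the k6 script that imports supabase-js
  output: {
    path: path.resolve(__dirname, 'dist'),
    filename: 'test.bundle.js',
    libraryTarget: 'commonjs',       // k6 loads CommonJS-style bundles
  },
  module: {
    rules: [{ test: /\.js$/, use: 'babel-loader' }],  // transpile for k6's JS runtime
  },
  target: 'web',                     // k6's runtime is browser-like, not Node
  externals: /^(k6|https?:\/\/)(\/.*)?/,  // keep k6 built-ins (k6/http, ...) unbundled
};
```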
Initial results:
Some results using the get all: ``http.get(`${supabaseUrl}/rest/v1/read?select=*`, params)`` and the write: ``http.post(`${supabaseUrl}/rest/v1/write`, JSON.stringify(body), params)``
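For context, a minimal sketch of the k6 script shape behind these read/write calls (env var names, headers, and checks are my assumptions; the actual benchmarks run the read and write scenarios separately):

```js
// k6 read/write sketch - assumes SUPABASE_URL and SUPABASE_ANON_KEY env vars
import http from 'k6/http';
import { check } from 'k6';

export const options = { vus: 10, duration: '30s' };

const supabaseUrl = __ENV.SUPABASE_URL;
const params = {
  headers: {
    apikey: __ENV.SUPABASE_ANON_KEY,
    Authorization: `Bearer ${__ENV.SUPABASE_ANON_KEY}`,
    'Content-Type': 'application/json',
  },
};

export default function () {
  // read: GET all rows from the "read" table
  const res = http.get(`${supabaseUrl}/rest/v1/read?select=*`, params);
  check(res, { 'read ok': (r) => r.status === 200 });

  // write: insert one row into the "write" table
  const body = { value: `val-${Math.random()}` };
  const ins = http.post(`${supabaseUrl}/rest/v1/write`, JSON.stringify(body), params);
  check(ins, { 'write ok': (r) => r.status === 201 });
}
```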
EC2 vCPU credit mode is Standard unless otherwise stated
**t3a with 10 VUs over 30 seconds**

micro db & micro kps

| | 1 | 2 | 3 |
| --- | --- | --- | --- |
| Read | 1138/s | 1163/s | 1162/s |
| Write | 940/s | 806/s | 811/s |

nano db & nano kps

| | 1 | 2 | 3 |
| --- | --- | --- | --- |
| Read | 46/s | 44/s | 48/s |
| Write | 42/s | 41/s | 42/s |

micro db & nano kps

| | 1 | 2 | 3 |
| --- | --- | --- | --- |
| Read | 26/s | 1/s | 33/s |
| Write | 3/s | 1/s | 28/s |

nano db & micro kps

| | 1 | 2 | 3 |
| --- | --- | --- | --- |
| Read | 53/s | 53/s | 44/s |
| Write | 38/s | 50/s | 38/s |

medium db & medium kps *Unlimited Mode*

| | 1 | 2 | 3 |
| --- | --- | --- | --- |
| Read | 1049/s | 1064/s | 1065/s |
| Write | 746/s | 772/s | 735/s |

2xlarge db & 2xlarge kps

| | 1 | 2 | 3 |
| --- | --- | --- | --- |
| Read | 963/s | 994/s | 950/s |
| Write | 720/s | 721/s | 686/s |
**t3a with 32 VUs over 30 seconds**
2xlarge db & 2xlarge kps

| | 1 | 2 | 3 |
| --- | --- | --- | --- |
| Read | 722/s | 862/s | 829/s |
| Write | 428/s | 594/s | 605/s |
The nano scores for get all seem in line with what @steve-chavez achieved with t3a.nano.
t3a.2xlarge and more threads on k6 didn't seem to have an effect. I did notice that my CPU was maxing out at 50% for the k6 process, so parallel processes may be required to do more damage.
I will try again tomorrow with single inserts and single reads instead of `select=*`.
**2xlarge Standard Mode**
I ran k6 across two processes against 2xlarge instances and the overall throughput remained the same (actually less than previously, at 701/s reads), just to confirm that our k6 setup (# of VUs) wasn't the limiting factor.
I also repeated the original 2xlarge run against a new set of instances and saw the same results as last night.
**VUs**
I experimented more with the VUs and there doesn't seem to be a difference between 10 and 50 for the current setup
**single select vs select all**
I also changed the select to a single column/row get (`read?select=id&id=eq.blah`) and there was no improvement over `read?select=*`.
Just been running our k6 script against the equivalent Firestore REST API (Singapore datacentre).
Got 1540/s for a read / GET all (with one document).
You have to POST to do filtering, but they immediately started throttling me when I switched, so it was hard to get a true comparison. Will see if I can run it again later tomorrow.
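For comparison, roughly what the equivalent Firestore REST calls look like from k6 (project id, collection name, and auth handling are placeholders; the filtered case goes through `:runQuery`):

```js
// Firestore REST sketch - project id, collection and auth are placeholders
import http from 'k6/http';

const base =
  'https://firestore.googleapis.com/v1/projects/my-project/databases/(default)/documents';
const params = { headers: { 'Content-Type': 'application/json' } };

export default function () {
  // GET: list documents in a collection (the "read all" case)
  http.get(`${base}/read`, params);

  // filtering requires a POST to :runQuery with a structuredQuery
  const query = {
    structuredQuery: {
      from: [{ collectionId: 'read' }],
      where: {
        fieldFilter: {
          field: { fieldPath: 'id' },
          op: 'EQUAL',
          value: { stringValue: 'some-id' },
        },
      },
    },
  };
  http.post(`${base}:runQuery`, JSON.stringify(query), params);
}
```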
On review it appears that I maxed out my free quota for today:

- Stored data: 1 GB total
- Bandwidth: 10 GB/month
- Document writes: 20K/day
- Document reads: 50K/day <------ here
- Document deletes: 20K/day
**1 million rows/documents with random reads by equality on an indexed column. 30s, 10 VUs.**
Firestore
requests: 344/s http_req_duration: avg=28.6ms min=20.82ms med=26.48ms max=134.33ms
c4.xlarge
Supabase direct to IP
requests: 783/s http_req_duration: avg=12.52ms min=4.52ms med=10.47ms max=60.02ms
Supabase via DNS proxy
requests: 434/s http_req_duration: avg=22.67ms min=11.02ms med=19.83ms max=431.91ms
Supabase via DNS no proxy (no https)
requests: 775/s http_req_duration: avg=12.67ms min=4.54ms med=10.67ms max=60.29ms
t3a.micro Unlimited
Supabase direct to IP
requests: 1085/s http_req_duration: avg=9.05ms min=4.78ms med=8.65ms max=42.82ms
t3a.micro Standard
Supabase direct to IP
requests: 709/s http_req_duration: avg=13.9ms min=4.9ms med=8.76ms max=200.8ms
**Write rows/documents with random ids/values on an indexed column. 30s, 10 VUs.**
Firestore
requests: 259/s http_req_duration: avg=38.11ms min=24.59ms med=38.78ms max=350.67ms
c4.xlarge
Supabase direct to IP
requests: 649/s http_req_duration: avg=15.1ms min=5.3ms med=12.84ms max=57.79ms
Supabase via DNS proxy
requests: 411/s http_req_duration: avg=23.82ms min=11.9ms med=21.37ms max=270.74ms
Supabase via DNS no proxy (no https)
requests: 603/s http_req_duration: avg=16.33ms min=5.34ms med=12.93ms max=203.77ms
t3a.micro Unlimited
Supabase direct to IP
requests: 856/s http_req_duration: avg=11.45ms min=5.49ms med=10.7ms max=71.75ms
t3a.micro Standard
Supabase direct to IP
requests: 905/s http_req_duration: avg=10.82ms min=5.57ms med=10.32ms max=58.59ms
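A sketch of what the randomized-write iteration looks like (the id range and column names are assumptions):

```js
// randomized write sketch - id range and column names are assumptions
import http from 'k6/http';

export const options = { vus: 10, duration: '30s' };

const supabaseUrl = __ENV.SUPABASE_URL;
const params = {
  headers: { apikey: __ENV.SUPABASE_ANON_KEY, 'Content-Type': 'application/json' },
};

export default function () {
  const body = {
    id: Math.floor(Math.random() * 1000000),   // random value on the indexed column
    value: `val-${Math.random()}`,
  };
  http.post(`${supabaseUrl}/rest/v1/write`, JSON.stringify(body), params);
}
```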
Noticed that after inserting 1 million rows, the supabase (beam) process was using ~100% CPU and ~50% memory for quite a few minutes afterwards. It makes sense, but I thought it was worth highlighting as a limitation of the free tier when processing millions of rows per minute, although that's quite an extreme use case for the free tier:
```
PID  USER      PR  NI    VIRT    RES   SHR S  %CPU %MEM   TIME+   COMMAND
1899 supabase  20   0 2662604 479564  2520 S  99.7 48.9  5:49.22  beam.smp
```
We can also constrain each application to a set CPU limit on KPS, I believe; may be worth looking into.
I was only getting ~50 reads per second whilst realtime was hogging the CPU/RAM; once I killed it I was back to:
data_received..............: 2.7 MB 90 kB/s
data_sent..................: 3.2 MB 108 kB/s
✓ failed requests............: 0.00% ✓ 0 ✗ 7149
http_req_blocked...........: avg=219.45µs min=0s med=2µs max=94.42ms p(90)=5µs p(95)=7µs
http_req_connecting........: avg=215.21µs min=0s med=0s max=91.91ms p(90)=0s p(95)=0s
✗ http_req_duration..........: avg=41.69ms min=4.81ms med=9.42ms max=600.83ms p(90)=97.6ms p(95)=99.52ms
http_req_receiving.........: avg=47.2µs min=13µs med=40µs max=1.16ms p(90)=80µs p(95)=99µs
http_req_sending...........: avg=20.03µs min=5µs med=16µs max=3.37ms p(90)=32µs p(95)=42µs
http_req_tls_handshaking...: avg=0s min=0s med=0s max=0s p(90)=0s p(95)=0s
http_req_waiting...........: avg=41.63ms min=4.72ms med=9.36ms max=600.75ms p(90)=97.49ms p(95)=99.42ms
http_reqs..................: 7149 237.589165/s
iteration_duration.........: avg=42.02ms min=1.26µs med=9.57ms max=600.97ms p(90)=97.86ms p(95)=99.81ms
iterations.................: 7149 237.589165/s
vus........................: 10 min=10 max=10
vus_max....................: 10 min=10 max=10
However, it's still less than what I was seeing previously for this test:
currently working off branch: https://github.com/supabase/benchmarks/pull/13
@steve-chavez have you benchmarked postgrest (nightly) recently? We're currently on https://github.com/PostgREST/postgrest/releases/download/nightly/postgrest-nightly-2020-11-10-22-47-3830887-linux-x64-static.tar.xz
@awalias Hmm.. weird. If anything there should be an increase in performance with the nightly because it reduced verbose logging (writing to disk without buffering).
I've just run the GETSingle load test on t3a.nano with that nightly - recreating the VPC and all - and the improvement holds over v7.0.1:
data_received..............: 13 MB 444 kB/s
data_sent..................: 4.9 MB 163 kB/s
dropped_iterations.........: 360 11.977949/s
✓ failed requests............: 0.00% ✓ 0 ✗ 47641
http_req_blocked...........: avg=8.48µs min=1.61µs med=3.58µs max=10.55ms p(90)=6.59µs p(95)=26.23µs
http_req_connecting........: avg=1.22µs min=0s med=0s max=10.3ms p(90)=0s p(95)=0s
✓ http_req_duration..........: avg=5.64ms min=850.74µs med=1.79ms max=190.7ms p(90)=10.85ms p(95)=26.43ms
http_req_receiving.........: avg=124.99µs min=12.52µs med=37.65µs max=65.94ms p(90)=94.6µs p(95)=204.12µs
http_req_sending...........: avg=305.86µs min=7.23µs med=27.89µs max=68.91ms p(90)=63.14µs p(95)=211.64µs
http_req_tls_handshaking...: avg=0s min=0s med=0s max=0s p(90)=0s p(95)=0s
http_req_waiting...........: avg=5.21ms min=781.43µs med=1.68ms max=148.21ms p(90)=10.19ms p(95)=23.39ms
http_reqs..................: 47641 1585.115189/s
iteration_duration.........: avg=5.82ms min=937.5µs med=1.96ms max=190.96ms p(90)=11.11ms p(95)=26.77ms
iterations.................: 47641 1585.115189/s
vus........................: 111 min=104 max=111
vus_max....................: 111 min=104 max=111
1585.115189/s vs the previous 1372.484403/s.
Granted, the chinook db doesn't have a million rows. I'm going to try and see if there's something else going on. First off, I'll change the load tests to use randomized id searches.
Edit: randomized the ids in the postgrest GET load tests(https://github.com/supabase/benchmarks/pull/14). No significant changes in the results.
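Roughly, a randomized GETSingle iteration looks like this (assuming sequential integer ids up to the row count; the actual scripts live in supabase/benchmarks):

```js
// randomized GETSingle sketch - assumes sequential integer ids up to rowCount
import http from 'k6/http';

const baseUrl = __ENV.BASE_URL;   // PostgREST / Supabase REST endpoint
const rowCount = 1000000;

export default function () {
  const id = Math.floor(Math.random() * rowCount) + 1;
  http.get(`${baseUrl}/read?select=*&id=eq.${id}`);
}
```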
@awalias I ran load tests with your 1 mil rows schema
(On the t3a.nano, GETSingle, change to the postgrest benchmark setup here):
data_received..............: 12 MB 388 kB/s
data_sent..................: 4.1 MB 138 kB/s
dropped_iterations.........: 731 24.317469/s
✓ failed requests............: 0.00% ✓ 0 ✗ 41870
http_req_blocked...........: avg=9.6µs min=1.62µs med=3.69µs max=22.46ms p(90)=5.46µs p(95)=11.53µs
http_req_connecting........: avg=3.63µs min=0s med=0s max=22.2ms p(90)=0s p(95)=0s
✓ http_req_duration..........: avg=46.64ms min=991.33µs med=44.48ms max=170.51ms p(90)=90.08ms p(95)=102.66ms
http_req_receiving.........: avg=125.3µs min=12.16µs med=35.19µs max=35.41ms p(90)=87.29µs p(95)=251.94µs
http_req_sending...........: avg=291µs min=7.66µs med=28.92µs max=42.08ms p(90)=63.83µs p(95)=232.98µs
http_req_tls_handshaking...: avg=0s min=0s med=0s max=0s p(90)=0s p(95)=0s
http_req_waiting...........: avg=46.22ms min=918.7µs med=44.24ms max=170.04ms p(90)=89.53ms p(95)=101.48ms
http_reqs..................: 41870 1392.848744/s
iteration_duration.........: avg=46.82ms min=1.12ms med=44.66ms max=170.72ms p(90)=90.26ms p(95)=102.93ms
iterations.................: 41870 1392.848744/s
vus........................: 175 min=110 max=175
vus_max....................: 175 min=110 max=175
1392.848744/s, similar to the numbers I got for the chinook db. This shows that the number of rows shouldn't make a difference as long as the searched id is indexed.
data_received..............: 14 MB 479 kB/s
data_sent..................: 5.3 MB 176 kB/s
dropped_iterations.........: 431 14.343359/s
✓ failed requests............: 0.00% ✓ 0 ✗ 53569
http_req_blocked...........: avg=7.78µs min=1.23µs med=3.42µs max=59.1ms p(90)=5.01µs p(95)=8.36µs
http_req_connecting........: avg=1.99µs min=0s med=0s max=30.42ms p(90)=0s p(95)=0s
✓ http_req_duration..........: avg=9.55ms min=837.17µs med=2.43ms max=177.99ms p(90)=29.61ms p(95)=50.49ms
http_req_receiving.........: avg=77.4µs min=12.96µs med=35.66µs max=26.07ms p(90)=84.76µs p(95)=195.58µs
http_req_sending...........: avg=171.93µs min=7.44µs med=28.06µs max=36.49ms p(90)=56.43µs p(95)=118.24µs
http_req_tls_handshaking...: avg=0s min=0s med=0s max=0s p(90)=0s p(95)=0s
http_req_waiting...........: avg=9.3ms min=789.53µs med=2.35ms max=177.84ms p(90)=28.54ms p(95)=49.8ms
http_reqs..................: 53569 1782.736413/s
iteration_duration.........: avg=9.7ms min=937.84µs med=2.57ms max=178.6ms p(90)=29.79ms p(95)=50.69ms
iterations.................: 53569 1782.736413/s
vus........................: 123 min=104 max=123
vus_max....................: 123 min=104 max=123
1782.736413/s. This shows that the logging improvement also holds on the read schema.
Could the problem be somewhere else? Or perhaps the reads you ran were not filtered?
Think I've figured out the issue with the reduced performance: Cloudflare proxy performance has degraded dramatically.
`$ top` on postgres
`$ top` on kps
data_received..............: 1.4 MB 46 kB/s
data_sent..................: 360 kB 12 kB/s
✓ failed requests............: 0.00% ✓ 0 ✗ 8894
http_req_blocked...........: avg=41.35µs min=0s med=1µs max=142.42ms p(90)=1µs p(95)=2µs
http_req_connecting........: avg=7.02µs min=0s med=0s max=9.87ms p(90)=0s p(95)=0s
✗ http_req_duration..........: avg=33.5ms min=16.33ms med=26.94ms max=986.82ms p(90)=45.47ms p(95)=62.8ms
http_req_receiving.........: avg=325.79µs min=15µs med=250µs max=70.99ms p(90)=607.6µs p(95)=781.29µs
http_req_sending...........: avg=65.86µs min=16µs med=59µs max=319µs p(90)=100.6µs p(95)=114µs
http_req_tls_handshaking...: avg=32.92µs min=0s med=0s max=131.26ms p(90)=0s p(95)=0s
http_req_waiting...........: avg=33.1ms min=15.79ms med=26.55ms max=986.32ms p(90)=45.09ms p(95)=62.26ms
http_reqs..................: 8895 292.35869/s
iteration_duration.........: avg=33.77ms min=16.56ms med=27.16ms max=1s p(90)=45.68ms p(95)=63.07ms
iterations.................: 8894 292.325823/s
vus........................: 10 min=10 max=10
vus_max....................: 10 min=10 max=10
data_received..............: 12 MB 414 kB/s
data_sent..................: 15 MB 504 kB/s
✓ failed requests............: 0.00% ✓ 0 ✗ 33930
http_req_blocked...........: avg=59.83µs min=0s med=2µs max=176.3ms p(90)=3µs p(95)=5µs
http_req_connecting........: avg=56.85µs min=0s med=0s max=176.22ms p(90)=0s p(95)=0s
✓ http_req_duration..........: avg=8.69ms min=4.46ms med=7.45ms max=176.79ms p(90)=10.17ms p(95)=11.47ms
http_req_receiving.........: avg=39.47µs min=14µs med=36µs max=1.55ms p(90)=55µs p(95)=67µs
http_req_sending...........: avg=16.38µs min=5µs med=14µs max=604µs p(90)=23µs p(95)=30µs
http_req_tls_handshaking...: avg=0s min=0s med=0s max=0s p(90)=0s p(95)=0s
http_req_waiting...........: avg=8.63ms min=4.41ms med=7.39ms max=176.62ms p(90)=10.11ms p(95)=11.4ms
http_reqs..................: 33931 1129.164115/s
iteration_duration.........: avg=8.83ms min=4.53ms med=7.54ms max=194.58ms p(90)=10.4ms p(95)=11.81ms
iterations.................: 33930 1129.130837/s
vus........................: 10 min=10 max=10
vus_max....................: 10 min=10 max=10
data_received..............: 4.9 MB 160 kB/s
data_sent..................: 503 kB 17 kB/s
✓ failed requests............: 0.00% ✓ 0 ✗ 12360
http_req_blocked...........: avg=26.75µs min=0s med=1µs max=142.13ms p(90)=1µs p(95)=1µs
http_req_connecting........: avg=5.7µs min=0s med=0s max=10.04ms p(90)=0s p(95)=0s
✗ http_req_duration..........: avg=24.07ms min=10.92ms med=18.92ms max=481.97ms p(90)=33.15ms p(95)=47.27ms
http_req_receiving.........: avg=306.08µs min=17µs med=217µs max=171.31ms p(90)=620µs p(95)=802µs
http_req_sending...........: avg=55.82µs min=14µs med=50µs max=494µs p(90)=86µs p(95)=100µs
http_req_tls_handshaking...: avg=20.09µs min=0s med=0s max=131.26ms p(90)=0s p(95)=0s
http_req_waiting...........: avg=23.71ms min=10.6ms med=18.58ms max=481.28ms p(90)=32.79ms p(95)=46.74ms
http_reqs..................: 12361 407.226379/s
iteration_duration.........: avg=24.29ms min=11.14ms med=19.1ms max=482.16ms p(90)=33.34ms p(95)=47.66ms
iterations.................: 12360 407.193434/s
vus........................: 10 min=10 max=10
vus_max....................: 10 min=10 max=10
data_received..............: 14 MB 448 kB/s
data_sent..................: 17 MB 573 kB/s
✓ failed requests............: 0.00% ✓ 0 ✗ 36742
http_req_blocked...........: avg=46.8µs min=0s med=2µs max=26.69ms p(90)=3µs p(95)=4µs
http_req_connecting........: avg=43.2µs min=0s med=0s max=19.48ms p(90)=0s p(95)=0s
✓ http_req_duration..........: avg=8.03ms min=4.56ms med=7.66ms max=64.81ms p(90)=10.25ms p(95)=11.35ms
http_req_receiving.........: avg=38.79µs min=12µs med=36µs max=914µs p(90)=53µs p(95)=64µs
http_req_sending...........: avg=16.39µs min=5µs med=14µs max=493µs p(90)=23µs p(95)=28µs
http_req_tls_handshaking...: avg=0s min=0s med=0s max=0s p(90)=0s p(95)=0s
http_req_waiting...........: avg=7.97ms min=4.51ms med=7.61ms max=64.77ms p(90)=10.18ms p(95)=11.29ms
http_reqs..................: 36743 1221.9875/s
iteration_duration.........: avg=8.16ms min=4.62ms med=7.76ms max=64.88ms p(90)=10.47ms p(95)=11.64ms
iterations.................: 36742 1221.954243/s
vus........................: 10 min=10 max=10
vus_max....................: 10 min=10 max=10
data_received..............: 16 MB 547 kB/s
data_sent..................: 13 MB 424 kB/s
✓ failed requests............: 0.00% ✓ 0 ✗ 44766
http_req_blocked...........: avg=46.85µs min=0s med=2µs max=21.42ms p(90)=3µs p(95)=4µs
http_req_connecting........: avg=44.05µs min=0s med=0s max=13ms p(90)=0s p(95)=0s
✓ http_req_duration..........: avg=6.58ms min=4.02ms med=6.28ms max=213.2ms p(90)=8.17ms p(95)=8.93ms
http_req_receiving.........: avg=31.55µs min=11µs med=28µs max=998µs p(90)=47µs p(95)=57µs
http_req_sending...........: avg=12.63µs min=4µs med=11µs max=487µs p(90)=18µs p(95)=23µs
http_req_tls_handshaking...: avg=0s min=0s med=0s max=0s p(90)=0s p(95)=0s
http_req_waiting...........: avg=6.53ms min=3.98ms med=6.24ms max=213.16ms p(90)=8.12ms p(95)=8.88ms
http_reqs..................: 44767 1489.492915/s
iteration_duration.........: avg=6.69ms min=4.1ms med=6.36ms max=213.27ms p(90)=8.33ms p(95)=9.23ms
iterations.................: 44766 1489.459643/s
vus........................: 10 min=10 max=10
vus_max....................: 10 min=10 max=10
data_received..............: 264 kB 8.5 kB/s
data_sent..................: 70 kB 2.3 kB/s
✓ failed requests............: 0.00% ✓ 0 ✗ 1306
http_req_blocked...........: avg=209.3µs min=290ns med=420ns max=164.07ms p(90)=520ns p(95)=560ns
http_req_connecting........: avg=15.13µs min=0s med=0s max=2.4ms p(90)=0s p(95)=0s
✗ http_req_duration..........: avg=230.49ms min=195.79ms med=203.93ms max=597ms p(90)=387.16ms p(95)=393.61ms
http_req_receiving.........: avg=63.54µs min=18.98µs med=54.9µs max=1.83ms p(90)=79.74µs p(95)=92.03µs
http_req_sending...........: avg=36.96µs min=21.01µs med=32.14µs max=295.53µs p(90)=52.63µs p(95)=57.88µs
http_req_tls_handshaking...: avg=188.44µs min=0s med=0s max=156.57ms p(90)=0s p(95)=0s
http_req_waiting...........: avg=230.38ms min=195.71ms med=203.84ms max=596.82ms p(90)=387.07ms p(95)=393.43ms
http_reqs..................: 1307 42.336073/s
iteration_duration.........: avg=230.84ms min=195.89ms med=204.08ms max=605.34ms p(90)=387.32ms p(95)=393.77ms
iterations.................: 1306 42.303681/s
vus........................: 10 min=10 max=10
vus_max....................: 10 min=10 max=10
data_received..............: 578 kB 19 kB/s
data_sent..................: 738 kB 24 kB/s
✓ failed requests............: 0.00% ✓ 0 ✗ 1573
http_req_blocked...........: avg=2.48ms min=1.12µs med=2.27µs max=188.39ms p(90)=3.46µs p(95)=6.06µs
http_req_connecting........: avg=2.47ms min=0s med=0s max=187.7ms p(90)=0s p(95)=0s
✗ http_req_duration..........: avg=188.56ms min=185.99ms med=188.35ms max=211.07ms p(90)=190ms p(95)=190.52ms
http_req_receiving.........: avg=48.11µs min=19.47µs med=45.13µs max=238.32µs p(90)=67.07µs p(95)=81.16µs
http_req_sending...........: avg=21.33µs min=7.59µs med=14.92µs max=238.94µs p(90)=36.9µs p(95)=42.05µs
http_req_tls_handshaking...: avg=0s min=0s med=0s max=0s p(90)=0s p(95)=0s
http_req_waiting...........: avg=188.49ms min=185.92ms med=188.29ms max=211.01ms p(90)=189.93ms p(95)=190.42ms
http_reqs..................: 1574 51.413082/s
iteration_duration.........: avg=191.15ms min=186.11ms med=188.5ms max=380.77ms p(90)=190.15ms p(95)=190.81ms
iterations.................: 1573 51.380418/s
vus........................: 10 min=10 max=10
vus_max....................: 10 min=10 max=10
Wow, that's a crazy difference. Our customer boxes aren't running on Unlimited though, right?
without logflare worker
Is the logflare worker actually a culprit? I'm guessing as soon as you turn off the proxy, the worker stops too?
Unlimited gets turned off a bit after instance startup.
Logflare worker is "just" costing us 115 req / sec. The major impact seems to be enabling Cloudflare.
Chore
As part of our move from Alpha to Beta, we will want to be clear about the limitations and performance of each of the components in Supabase.
In this benchmark, we want to load test the full system including Middleware + Postgres.
We need to do 3 things: