anjor commented 1 year ago

Proposal: Estuary performance testing

Author	Anjor
Status	Draft
Revision

This is a WIP

Proposal/Overview

We should have metrics on Estuary's data onboarding performance. We should be able to answer questions such as:

What is the data throughput? How does it scale with increasing data size? Is there a sweet spot?
What is the maximum upload size Estuary can handle?

The current plan is to set up datasets of increasing size, from 1 GB up to 1 TB, and measure the data onboarding performance for each.

Technical Design

The performance testing will be carried out on an Equinix Metal server. We will download public datasets ranging in size from 1 GB up to 1 TB and upload them to Estuary, timing each upload.
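A minimal sketch of how each upload could be timed, assuming three attempts per dataset as in the results reported later in this thread. The `upload` callable is hypothetical: it stands in for whatever actually pushes the file to Estuary (an HTTP client, a CLI wrapper, etc.) and is not part of any Estuary API.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def time_upload(upload: Callable[[str], None], path: str, attempts: int = 3) -> List[float]:
    """Call upload(path) `attempts` times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(attempts):
        start = time.perf_counter()
        upload(path)  # hypothetical: performs the real upload to Estuary
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, object]:
    """Bundle the per-attempt times with their arithmetic mean."""
    return {"attempts": list(durations), "average": mean(durations)}
```

Passing the upload step in as a callable keeps the timing logic independent of the upload mechanism, so the same harness could be reused if a different preparation tool is swapped in.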

Known problems

Files larger than 32 GB might have issues. If the upload endpoint cannot handle a file, we will try data preparation tools such as barge and singularity instead.

anjor commented 1 year ago

The end goal here is to have a full end-to-end data onboarding story fleshed out.

anjor commented 1 year ago

Some initial results.

Size (GB)	Attempt 1 (s)	Attempt 2 (s)	Attempt 3 (s)	Average (s)
1.8	47	41	41	43
3.6	91	86	98	91.67
5.4	142	149	130	140.33
7.2	173	162	176	170.33
9	215	221	223	219.67
18	439	415	429	427.67
27	655	617	664	645.33

Figure: Estuary performance

The above test was carried out from a c3.small.x86 server in the Silicon Valley region of Equinix Metal. Uploads were tested against shuttle-4 because of its proximity (shuttle-1 had content adding disabled).
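To relate these timings to the throughput question in the proposal, the averages can be converted to MB/s. This is only a sketch: it assumes 1 GB = 1000 MB (the thread does not say whether sizes are decimal or binary), and the function names are illustrative.

```python
# (size in GB, [attempt times in seconds]) copied from the shuttle-4 table above
SHUTTLE_4 = [
    (1.8, [47, 41, 41]),
    (3.6, [91, 86, 98]),
    (5.4, [142, 149, 130]),
    (7.2, [173, 162, 176]),
    (9.0, [215, 221, 223]),
    (18.0, [439, 415, 429]),
    (27.0, [655, 617, 664]),
]


def throughput_mb_per_s(size_gb, seconds):
    """Convert a size and duration to MB/s, assuming 1 GB = 1000 MB."""
    return size_gb * 1000 / seconds


def average_throughputs(results):
    """Return (size, MB/s) pairs using the mean time over all attempts."""
    return [
        (size, throughput_mb_per_s(size, sum(times) / len(times)))
        for size, times in results
    ]


for size, mbps in average_throughputs(SHUTTLE_4):
    print(f"{size:>5} GB  {mbps:6.1f} MB/s")
```

Under that assumption the averages work out to roughly 38 to 42 MB/s across all seven sizes, which suggests upload time scales close to linearly with size in the 1.8 to 27 GB range.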

anjor commented 1 year ago

Results for shuttle-7

Size (GB)	Attempt 1 (s)	Attempt 2 (s)	Attempt 3 (s)	Average (s)
1.8	104	115	113	110.67
3.6	207	202	241	216.67
5.4	314	335	320	323
7.2	443	417	494	451.33
9	550	522	480	517.33
18	1054	1056	1103	1071
27	1569	1441	1955	1655

Figure: Estuary performance - shuttle-7

The above test was carried out from a c3.small.x86 server in the Dallas region of Equinix Metal. Uploads were tested against shuttle-7 because of its proximity.
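For a rough side-by-side view, the average times for the two shuttles can be compared per dataset size. This is a sketch over the numbers posted in this thread only. The two runs used different client regions (Silicon Valley and Dallas), so the ratio reflects the whole client-to-shuttle path, not the shuttles in isolation.

```python
# Attempt times in seconds keyed by dataset size in GB, copied from the two tables
SHUTTLE_4 = {
    1.8: [47, 41, 41], 3.6: [91, 86, 98], 5.4: [142, 149, 130],
    7.2: [173, 162, 176], 9.0: [215, 221, 223],
    18.0: [439, 415, 429], 27.0: [655, 617, 664],
}
SHUTTLE_7 = {
    1.8: [104, 115, 113], 3.6: [207, 202, 241], 5.4: [314, 335, 320],
    7.2: [443, 417, 494], 9.0: [550, 522, 480],
    18.0: [1054, 1056, 1103], 27.0: [1569, 1441, 1955],
}


def mean_time(times):
    """Arithmetic mean of the attempt times."""
    return sum(times) / len(times)


def slowdown(baseline, other):
    """Per-size ratio of mean upload time: other / baseline."""
    return {
        size: mean_time(other[size]) / mean_time(baseline[size])
        for size in baseline
    }


for size, ratio in slowdown(SHUTTLE_4, SHUTTLE_7).items():
    print(f"{size:>5} GB  shuttle-7 took {ratio:.2f}x the shuttle-4 time")
```

On these numbers the shuttle-7 runs took roughly 2.3 to 2.65 times as long as the shuttle-4 runs at every size.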

application-research / outercore-eng-kb

Estuary performance testing #8
