cockroachdb / cockroach


cloud/gcp: reading from gcs is slow #124767

Open dt opened 6 months ago

dt commented 6 months ago

The new benchmark added in https://github.com/cockroachdb/cockroach/pull/124744 shows that we're much slower at reading bytes from a file on GCS than from a file on S3. We should dig into our wrapper, the SDK client settings, and the buffer sizes to figure out why GCS performance is so much worse than S3 performance:

ObjStorageCopyGCS/size=8.0_MiB-4   43.02 MiB/s
ObjStorageCopyS3/size=8.0_MiB-4    80.89 MiB/s

ObjStorageCopyGCS/size=32_MiB-4    45.09 MiB/s
ObjStorageCopyS3/size=32_MiB-4     88.40 MiB/s

ObjStorageCopyGCS/size=64_MiB-4    46.44 MiB/s
ObjStorageCopyS3/size=64_MiB-4     89.90 MiB/s

Jira issue: CRDB-39069

Epic CRDB-40359

blathers-crl[bot] commented 6 months ago

Hi @dt, please add branch-* labels to identify which branch(es) this C-bug affects.

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

msbutler commented 5 months ago

@itsbilal one thing to note: we've been meaning to update the GCS client library. Perhaps that could help, though I have no idea.

jeffswenson commented 1 month ago

@dt when you ran this test, where was your client located?

jeffswenson commented 1 month ago

Looking through recent changes to the GCS package, the direct connect optimizations may be relevant to DR and disaggregated storage. There appears to be a gRPC client, and using it lets ops bypass one layer of proxies.

https://github.com/googleapis/google-cloud-go/pull/10859/files

dt commented 1 month ago

I don't recall where/how I ran this originally. I just re-ran it from a MacBook in the NYC office, with various VPN/traffic routing tools disabled, using ./dev bench -v --stream-output --count 6 --bench-time=1x ./pkg/storage -f BenchmarkObjStorage --timeout 15m -- --test_env=COCKROACH_BENCHMARK_REMOTE_SSTS=1 | tee bench.txt and got terrible throughput (~3 MB/s) from both.

jeffswenson commented 1 month ago

Okay, I'll see if there's a way to run this on roachprod vms in the same cloud region as each of the buckets. That's the only fair way I can think to run this.

dt commented 1 month ago

Being lazy, I just ran it again on my laptop (and you could claim this is fair-ish, since it isn't directly in either's region, and it is what, say, a node in a datacenter in Secaucus might expect if it backed up to S3/GCS in us-east). One thing I noticed is that the test is slightly flawed: it does not call rh.SetupForCompaction() on the readHandle. This, combined with Copy() passing a mere 256 KB buffer, means that until readahead kicks in we're doing some pretty small reads. If I do my own "download" loop over ReadAt+discard with different sized buffers, side-by-side with objstorage.Copy(), we see that reading into a 4 MB/8 MB/16 MB buffer to discard greatly improves the throughput of both SDKs:

GCS/raw_buf=1.0_MiB/size=4.0_KiB   39.06KB/s        S3/raw_buf=1.0_MiB/size=4.0_KiB    78.12KB/s
GCS/raw_buf=1.0_MiB/size=64_KiB    722.7KB/s        S3/raw_buf=1.0_MiB/size=64_KiB     1.440MB/s
GCS/raw_buf=1.0_MiB/size=1.0_MiB   10.68MB/s        S3/raw_buf=1.0_MiB/size=1.0_MiB    24.61MB/s
GCS/raw_buf=1.0_MiB/size=8.0_MiB   8.221MB/s        S3/raw_buf=1.0_MiB/size=8.0_MiB    16.75MB/s
GCS/raw_buf=1.0_MiB/size=32_MiB    5.617MB/s        S3/raw_buf=1.0_MiB/size=32_MiB     12.75MB/s
GCS/raw_buf=1.0_MiB/size=64_MiB    7.505MB/s        S3/raw_buf=1.0_MiB/size=64_MiB     11.80MB/s

GCS/raw_buf=4.0_MiB/size=4.0_KiB   48.83KB/s        S3/raw_buf=4.0_MiB/size=4.0_KiB    87.89KB/s
GCS/raw_buf=4.0_MiB/size=64_KiB    752.0KB/s        S3/raw_buf=4.0_MiB/size=64_KiB     1.364MB/s
GCS/raw_buf=4.0_MiB/size=1.0_MiB   11.42MB/s        S3/raw_buf=4.0_MiB/size=1.0_MiB    22.75MB/s
GCS/raw_buf=4.0_MiB/size=8.0_MiB   24.23MB/s        S3/raw_buf=4.0_MiB/size=8.0_MiB    31.59MB/s
GCS/raw_buf=4.0_MiB/size=32_MiB    20.02MB/s        S3/raw_buf=4.0_MiB/size=32_MiB     27.06MB/s
GCS/raw_buf=4.0_MiB/size=64_MiB    19.78MB/s        S3/raw_buf=4.0_MiB/size=64_MiB     27.16MB/s

GCS/raw_buf=8.0_MiB/size=4.0_KiB   39.06KB/s        S3/raw_buf=8.0_MiB/size=4.0_KiB    87.89KB/s
GCS/raw_buf=8.0_MiB/size=64_KiB    703.1KB/s        S3/raw_buf=8.0_MiB/size=64_KiB     1.411MB/s
GCS/raw_buf=8.0_MiB/size=1.0_MiB   10.94MB/s        S3/raw_buf=8.0_MiB/size=1.0_MiB    22.79MB/s
GCS/raw_buf=8.0_MiB/size=8.0_MiB   28.07MB/s        S3/raw_buf=8.0_MiB/size=8.0_MiB    36.17MB/s
GCS/raw_buf=8.0_MiB/size=32_MiB    26.95MB/s        S3/raw_buf=8.0_MiB/size=32_MiB     21.695MB/s
GCS/raw_buf=8.0_MiB/size=64_MiB    27.57MB/s        S3/raw_buf=8.0_MiB/size=64_MiB     25.77MB/s

GCS/raw_buf=16_MiB/size=4.0_KiB    48.83KB/s        S3/raw_buf=16_MiB/size=4.0_KiB     78.12KB/s
GCS/raw_buf=16_MiB/size=64_KiB     830.1KB/s        S3/raw_buf=16_MiB/size=64_KiB      1.564MB/s
GCS/raw_buf=16_MiB/size=1.0_MiB    13.17MB/s        S3/raw_buf=16_MiB/size=1.0_MiB     21.32MB/s
GCS/raw_buf=16_MiB/size=8.0_MiB    30.45MB/s        S3/raw_buf=16_MiB/size=8.0_MiB     41.66MB/s
GCS/raw_buf=16_MiB/size=32_MiB     35.93MB/s        S3/raw_buf=16_MiB/size=32_MiB      35.42MB/s
GCS/raw_buf=16_MiB/size=64_MiB     37.62MB/s        S3/raw_buf=16_MiB/size=64_MiB      37.41MB/s

GCS/raw_buf=32_MiB/size=4.0_KiB    48.83KB/s        S3/raw_buf=32_MiB/size=4.0_KiB     87.89KB/s
GCS/raw_buf=32_MiB/size=64_KiB     800.8KB/s        S3/raw_buf=32_MiB/size=64_KiB      1.507MB/s
GCS/raw_buf=32_MiB/size=1.0_MiB    11.37MB/s        S3/raw_buf=32_MiB/size=1.0_MiB     23.26MB/s
GCS/raw_buf=32_MiB/size=8.0_MiB    30.24MB/s        S3/raw_buf=32_MiB/size=8.0_MiB     40.11MB/s
GCS/raw_buf=32_MiB/size=32_MiB     47.76MB/s        S3/raw_buf=32_MiB/size=32_MiB      38.41MB/s
GCS/raw_buf=32_MiB/size=64_MiB     44.62MB/s        S3/raw_buf=32_MiB/size=64_MiB      34.47MB/s

Adding a rh.SetupForCompaction() call to the existing bench directly before objstorage.Copy brings the two results up to roughly equal, around 25-27 MB/s for the 32 MB SST:

ObjStorageCopyGCS/objstorageCopy/size=4.0_KiB     9.766Ki ± ∞ ¹         ObjStorageCopyS3/objstorageCopy/size=4.0_KiB      9.766Ki ± ∞ ¹
ObjStorageCopyGCS/objstorageCopy/size=64_KiB      195.3Ki ± ∞ ¹         ObjStorageCopyS3/objstorageCopy/size=64_KiB       244.1Ki ± ∞ ¹
ObjStorageCopyGCS/objstorageCopy/size=1.0_MiB     2.737Mi ± ∞ ¹         ObjStorageCopyS3/objstorageCopy/size=1.0_MiB      3.357Mi ± ∞ ¹
ObjStorageCopyGCS/objstorageCopy/size=8.0_MiB     22.95Mi ± ∞ ¹         ObjStorageCopyS3/objstorageCopy/size=8.0_MiB      25.17Mi ± ∞ ¹
ObjStorageCopyGCS/objstorageCopy/size=32_MiB      27.68Mi ± ∞ ¹         ObjStorageCopyS3/objstorageCopy/size=32_MiB       29.42Mi ± ∞ ¹
ObjStorageCopyGCS/objstorageCopy/size=64_MiB      28.00Mi ± ∞ ¹         ObjStorageCopyS3/objstorageCopy/size=64_MiB       29.77Mi ± ∞ ¹