googleapis / google-cloud-dotnet

Google Cloud Client Libraries for .NET
https://cloud.google.com/dotnet/docs/reference/
Apache License 2.0

Slow Cloud Storage response times (?) #882

Closed markvincze closed 7 years ago

markvincze commented 7 years ago

In an ASP.NET Core application I'm using the Google.Cloud.Storage.V1 SDK to download files from Cloud Storage. The files are relatively small, 1-2 KB each. I'm measuring the time each download takes, and even for the same file it varies a lot: anywhere between 50 and 500 ms, fairly uniformly distributed. My application runs in Google Container Engine.

This is the code I'm measuring:

using(var ms = new MemoryStream())
{
    var sw = Stopwatch.StartNew();
    await client.DownloadObjectAsync("mybucket", "myfile", ms);
    sw.Stop();

    Log(sw.Elapsed);
}

These are the last few measurements I recorded: 75ms, 151ms, 590ms, 61ms, 50ms, 373ms. The client is a singleton, so I'm not creating new connections on every request. Are these response times realistic, or am I doing something wrong?
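To make runs like this easier to compare, the timing loop can be pulled into a small helper that discards a few warm-up calls before recording. This is only a sketch: `LatencyProbe`, `MeasureAsync`, and the `warmup`/`iterations` parameters are hypothetical names, not part of the SDK, and `Task.Delay` stands in for the real download so that the example runs without credentials.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class LatencyProbe
{
    // Runs the operation `warmup` times without recording anything,
    // then times `iterations` further runs with a Stopwatch.
    public static async Task<List<TimeSpan>> MeasureAsync(
        Func<Task> operation, int warmup, int iterations)
    {
        for (int i = 0; i < warmup; i++)
        {
            await operation();
        }

        var durations = new List<TimeSpan>(iterations);
        for (int i = 0; i < iterations; i++)
        {
            var sw = Stopwatch.StartNew();
            await operation();
            sw.Stop();
            durations.Add(sw.Elapsed);
        }
        return durations;
    }

    public static async Task Main()
    {
        // Stand-in operation (an assumption for illustration only).
        var durations = await MeasureAsync(() => Task.Delay(10), warmup: 2, iterations: 5);
        foreach (var d in durations)
        {
            Console.WriteLine($"Duration: {d.TotalMilliseconds}ms");
        }
    }
}
```

In the real application the stand-in would be replaced by a lambda that creates a fresh `MemoryStream` and awaits `client.DownloadObjectAsync("mybucket", "myfile", ms)`, as in the snippet above.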

To me these numbers seem odd, especially because another of our ASP.NET Core applications uses a Couchbase DB hosted in Google Compute Engine, and that gives 1-2ms response times when retrieving small documents. Is there anything trivial that I might be messing up? (So far I have only measured my application code; I haven't yet pulled the SDK source to dig deeper, which would be the next step.)

I also put together a simple benchmark (using the non-async version of the method): https://github.com/markvincze/google-storage-benchmark. When I run it on my machine, the average download time for a ~2KB file is about 200ms, but individual downloads vary widely, roughly between 50 and 300ms. This was one output:

Duration: 229.0578ms
Duration: 272.8529ms
Duration: 62.7628ms
Duration: 50.5129ms
Duration: 140.367ms
Duration: 226.6008ms
Duration: 254.0628ms
Duration: 247.0639ms
Duration: 246.8278ms
Duration: 278.1196ms
Duration: 242.9307ms
Duration: 253.754ms
Duration: 367.8813ms
Duration: 118.1328ms
Duration: 242.9553ms
Duration: 274.4286ms
Duration: 47.6052ms
Duration: 50.5067ms
Duration: 133.8188ms
Duration: 243.6086ms
Duration: 253.1797ms
Duration: 251.5902ms
Duration: 244.3288ms
Duration: 246.5998ms
Duration: 257.1116ms
Duration: 246.9077ms
Duration: 256.8724ms
Duration: 270.5516ms
Duration: 226.6409ms
Duration: 248.6663ms
Duration: 244.1842ms
Duration: 278.6359ms
Duration: 48.6396ms
Duration: 184.4674ms
Duration: 247.6623ms
Duration: 43.555ms
Duration: 52.9093ms
Duration: 158.3365ms
Duration: 53.8388ms
Duration: 189.9398ms
Duration: 244.7559ms
Duration: 248.0005ms
Duration: 257.8644ms
Duration: 238.9399ms
Duration: 52.2841ms
Duration: 230.7102ms
Duration: 218.8651ms
Duration: 250.1323ms
Duration: 259.864ms
Duration: 248.6463ms
Average duration: 204.770616ms

Are these numbers normal, or is something going wrong?

jskeet commented 7 years ago

@markvincze: What storage class are you using, and which location is your bucket hosted in vs where your GCE instance is?

@Capstan You'd know more about this - do these sound reasonable?

markvincze commented 7 years ago

Hi @jskeet, the storage class of the bucket is Multi-Regional, its location is EU, and the GCE cluster is in the zone europe-west1-c.

markvincze commented 7 years ago

The numbers seem to vary quite a lot though. I reran the benchmark this morning on my machine (same machine, same network), and now I got these numbers:

Duration: 60.5744ms
Duration: 68.5257ms
Duration: 72.0622ms
Duration: 50.0653ms
Duration: 60.1215ms
Duration: 57.1812ms
Duration: 54.1895ms
Duration: 62.3206ms
Duration: 66.1106ms
Duration: 95.0488ms
Duration: 67.6595ms
Duration: 56.8242ms
Duration: 133.693ms
Duration: 320.0855ms
Duration: 58.4811ms
Duration: 114.9024ms
Duration: 56.3721ms
Duration: 57.5864ms
Duration: 136.8131ms
Duration: 43.0115ms
Duration: 72.7557ms
Duration: 135.8108ms
Duration: 68.1348ms
Duration: 61.0542ms
Duration: 119.098ms
Duration: 60.62ms
Duration: 58.8854ms
Duration: 136.636ms
Duration: 59.3339ms
Duration: 65.721ms
Duration: 118.3037ms
Duration: 55.1682ms
Duration: 49.0644ms
Duration: 70.4776ms
Duration: 75.4073ms
Duration: 65.1596ms
Duration: 65.6835ms
Duration: 52.9039ms
Duration: 64.0012ms
Duration: 60.5316ms
Duration: 49.0398ms
Duration: 145.0981ms
Duration: 57.8898ms
Duration: 191.133ms
Duration: 60.5004ms
Duration: 189.7131ms
Duration: 59.3192ms
Duration: 52.5474ms
Duration: 136.8341ms
Duration: 62.8967ms
Average duration: 84.22702ms

(By the way, definitely not to start an argument about providers but just to have a baseline: I created a similar benchmark for the Windows Azure Blob storage (https://github.com/markvincze/azure-blob-benchmark/blob/master/Program.cs), and with that the average download time is between 10 and 15ms, for both private and public blobs.)

jskeet commented 7 years ago

I've assigned @Capstan to this as he's from the Storage team - I would only be guessing, basically.

markvincze commented 7 years ago

Hi @Capstan,

Any insight into this would be appreciated, whether that is a hint or something I could investigate or try. And if the answer is that these numbers are completely normal and expected, that's really valuable information too.

Thanks, Mark

Capstan commented 7 years ago

@markvincze , has nobody from my team reached out to you yet? :( I will prod again.

markvincze commented 7 years ago

Hi @Capstan,

I don't think so, or at least I haven't seen any emails/messages.

Cheers, Mark

Capstan commented 7 years ago

They got back to me, so I'm relaying this:

[based on the sample provided] p50: 62ms, p90: 136ms, p99: 191ms or 320ms, depending on how you look at it. (Basically the 49th value is 191ms, the 50th is 320ms.)
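For readers who want to reproduce figures like these from the second benchmark run, here is a minimal sketch using the nearest-rank definition of a percentile. That choice is an assumption: the thread does not say which method the Storage team used, and other definitions interpolate between neighbours. `LatencyPercentiles` and `NearestRank` are hypothetical names; the sample array is the 50 durations posted above.

```csharp
using System;
using System.Linq;

public static class LatencyPercentiles
{
    // The 50 durations (ms) from the second benchmark run in this thread.
    public static readonly double[] SecondRunMs =
    {
        60.5744, 68.5257, 72.0622, 50.0653, 60.1215, 57.1812, 54.1895, 62.3206, 66.1106, 95.0488,
        67.6595, 56.8242, 133.693, 320.0855, 58.4811, 114.9024, 56.3721, 57.5864, 136.8131, 43.0115,
        72.7557, 135.8108, 68.1348, 61.0542, 119.098, 60.62, 58.8854, 136.636, 59.3339, 65.721,
        118.3037, 55.1682, 49.0644, 70.4776, 75.4073, 65.1596, 65.6835, 52.9039, 64.0012, 60.5316,
        49.0398, 145.0981, 57.8898, 191.133, 60.5004, 189.7131, 59.3192, 52.5474, 136.8341, 62.8967
    };

    // Nearest-rank percentile: sort ascending and take the value at
    // 1-based rank ceil(percentile * n / 100).
    public static double NearestRank(double[] samples, double percentile)
    {
        if (samples == null || samples.Length == 0)
            throw new ArgumentException("At least one sample is required.", nameof(samples));
        if (percentile <= 0 || percentile > 100)
            throw new ArgumentOutOfRangeException(nameof(percentile));

        double[] sorted = samples.OrderBy(x => x).ToArray();
        int rank = (int)Math.Ceiling(percentile * sorted.Length / 100.0);
        return sorted[Math.Max(rank, 1) - 1];
    }

    public static void Main()
    {
        foreach (double p in new[] { 50.0, 90.0, 99.0 })
        {
            Console.WriteLine($"p{p}: {NearestRank(SecondRunMs, p):F1} ms");
        }
    }
}
```

With only 50 samples, nearest-rank places p99 on the 50th (largest) sorted value, which is why the single 320ms outlier decides whether p99 reads as 191ms or 320ms.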

[This is] in line with what we expect currently. If the user switches to a regional bucket and increases his/her request rate, we'd expect the latency to drop a bit. If the user is able to mark the object as publicly cacheable, we'd expect this to drop to < 20 ms.
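As a rough illustration of the "publicly cacheable" suggestion, the sketch below sets a `Cache-Control` header on the object and grants public read access through Google.Cloud.Storage.V1. Several things here are assumptions rather than anything stated in the thread: that "publicly cacheable" means public read plus a public `Cache-Control` value, the one-hour `max-age`, and the placeholder bucket and object names. It needs the Google.Cloud.Storage.V1 package and application default credentials, and it makes the object readable by anyone, so it only fits data that is safe to expose.

```csharp
using System.Collections.Generic;
using Google.Apis.Storage.v1.Data;
using Google.Cloud.Storage.V1;

public static class MakePubliclyCacheable
{
    public static void Main()
    {
        // Placeholder names reused from the snippet earlier in the thread.
        const string bucketName = "mybucket";
        const string objectName = "myfile";

        StorageClient client = StorageClient.Create();

        // Fetch the current metadata so the update does not drop other fields.
        var obj = client.GetObject(bucketName, objectName);

        // Assumed value: lets caches keep the object for one hour.
        obj.CacheControl = "public, max-age=3600";

        // Send an explicit (possibly empty) ACL list along with the predefined ACL.
        obj.Acl = obj.Acl ?? new List<ObjectAccessControl>();

        client.UpdateObject(obj, new UpdateObjectOptions
        {
            PredefinedAcl = PredefinedObjectAcl.PublicRead
        });
    }
}
```

Whether this actually brings latency under the quoted 20 ms would still have to be measured with the same benchmark.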

markvincze commented 7 years ago

Hi @Capstan,

Thanks for investigating this, I appreciate it!

Cheers, Mark