Azure / azure-cosmos-dotnet-v2

Contains samples and utilities relating to the Azure Cosmos DB .NET SDK
MIT License

Reading non-existent documents while debugging is very slow #583

Open Liversage opened 6 years ago

Liversage commented 6 years ago

Trying to read a document from Cosmos DB that doesn't exist is very slow when executing in the Visual Studio debugger. In fact, it is so slow that the project I'm working on can have startup times of several minutes when I debug, which severely impacts my productivity.

I believe that the cause of the issue is that trying to read a document that doesn't exist will throw a DocumentClientException with status HttpStatusCode.NotFound. In itself that shouldn't be a problem, but it seems that the SDK will catch and rethrow this exception through many layers of abstraction, and this cripples the performance when debugging in Visual Studio. At least, that is my best guess.

To give some perspective on why I have this issue: I'm building an actor-based system where the state of the actors is backed by Cosmos DB. The performance guarantees of Cosmos DB work really well in this scenario. When the system starts, it receives data from an external API and builds an actor model of several hundred actors. Receiving this data takes a few seconds, but the actors have to try to read previous state from Cosmos DB and then update and save the new state back to Cosmos DB before the system is ready.

In production, the startup speed of the system is fast. However, when debugging against an empty Cosmos DB database, starting the service becomes extremely slow because all the initial reads will fail.

To demonstrate the issue, I have created a small test program that simulates an actor-based system where each actor will try to read state and then upsert new state. On the first pass the read will fail, while on the second pass the documents already exist and the read doesn't fail. The test uses the emulator and reports the time taken to read and upsert the documents. The test application uses Microsoft.Azure.DocumentDB.Core 1.9.1 and processes 500 documents.

The Visual Studio debugger has an Enable Just My Code setting, and turning this setting off (which is a legitimate thing to do) will lower the debugging performance from really bad to disastrous. One effect of turning this setting off is that every time an exception is rethrown it is logged to the output window, and you can see how each failed read produces vastly more output. You can configure the output window to not show exceptions, but unfortunately this only improves performance slightly.

This issue is about the performance of using the SDK in the Visual Studio debugger. It is not about the performance of Cosmos DB. Thus, all my "benchmarks" are executed using a debug build on my pretty beefy i7-6700K desktop computer. IntelliTrace was turned off for all tests. The throughput of the document collection is set to the maximum value of 10000 to ensure that throttling doesn't interfere with the benchmark.

| Debugger | Just My Code | 1st pass (read / upsert) | 2nd pass (read / upsert) |
| --- | --- | --- | --- |
| N | N/A | 1s / 2s | 1s / 2s |
| Y | Y | 22s / 3s | 1s / 2s |
| Y | N | 5m 16s / 3s | 1s / 2s |

The two durations listed in the 1st pass and 2nd pass columns are the times to read and upsert the documents.

The first line in the table is when doing the benchmark outside the debugger (it is still a debug build that executes). The performance is the same for both passes and is fine.

The second line in the table is doing the benchmark inside the debugger with Enable Just My Code turned on. The performance of the first pass is already really bad, going from 1s to 22s for reading 500 non-existent documents.

The third line is with Enable Just My Code turned off, and now it takes more than five minutes to read 500 non-existent documents! This is less than 100 documents per minute.
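For anyone who wants to check the arithmetic behind that figure, here is a minimal sketch in Python. It uses only the first-pass read times from the table and the 500-document count stated earlier; the helper name `docs_per_minute` and the row labels are made up for illustration. The times are rounded to whole seconds in the table, so the derived numbers are approximate.

```python
# First-pass read times from the benchmark table, in seconds.
# The labels and the helper name are illustrative, not part of the test program.
DOCUMENTS = 500

FIRST_PASS_READ_SECONDS = {
    "no debugger": 1,
    "debugger, Just My Code on": 22,
    "debugger, Just My Code off": 5 * 60 + 16,  # 5m 16s
}


def docs_per_minute(read_seconds: float, documents: int = DOCUMENTS) -> float:
    """Read throughput in documents per minute for a given total read time."""
    return documents * 60 / read_seconds


baseline = FIRST_PASS_READ_SECONDS["no debugger"]
for label, seconds in FIRST_PASS_READ_SECONDS.items():
    rate = docs_per_minute(seconds)
    slowdown = seconds / baseline
    print(f"{label}: {rate:.0f} docs/min, {slowdown:.0f}x the baseline read time")
```

With Just My Code turned off, 500 documents in 316 seconds works out to roughly 95 documents per minute, which matches the "less than 100" estimate.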

While the "production" performance of the SDK is very important, I think you should also consider the debug performance. Developer time is a limited resource, and I find that I'm wasting minutes that accumulate to hours and days over time waiting for the SDK to report that a document wasn't found while debugging.

If you want to reproduce my findings, I have created a GitHub gist with my test application.

christopheranderson commented 6 years ago

I'm assigning this to @kirankumarkolli to provide any further context.

This will be resolved when we release the next major version of the .NET SDK, which completely overhauls the exception pipeline and realizes some massive perf improvements because of it.

I'll leave this open until that version is released. @kirankumarkolli can then close it.