Elfocrash / Cosmonaut

🌐 A supercharged Azure CosmosDB .NET SDK with ORM support
https://cosmonaut.readthedocs.io
MIT License
342 stars 44 forks source link

Cost of FirstOrDefaultAsync #44

Closed rhalaly closed 5 years ago

rhalaly commented 5 years ago

I think that your implementation of FirstOrDefaultAsync is not cost-optimal.

In this line: https://github.com/Elfocrash/Cosmonaut/blob/33b7e6625150019d892d81a0b5c5f7d6981bcad6/src/Cosmonaut/Extensions/CosmosResultExtensions.cs#L51

You are calling to GetSingleOrFirstFromQueryable. Without setting RequestContinuation there is a call to GetResultsFromQueryToList which performs the query and store the results in a list. Even that stopOnAny is set to true, I suspect that you are getting from Cosmos more than 1 entry as a result. You are using the FirstOrDefault after the query. Isn't it inefficient?

Don't you need to set FeedOptions.MaxItemCount = 1 in that case, in order to force Cosmos to return only 1 result and to reduce the amount of the transferred data (which also will reduce the cost)?

Elfocrash commented 5 years ago

Hey @rhalaly,

Great observation.

By setting the MaxItemCount in the background for First and Single type operations to 1 we can save both cost and performance as we transfer less data over the wire. The reason why it's not that inefficient is because it will stop on any results so the trips over the wire are minimal. The payload on the other hand could be limited to a single entry.

Methods that can be improved by this are:

The only concern is that at this point FeedOptions are already part of the internal DocumentQueryable which means that if I want to set the MaxItemCount I would have to do it though reflection.

For that type of operations Cosmonaut uses an in internal type cache so it shouldn't be much of a problem performance wise.

I will work on this thanks for reporting it.

rhalaly commented 5 years ago

Thanks for your quick response!

Why do you have to use Reflection? Can't you just use queryable.GetFeedOptionsForQueryable() in GetResultsFromQueryToList in order to get the FeedOptions instance?

I think that replacing the stopOnAny with something like forcedMaxItemCount (nullable) that force replace the MaxItemCount in the FeedOptions instance (if it set) will do the trick...

Elfocrash commented 5 years ago

GetFeedOptionsForQueryable is using reflection to get the feed options instance for the queryable and SetFeedOptionsForQueryable is doing the same. It's impossible to manipulate otherwise because it's internal the the fields are private.

stopOnAny is still needed because MaxItemCount doesn't represent the max items that the whole operation will return but this specific ExecuteNextAsync. This means that if you have 100 items in the collection and you set MaxItemCount to 1 you will get all 100 items in pages of 1 items each.

rhalaly commented 5 years ago

Ho, true... So yes, stopOnAny is still needed. :)

Elfocrash commented 5 years ago

Issue fixed in 6b7a22504f06ae8d95fadf8a804dd2eb9ebec83a