Azure / azure-webjobs-sdk-extensions

Azure WebJobs SDK Extensions
MIT License

Cosmos DB Output binding shall support Bulk Execution mode for high throughput scenarios #820

Open petr-hollay opened 1 year ago

petr-hollay commented 1 year ago

The binding should support Bulk Execution mode by exposing the following property:
https://docs.microsoft.com/cs-cz/dotnet/api/microsoft.azure.cosmos.cosmosclientoptions.allowbulkexecution?view=azure-dotnet

Expected behavior

Bulk Execution can be set in the binding attributes for compiled C# Functions.

Actual behavior

Bulk Execution mode cannot be set using the existing output binding.

Known workarounds

Bypass the standard output binding and use the Cosmos DB SDK directly.
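
A minimal sketch of that workaround, assuming the Microsoft.Azure.Cosmos (v3) NuGet package. The `ToDoItem` shape, the `/id` partition key, and the `CosmosBulkWorkaround` helper are illustrative assumptions; the database and container names are borrowed from the sample later in this thread. The point it tries to show is that bulk mode is enabled on the client options and that the writes are started concurrently rather than awaited one by one.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Hypothetical document type; the container is assumed to be partitioned on /id.
public class ToDoItem
{
    public string id { get; set; }          // lowercase to match the Cosmos DB "id" property
    public string Description { get; set; }
}

public static class CosmosBulkWorkaround
{
    // AllowBulkExecution is the CosmosClientOptions property linked in this issue.
    public static CosmosClientOptions CreateBulkOptions()
    {
        return new CosmosClientOptions { AllowBulkExecution = true };
    }

    public static async Task WriteAllAsync(string connectionString, IEnumerable<ToDoItem> items)
    {
        // In a real Function the client would normally be created once and reused;
        // it is created here only to keep the sketch self-contained.
        using (var client = new CosmosClient(connectionString, CreateBulkOptions()))
        {
            Container container = client.GetContainer("ToDoItems", "Items");

            // Start every write before awaiting any of them, so the SDK has
            // concurrent operations it can group together.
            List<Task> writes = items
                .Select(item => (Task)container.CreateItemAsync(item, new PartitionKey(item.id)))
                .ToList();

            await Task.WhenAll(writes);
        }
    }
}
```

Error handling (throttling, partial failures) is left out; a real implementation would inspect the individual tasks after `Task.WhenAll`.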

Related information

Introducing Bulk support in the .NET SDK - Azure Cosmos DB Blog (microsoft.com)

Bulk support improvements for Azure Cosmos DB .NET SDK - Azure Cosmos DB Blog (microsoft.com)

ealsur commented 1 year ago

Thanks for the suggestion and feedback.

The problem with exposing this as a configuration attribute is that if the user sets it and then calls the save method sequentially inside the function, they are shooting themselves in the foot. For example, with a hypothetical WithBulkMode property:

[FunctionName("WriteDocsIAsyncCollector")]
public static async Task Run(
    [QueueTrigger("todoqueueforwritemulti")] ToDoItem[] toDoItemsIn,
    [CosmosDB(
        databaseName: "ToDoItems",
        containerName: "Items",
        WithBulkMode = true,
        Connection = "CosmosDBConnection")]
        IAsyncCollector<ToDoItem> toDoItemsOut,
    ILogger log)
{
    foreach (ToDoItem toDoItem in toDoItemsIn)
    {
        await toDoItemsOut.AddAsync(toDoItem);
    }
}

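That will generate more latency: each AddAsync call is awaited before the next one starts, so there are never concurrent operations for Bulk mode to group.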

There is no API on IAsyncCollector that can avoid this pitfall at the moment.
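
To make the pitfall concrete, here is a toy simulation. It does not use the Cosmos DB SDK or IAsyncCollector; `ToyBulkWriter`, its 50 ms batching window, and `BulkPitfallDemo` are all made up for illustration, and the real SDK's batching is more involved. It only assumes, as described above, that a bulk-style writer groups operations that are in flight at the same time, and it counts how many simulated batches (round trips) each calling pattern produces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Toy stand-in for a bulk-enabled writer (NOT the Cosmos DB SDK): items added
// while a flush is pending share one batch, i.e. one simulated round trip.
public sealed class ToyBulkWriter
{
    private readonly object _gate = new object();
    private readonly List<TaskCompletionSource<bool>> _pending =
        new List<TaskCompletionSource<bool>>();
    private int _batchesSent;

    public int BatchesSent { get { lock (_gate) { return _batchesSent; } } }

    public Task AddAsync(string item)
    {
        var done = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        bool firstInBatch;
        lock (_gate)
        {
            firstInBatch = _pending.Count == 0;
            _pending.Add(done);
        }
        // The first item of a batch schedules the flush; later items just join it.
        if (firstInBatch) { _ = FlushAfterDelayAsync(); }
        return done.Task;
    }

    private async Task FlushAfterDelayAsync()
    {
        await Task.Delay(50); // arbitrary batching window for the simulation
        List<TaskCompletionSource<bool>> batch;
        lock (_gate)
        {
            batch = new List<TaskCompletionSource<bool>>(_pending);
            _pending.Clear();
            _batchesSent++;
        }
        foreach (var done in batch) { done.SetResult(true); }
    }
}

public static class BulkPitfallDemo
{
    // Mirrors the foreach/await loop above: each item waits out its own batch.
    public static async Task<int> SequentialAsync(IEnumerable<string> items)
    {
        var writer = new ToyBulkWriter();
        foreach (string item in items) { await writer.AddAsync(item); }
        return writer.BatchesSent;
    }

    // Starts every add before awaiting, so the items can share a batch.
    public static async Task<int> ConcurrentAsync(IEnumerable<string> items)
    {
        var writer = new ToyBulkWriter();
        await Task.WhenAll(items.Select(item => writer.AddAsync(item)).ToList());
        return writer.BatchesSent;
    }
}
```

With five items, `SequentialAsync` reports five batches, each paying the full batching delay, while `ConcurrentAsync` typically reports one. In this toy model, turning on a bulk flag without changing the calling pattern makes things slower rather than faster.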

ealsur commented 1 year ago

Another big issue is that Bulk mode in C# depends on customers executing operations concurrently. Because other languages (Python/NodeJS/etc.) use Bundles that run on top of C#, how would this work in those languages? Do all languages support concurrent execution? Isn't NodeJS single-threaded?

praneeth-nimmagadda commented 10 months ago

Is Bulk Execution mode supported in the Cosmos DB output binding? I don't see that option at all. Is there a specific version in which the bulk execution option is available?

ealsur commented 10 months ago

No, it is not supported, as explained in the existing comments.