Open mathewc opened 8 years ago
We identified 2 canonical scenarios for why this is important:
Error handling - Are events sent if there's an error in the function ?
void Error([Queue] IAsyncCollector<Payload> q)
{
q.AddAsync(…);
throw Exception(); // Is message queued?
}
Ordering - important when an event refers to another output. IE, we write a blob, and then enqueue a message that refers to that blob.
Work([Queue] IAsyncCollector<Payload> q, [Blob] TextWriter output)
{
Output.WriteLine(…);
var msg = … // refers to output blob!
q.AddAsync(msg);
}
What’s even more sinister is that this can lead to non-determinisim. (I believe solving these non-determinism issues is also a huge-value add for WebJobs/Functions). There’s a race, maybe the blob is written first during testing; and the order is different in production.
This is easy when if we only send a small number of rows via IAsyncCollector because we could just buffer in -memory and force an ordering. But it’s prohibitive if we send huge payloads.
Putting in 2.0.0-release since it's a breaking change
FWIW, I noticed the same issue with queues and tables, where messages would be added to the queue directly but rows in the table only after the function completed. As a workaround I switched to adding rows to tables manually rather than using outputs but it would definitely be better if we could use outputs for this kind of scenarios.
A proposed solution is:
@waldekmastykarz - would that solve your situation?
In my example I used ICollector
@waldekmastykarz -- To have table items added immediately, can you call FlushAsync()
immediately after the call to AddAsync()
? That doesn't really solve the overall inconsistencies we have, but it may get your scenario working.
In my Function I've been already using ICollector<>.Add() with both the queue and the table. The difference in the behavior I've seen is related to using ICollector<> not IAsyncCollector.
ICollector & IAsyncCollector should behave identically here w.r.t. flushing semantics. If they're not, that's going to be very confusing and we should fix it.
Hi, since this is quite old, but I am wondering if ServiceBus IAsyncCollector will support batching still in Functions V1?
for the ICollector
inconsistency between queue and table : https://github.com/Azure/azure-webjobs-sdk/issues/921#issuecomment-267250116
Can we change the implementation of the SyncAsyncAdaptor
:
public void Add(T item)
{
_inner.AddAsync(item).GetAwaiter().GetResult();
_inner.FlushAsync().GetAwaiter().GetResult(); // flush after each add
}
this will make the following statement true, with/without batching/buffering
@mathewc ; @alrod is working on batch support for Service Bus, so it would be good to sync up on this to make sure we're going dow the right path. This will be on a new major version of the binding, so we want to follow whatever we think is the best practice for this.
Hello Everyone,
We have been using EventHub out binding to send event to another eventhub. Basically we are calling FlushAsync() in our azure function.
I am trying to understand how i may write the retry logic in our side when FlushAsync() failed for some reason(it happened once in last 1 year due to timeout issue). I am asking because I am not sure how it works in background. Also having little document about this.
Just wondering, whether just calling the FlushAsync() after some time or need to first add the events again to AddAsync() and call FlushAsync().
I hope i'd expect to combine with AddAsync() and FlushAsync(). Please confirm.
any help on this one would be much appreciated.
thanks!
@alrod Have changes been made to IAsyncCollector to support batching in Service Bus? I'm still seeing individual calls to ServiceBus when adding an item with AddAsync.
@lukasvosyka thanks for the link, I can see that batching is available for receiving messages from Service Bus. My question pertains to sending messages in batches to Service Bus. In my testing on 4.1.1 of the Microsoft.Azure.WebJobs.Extensions.ServiceBus nuget package, I'm still seeing individual calls to Service Bus when adding a message with AddAsync to an IAsyncCollector. It does not appear that batching is working for sending, unless there is a configuration option I am missing? I noticed that there is still a request out to document host.json for configuration.
@jdiekhoff Yes, I realized that afterwards :-) That's why I removed my reply then .. sorry for confusion.
After some digging, it seems that the underlying Service Bus client SDK doesn't support batching, at least not for the .Net Standard variant. https://docs.microsoft.com/en-us/azure/service-bus-messaging/service-bus-performance-improvements?tabs=net-framework-sdk#client-side-batching
@lukasvosyka Ah, ok. Thanks for clearing that up! Would still love to see batch operations for a servicebus output.
@lukasvosyka Despite what those docs say, the newest SDK does support batching now for Service Bus. See here: https://github.com/Azure/azure-sdk-for-net/blob/cb05f2e989d3cd3658084ae725f90041476bb8a7/sdk/servicebus/Microsoft.Azure.ServiceBus/src/QueueClient.cs#L346
I'd really like to see this added to the ICollector and IAsyncCollector interfaces as it would dramatically speed up some of my functions that read in a CSV file and output messages. Are @mathewc and @alrod still working on this?
I've been using IAsyncCollector for Azure queues for a while ... and recently I've started seeing, very very occasionally, that the queue message never seems to actually make it to the queue. At least, the consumer of the queue never receives it. Since there's no possible way for an error to surface, it's hard to tell what's happening. FWIW, the AddAsync call is pretty much the last thing in the function.
Think I'm going to try switching the call to ICollector
Precictably, of course, I don't have a repro for this :-(
@AdrianPell are you seeing that with no changes on your side? If so, it might be worth sharing more details in a new issue so we can take a closer look here. Any errors, in addition to details on how you're hosting/using the feature (e.g. Functions, standalone WebJobs, hosting plans/model, etc.) would be helpful.
What is the official MS recommendation for service bus.. Using the Asynch Collector or using SDK client?
See http://stackoverflow.com/questions/40732715/azure-functions-icollectort-vs-iasynccollectort
We're inconsistent across bindings and extensions in our IAsyncCollector implementations, and whether they batch or add immediately. For example:
Generally, I think bindings should batch if the underlying service supports efficient batch operations. For example, Azure Table batch operations (our table binding actually currently does NOT take advantage of this). However, a lot of our other bindings seem to be batching somewhat arbitrarily - e.g. SendGrid/Twilio. My initial thought for doing batching for SendGrid was to support scenarios where the messages are only sent on success. However, users could control that themselves in their method if they wish. We might want to revisit this.