Azure / azure-webjobs-sdk

Azure WebJobs SDK
MIT License

Support triggering a single function on multiple queues (wildcard queue name) #808

Open DarranShepherd opened 8 years ago

DarranShepherd commented 8 years ago

Example scenario: A webjob that monitors all *-poison queues in a storage account, with a single generic action taken on all (e.g. write queue message to blob for inspection, email/Slack/SMS to ops team to alert of failure).

I've looked into implementing this as an extension, but rapidly ended up with 90% of the code being duplication of the core SDK due to much of the infrastructure behind QueueTriggerAttribute being marked as internal.

Would you be willing to consider a pull request that creates a MultipleQueueTriggerAttribute that accepts a regex to match against queue names, creates a listener monitoring all matching queues and adds a queueName binding in addition to the existing bindings?

public static void HandlePoisonQueues(
    [MultipleQueueTriggerAttribute(".*-poison")] string message,
    string queueName,
    [Blob("poison/{queueName}/{id}.txt")] TextWriter blob,
    TextWriter logger)
{
    logger.WriteLine($"Poison message received on {queueName}");
    blob.Write(message);
}

CarlosSardo commented 7 years ago

Another interesting and useful scenario:

To overcome the Azure Storage Queue scalability limits more easily.

A single queue can process approximately 2,000 messages (1KB each) per second (each AddMessage, GetMessage, and DeleteMessage count as a message here). If this is insufficient for your application, you should use multiple queues and spread the messages across them.

Defining a Regex on the queue name: [MultipleQueueTriggerAttribute("orders[0-9]")] string message

Passing several queue names: [MultipleQueueTriggerAttribute("orders1", "orders2", "orders3")] string message
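
A minimal sketch of how a pattern such as "orders[0-9]" could be applied to a list of queue names. `SelectQueues` is a hypothetical helper, not part of the WebJobs SDK, and the queue names are invented for illustration. It assumes the pattern should match the whole queue name, so it anchors the pattern; otherwise "orders[0-9]" would also pick up names like "orders10" or "orders1-poison".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: pick the queue names a pattern-based trigger would listen on.
// The pattern is wrapped in ^(?:...)$ so that it has to match the entire name.
static List<string> SelectQueues(IEnumerable<string> queueNames, string pattern)
{
    var regex = new Regex("^(?:" + pattern + ")$", RegexOptions.CultureInvariant);
    return queueNames.Where(name => regex.IsMatch(name)).ToList();
}

// Invented queue names, standing in for whatever the storage account contains.
var names = new[] { "orders1", "orders2", "orders10", "orders1-poison", "invoices" };

Console.WriteLine(string.Join(", ", SelectQueues(names, "orders[0-9]"))); // orders1, orders2
Console.WriteLine(string.Join(", ", SelectQueues(names, ".*-poison")));   // orders1-poison
```

Whether a real attribute would anchor the pattern like this is a design choice the proposal leaves open.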

ivorwan commented 5 years ago

> Another interesting and useful scenario:
>
> To overcome the Azure Storage Queue scalability limits more easily.
>
> A single queue can process approximately 2,000 messages (1KB each) per second (each AddMessage, GetMessage, and DeleteMessage count as a message here). If this is insufficient for your application, you should use multiple queues and spread the messages across them.
>
> Defining a Regex on the queue name: [MultipleQueueTriggerAttribute("orders[0-9]")] string message
>
> Passing several queue names: [MultipleQueueTriggerAttribute("orders1", "orders2", "orders3")] string message

Personally, I would prefer a regex or wildcard binding. That way, in order to scale, we would simply create a new azure queue, without having to update the function

oluatte commented 5 years ago

Hey folks,

Thanks for all your hard work.

Is there any progress on this request or is it being considered at all? We're running into both scenarios above (a lot of poison queues to handle in a common way & approaching queue throughput limits) and this seems like a nice solution.

Thanks

alexgman commented 5 years ago

I need a function to be triggered by multiple queues. Any progress here?

cocowalla commented 4 years ago

Has anyone from Microsoft had a look at this, and is it on the backlog?

For massive scale and better tenancy isolation, 1 queue per tenant would be ideal - but that would also require one function binding per queue too, which is far from ideal (and likely to hit some scaling limitation, which are rife throughout all Azure services)

VictorGazzinelli commented 4 years ago

Any updates? Has anyone found a workaround for this?

onurleblebici commented 3 years ago

After five years, any progress or update? At least a workaround?

vachillo commented 3 years ago

Also requesting a new feature/workaround for this situation

alexgman commented 3 years ago

We will take your request into consideration.

dotnetnate commented 3 years ago

Bueller? At 5 years and ticking, this would be nice to have.

J0nKn1ght commented 2 years ago

It would also be good to have the list of queue names config-driven.

In our case, we have a function that processes analytics events (i.e. the queue payload may differ, but is always sent to the same http endpoint), and have segregated the events into different queues based on message frequency, so we can scale out high-frequency queues.

In production, we want to use separate Azure Function hosts for each queue, but on test servers, we can easily process all queues using the same host function.

If we could set a queue trigger to process the specified config entry (e.g. '%queue%') as we currently do, but that entry could be a delimited list of queue names instead of a single one, that would be ideal for us. (It would also require no code changes - only config changes.)

I guess if this could also incorporate support for wildcards in queue names, you'd have the best of both worlds.
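
A minimal sketch of the config-driven idea, assuming the resolved setting value is a comma- or semicolon-delimited string. `ParseQueueNames` and the sample value are hypothetical; the SDK does not currently expand a setting into several queues.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: split a delimited setting value into distinct queue names,
// keeping the order in which they were listed.
static List<string> ParseQueueNames(string settingValue)
{
    var seen = new HashSet<string>(StringComparer.Ordinal);
    var result = new List<string>();
    var parts = (settingValue ?? string.Empty)
        .Split(new[] { ',', ';' }, StringSplitOptions.RemoveEmptyEntries);

    foreach (var part in parts)
    {
        var name = part.Trim();
        // Skip blanks and repeated names so each queue gets a single listener.
        if (name.Length > 0 && seen.Add(name))
        {
            result.Add(name);
        }
    }
    return result;
}

// Invented setting value for illustration.
Console.WriteLine(string.Join("|", ParseQueueNames("events-high, events-low;events-high")));
// events-high|events-low
```

With something like this, a test server could list every queue in one setting while production keeps one queue per host, with no code change between the two.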

chitter99 commented 1 year ago

Would be kinda cool if we can have this :)

alexgman commented 1 year ago

I think if you need this it's an indication of poor design

Simon-Gregory-LG commented 1 year ago

Fantastic, it seems we've stumbled across a pro design expert!

Maybe you could bless us with some of your wisdom on how best to tackle a couple of example scenarios:

Priority Queue Triggers

Say you have n queues that represent different 'priorities' and you want to process them in some predetermined priority order using a scalable Azure Function. Techniques could be either to exhaust the queues in order of their priority, or even to pop each batch guided by some target proportional ratio (e.g. 3 queues High, Medium, Low you want to trigger in batches of 10 in the ratio 6:3:1).

To achieve this in Azure Functions you could use a TimerTrigger and roll your own logic. However, timer triggers are limited to running on one instance only, so they are not scalable, and they would require additional custom logic for things like exponential backoff outside of peak times, which QueueTriggers do by default. If MultipleQueueTriggers existed, I imagine they would have some kind of ordering logic to pop queues using some predetermined processing strategy, which seems to me a very natural way to solve this problem out of the box.
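
A minimal sketch of the ratio technique described above, assuming integer weights (6:3:1), a fixed batch size, and a known count of available messages per queue. `PlanBatch` is a hypothetical planning step, not an SDK feature: it only decides how many messages to take from each queue, and unused slots fall through to the next queue in priority order.

```csharp
using System;
using System.Linq;

// Hypothetical planner: queues are listed highest priority first.
// Assumes every weight is positive so the total is never zero.
static int[] PlanBatch(int[] weights, int[] available, int batchSize)
{
    int totalWeight = weights.Sum();
    var take = new int[weights.Length];
    int remaining = batchSize;

    // Pass 1: give each queue its proportional share, capped by what it holds.
    for (int i = 0; i < weights.Length; i++)
    {
        int share = batchSize * weights[i] / totalWeight;
        take[i] = Math.Min(share, available[i]);
        remaining -= take[i];
    }

    // Pass 2: hand unused slots to queues that still have messages, highest priority first.
    for (int i = 0; i < weights.Length && remaining > 0; i++)
    {
        int extra = Math.Min(remaining, available[i] - take[i]);
        take[i] += extra;
        remaining -= extra;
    }
    return take;
}

// All queues are busy: the batch follows the 6:3:1 ratio.
Console.WriteLine(string.Join(":", PlanBatch(new[] { 6, 3, 1 }, new[] { 100, 100, 100 }, 10))); // 6:3:1

// The High queue holds only 2 messages: its spare slots go to Medium.
Console.WriteLine(string.Join(":", PlanBatch(new[] { 6, 3, 1 }, new[] { 2, 100, 100 }, 10)));   // 2:7:1
```

Exhausting the queues strictly in priority order is the special case where only the second pass runs.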

But maybe I'm missing something, could it be that:

  • Priority Queues are an antipattern and should never be used?
  • Systems shouldn't have load / scaling requirements beyond what can be done with a single timer trigger?
  • Azure Queues and/or Service Bus already possess little known intrinsic queue prioritisation functionality?
    • AFAIK they don't. I know Service Bus has additional functionality with sessions etc, but you still can't implement the ratio exhaustion method with that.
  • We shouldn't be using Azure Functions to process queues?
  • Something else?

Bulk Handling Poison Queues

This is effectively the use-case as described in the original question. You have n queues and you want to process any of their poison messages in the same manner using an Azure Function. Should we write n functions to handle each, or deploy n function AppServices each hosting their own function app configured for each queue?

I have some sympathy that wildcards might be an antipattern / open to abuse, but surely a MultipleQueueTrigger would be a natural solution for this, even if the queue names are defined / updated in the configuration.

Or maybe:

  • We should all be using just one queue for everything?
  • We should ensure that queue messages can never end up on poison ever, for any reason?
  • We should be using the non-scalable TimerTrigger mechanism and roll our own?
  • Each queue function should be developed with a partner poison method in every app (even if they are all to do the same thing), because mass code duplication isn't the antipattern that everybody thinks it is?
  • Something else?

Thanks for your insight and valuable contribution.

alexgman commented 1 year ago

Gut shabbos!

dotnetnate commented 9 months ago

What's one more year and ticking? I know I enjoy sprawling deployments and infrastructure and would hate to have anything nice or useful that could actually address this.

SeanFeldman commented 9 months ago

If multiple queues need to be processed by a single processor (Azure Functions in this case), look at Azure Service Bus and the Auto-forwarding feature. The feature was designed years ago specifically for this use case. Scenarios such as fan-in and centralized dead-letter queue are trivial to implement and do not require anything from the processor.

dotnetnate commented 8 months ago

Which is great if you want to take on that complexity and you aren't subject to constraints that autoforwarding won't work with. That's ignoring the purely philosophical issue of your infrastructure having to be altered and managed because of a product deficiency that could be rather easily addressed.

Simon-Gregory-LG commented 8 months ago

> Which is great if you want to take on that complexity and you aren't subject to constraints that autoforwarding won't work with. That's ignoring the purely philosophical issue of your infrastructure having to be altered and managed because of a product deficiency that could be rather easily addressed.

Very good points, it also doesn't solve Priority Queues (at least it didn't the last time I checked).

SeanFeldman commented 8 months ago

> That's ignoring the purely philosophical issue of your infrastructure having to be altered and managed because of a product deficiency that could be rather easily addressed.

Please share details about the easy-to-address part.

SeanFeldman commented 8 months ago

@Simon-Gregory-LG, Given that ASB doesn't support priority queues, I'm not sure how you expect Functions to come up with that feature without inventing something it's not designed to do.

Simon-Gregory-LG commented 8 months ago

> @Simon-Gregory-LG, Given that ASB doesn't support priority queues, I'm not sure how you expect Functions to come up with that feature without inventing something it's not designed to do.

The problem doesn't need to be solved by the queue infrastructure technology itself though. The issue is essentially being able to check and exhaust the Queues in a predetermined order in a scalable way.

If you have a MultiQueueTrigger, it's going to need to poll each queue in some predetermined order anyway, which as a consequence would allow you to solve this problem too. Same principle could be used for a 'multi' ServiceBusTrigger.

SeanFeldman commented 8 months ago

ASB Functions trigger (queue or subscription) is based on the Processor from the SDK. It works with long polling of a single entity. Assuming the team would abandon that implementation and implement their message pump, it would need N receivers. If someone has many queues that match the pattern for the suggested MultipleQueueTriggerAttribute, it will end up with a type of code that will be brutal to maintain and misaligned with the Functions modular architecture IMO. But I'm not on the Functions team and cannot talk on their behalf.

Simon-Gregory-LG commented 8 months ago

> ASB Functions trigger (queue or subscription) is based on the Processor from the SDK. It works with long polling of a single entity. Assuming the team would abandon that implementation and implement their message pump, it would need N receivers. If someone has many queues that match the pattern for the suggested MultipleQueueTriggerAttribute, it will end up with a type of code that will be brutal to maintain and misaligned with the Functions modular architecture IMO. But I'm not on the Functions team and cannot talk on their behalf.

Oh yes I agree that the wildcard request would be a nightmare and difficult to implement. I think a Multi trigger where the Queues are defined upfront is what would have value. Basically similar to how it is now, but not limited to one predefined queue and it polls those Queues in a loop (or whatever the strategy is).

Btw does the queue trigger actually use long polling? I always thought it's just a timer where the interval increases exponentially to some limit while the queue is empty.

Interestingly there's an Azure article on priority queue strategies https://learn.microsoft.com/en-us/azure/architecture/patterns/priority-queue

It mentions a single pool strategy, but I'm not aware of any out of the box azure / functions technology to achieve that.

Multi pool is possible if you're happy to duplicate lots of code and/or infrastructure, but that can put pressure on downstream services and doesn't let you synchronize message sessions / partitions in the way that single pool would.

SeanFeldman commented 8 months ago

> Btw does the queue trigger actually use long polling?

Yes, it does.

Simon-Gregory-LG commented 8 months ago

> > Btw does the queue trigger actually use long polling?
>
> Yes, it does.

Are you talking about service bus or azure storage Queues?

Looking at the source, the Azure Storage Queues QueueTrigger looks like a standard timer.

https://github.com/Azure/azure-webjobs-sdk/blob/master/src/Microsoft.Azure.WebJobs.Extensions.Storage/Queues/Listeners/QueueListener.cs

There's always a processing delay of up to MaxPollingInterval if the Storage Queue has been empty for a while; you'd expect long polling to be instant in that case.
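
A minimal sketch of the timer-style polling described here, assuming the wait doubles after each empty poll and is capped at a maximum interval. `NextDelay` and the interval values are hypothetical and only illustrate the behaviour; this is not the listener's actual implementation.

```csharp
using System;

// Hypothetical delay calculation: double the wait after each empty poll, capped at max.
static TimeSpan NextDelay(int consecutiveEmptyPolls, TimeSpan min, TimeSpan max)
{
    // A message was just found, so poll again after the minimum interval.
    if (consecutiveEmptyPolls <= 0) return min;

    // Clamp the exponent so the multiplication cannot overflow.
    double factor = Math.Pow(2, Math.Min(consecutiveEmptyPolls, 30));
    double ms = Math.Min(min.TotalMilliseconds * factor, max.TotalMilliseconds);
    return TimeSpan.FromMilliseconds(ms);
}

// Assumed values for illustration only.
var minInterval = TimeSpan.FromMilliseconds(100);
var maxInterval = TimeSpan.FromMinutes(1);

for (int empty = 0; empty <= 10; empty += 2)
{
    var delay = NextDelay(empty, minInterval, maxInterval);
    Console.WriteLine($"{empty} empty polls -> wait {delay.TotalMilliseconds} ms");
}
```

Under these assumptions, a message that arrives on a queue that has been idle for a while can wait up to the full maximum interval before it is picked up, which matches the delay described above.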

This may be the difference between ASB and Storage Queues.

One other idea I had is that you have a GroupName property and a Priority property on the standard QueueTrigger attribute, that way it may be possible to achieve the result by passing an array of Queues for each group to the above listener - this would preserve the modular convention it currently has. This is off-topic from the MultiQueueTrigger idea, but might allow the capability for priority queue processing with azure storage Queues at least. Sounds like ASB would need something more sophisticated tho.

SeanFeldman commented 8 months ago

> Are you talking about service bus or azure storage Queues?

I have been explicitly talking about Azure Service Bus (ASB). Storage Queues are handy but their share in the wild is significantly lower to invest so much effort.

Simon-Gregory-LG commented 8 months ago

> > Are you talking about service bus or azure storage Queues?
>
> I have been explicitly talking about Azure Service Bus (ASB). Storage Queues are handy but their share in the wild is significantly lower to invest so much effort.

That explains it, although I think this thread is specifically referring to the Azure Storage QueueTrigger. Sounds like that technology is dead then.

SeanFeldman commented 8 months ago

The original request is from 2016. Even back then, ASQ was already in the rearview mirror. Some systems still use it. It's a very simple queuing service to start with, but it does not feature the reach and practicality of ASB.