Azure / Azure-Functions

WebSockets on Azure Functions #738

Open paulbatum opened 6 years ago

paulbatum commented 6 years ago

We've received feedback from customers that Azure Functions should support WebSockets to enable realtime scenarios. See here for some discussion: https://github.com/Azure/azure-functions-host/issues/1139

There are some challenges here because WebSocket is really a stateful protocol (you have a long-lived connection between a given client and a given server) while Azure Functions is designed to be stateless. For example, we will often deprovision a Function App from one VM and start it on a different VM (perhaps in a different scale unit) based on capacity constraints. This is safe for us to do today because of our stateless design, but if we did it while there were WebSocket connections to that VM, we'd be terminating those connections.

There are also open questions from a billing perspective. Today the consumption plan for Azure Functions bills per execution, but if a WebSocket server is listening on an idle connection, this consumes resources but where are the "executions"?

In summary, there are open issues around design, billing, and technical implementation. I think there are reasonable solutions to these issues and we just need to start working on them once this feature is high enough on our list of priorities.

drcrook1 commented 6 years ago

Do we have an eta on this? I have a customer in need of this.

paulbatum commented 6 years ago

No plans or ETA right now. Any details you can share on the customer scenario would be useful in building a case for prioritizing this feature.

mikezks commented 6 years ago

Azure SignalR Service sounds like a good fit for this. @paulbatum, what is your opinion on this?

Edit: Did I get this right?

paulbatum commented 6 years ago

I agree it might be a great complement to Azure Functions. I'm keen to try it out and see how well the two services can work together, just need to find the time :) If you end up giving it a shot please let us know how it works out.

mikezks commented 6 years ago

This scenario describes a possible way to use SignalR Service with a Function App and no need for a full web app. Nevertheless - technically - this does not seem to be the best possible way, because between client and SignalR Service a websocket connection is open, but I still need a normal HTTP API call to trigger the Function App and send my request.

So this would work and at least we have realtime updates for the client without SSE, long polling or interval requests, but of course it would be better to use the already established websocket connection for client requests too.

It would be great to call the SignalR Service API from the client and then the SignalR Service triggers one of the already implemented Function App triggers (HTTP, WebHook, etc.) or a Service Bus Queue.

paulbatum commented 6 years ago

Thanks for sharing that example!

mikeclymer commented 6 years ago

@mikezks and @paulbatum - Not sure if you have seen this, but here's another data point for using the SignalR Service with Functions: Serverless notifications with Azure Cosmos DB + Azure Functions + Azure SignalR - The associated blog post: Serverless real-time notifications in Azure using Azure #CosmosDB

My use case for WebSockets on AF comes from wanting to build GraphQL APIs with AF. I would like to be able to make use of something like graphql-subscriptions.

Aron4u commented 5 years ago

I see that this is an old post, but it comes up when you google the topic "azure function websocket" :) Now with Durable Functions, I wonder if there is any news here? @paulbatum

paulbatum commented 5 years ago

I think the most recent news on this topic is that the SignalR service and binding for functions both went into general availability. This article is a good starting point for learning how to use Azure SignalR Service and Azure Functions together.

I welcome comments on this issue regarding scenarios that are not adequately addressed through the approach discussed above.

wpitallo commented 5 years ago

How would you address scaling well beyond the 1000 client limit with SignalR service? My understanding is that it only supports 1000 connected users. Is that correct?

paulbatum commented 5 years ago

@wpitallo The Azure pricing calculator is a useful source of this type of information. For example, this screenshot shows that you get 1000 concurrent connections per unit and it currently supports up to 100 units (so a max of 100,000 concurrent connections).

[Image: screenshot of the Azure pricing calculator for SignalR Service]

https://azure.microsoft.com/en-us/pricing/calculator/?service=signalr-service
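
The arithmetic behind those figures can be sketched as below. This is only an illustration using the two numbers quoted in the comment above (1000 connections per unit, at most 100 units); the function name `units_needed` is hypothetical, and the sketch assumes any whole number of units can be provisioned, which the thread does not confirm.

```python
# Figures quoted in the comment above; they may have changed since.
CONNECTIONS_PER_UNIT = 1000
MAX_UNITS = 100


def units_needed(concurrent_connections: int) -> int:
    """Return how many units are required for the given number of concurrent connections.

    Assumes any whole number of units up to MAX_UNITS can be provisioned.
    """
    if concurrent_connections < 0:
        raise ValueError("connection count must be non-negative")
    # Integer ceiling division, with a floor of one unit.
    units = max(1, (concurrent_connections + CONNECTIONS_PER_UNIT - 1) // CONNECTIONS_PER_UNIT)
    if units > MAX_UNITS:
        raise ValueError(
            f"{concurrent_connections} connections exceed the quoted maximum of "
            f"{CONNECTIONS_PER_UNIT * MAX_UNITS}"
        )
    return units
```

For instance, 1001 concurrent connections would need 2 units under these assumptions, and anything above 100,000 is rejected.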

ggirard07 commented 5 years ago

Any plan to trigger a function with a websocket based trigger? Webhook and Event Grid are great for running in the cloud, but a nightmare to develop and test when your organization does not allow you to use ngrok...

paulbatum commented 5 years ago

@ggirard07 EventGrid has the ability to deliver events to a storage queue or event hub (see here for an example), and then you could use the appropriate trigger to consume those events. Would work both locally and in the cloud.

wpitallo commented 5 years ago

Awesome, thanks @paulbatum. With regard to the number of units, can this scale elastically, and is there any additional configuration required in order to scale across multiple units? I have been looking for information online but there is not too much detailing how to scale.

paulbatum commented 5 years ago

@wpitallo Sorry I don't know. I am not an expert on the signalr service, I suggest you follow up elsewhere with this question.

wpitallo commented 5 years ago

@paulbatum no problem, thx for the help :)

ggirard07 commented 5 years ago

@paulbatum the aim of using websocket is to get rid of the polling time involved in storage queue, table, and blob monitoring. This is especially true as we look forward to breaking some services into a multitude of functions for a better microservice/serverless architecture (which involves using way more functions than the 2 webjobs we currently have).

paulbatum commented 5 years ago

@ggirard07 got it. In that case I think you're describing an Event Grid feature (support for delivery over websocket) rather than an Azure Functions feature. I do want to point out that Event Hubs is not polling based - the application maintains an AMQP connection with the event hub and receives events in realtime. So you might find that configuring Event Grid to deliver your events to an event hub and having an Azure Functions event hub trigger gives you the latency characteristics you're looking for.

StephenCleary commented 5 years ago

I have a use case that I think will become more common in the next few years: for an IoT-based solution, I'd like to provide a realtime API that doesn't put restrictions on consumers (i.e., consumers can use any language or platform).

Currently, Azure is great for enterprises building their own servers and clients, but we don't have a good offering for exposing realtime APIs as a product like Azure API Management does for REST.

galvesribeiro commented 5 years ago

Hey folks, just dropping my 2c on it...

AWS provides exactly that feature by separating who holds the stateful connection from who processes the packet. The API Gateway holds the stateful connection, since it is a service billed by time, while the processing logic is handled by Lambda, which is basically billed by execution, just like Functions.

So, if we bring it to Azure, that gives us two options:

  1. Have API Management or App Gateway hold the stateful connections, as it is billed by time, and then forward every packet to a function using a specific binding/trigger type.
  2. Have SignalR Service hold the client connections and update it to support "backend" functions, which would process the incoming packets.

Option 2 seems more obvious and simpler, as both the billing model and the technical requirements are already in place. SignalR Core already allows pure websockets to connect to it, as far as I remember, so the Hub protocol is not necessarily a problem. If you want to use it with the Hub protocol, it is even simpler, as you could map a hub method call to a function.
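
The split described here (one component owns the long-lived connections, another processes each message statelessly) can be sketched in a few lines. This is a toy, in-memory illustration only: `ConnectionGateway` and `echo_upper` are hypothetical names, no real AWS or Azure API is used, and a real gateway would hold sockets rather than lists.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A stateless handler: receives the connection id and one complete message,
# returns an optional reply, and keeps nothing between calls.
Handler = Callable[[str, str], Optional[str]]


@dataclass
class ConnectionGateway:
    """Hypothetical stand-in for the component that owns the stateful connections."""

    handler: Handler
    # Per-connection outbound messages; a real gateway would write to a socket.
    outboxes: Dict[str, List[str]] = field(default_factory=dict)
    # Number of handler calls, i.e. what a per-execution bill would count.
    invocations: int = 0

    def connect(self, connection_id: str) -> None:
        self.outboxes[connection_id] = []

    def disconnect(self, connection_id: str) -> None:
        self.outboxes.pop(connection_id, None)

    def receive(self, connection_id: str, message: str) -> None:
        """Forward one incoming message to the stateless handler and queue its reply."""
        if connection_id not in self.outboxes:
            raise KeyError(f"unknown connection: {connection_id}")
        self.invocations += 1
        reply = self.handler(connection_id, message)
        if reply is not None:
            self.outboxes[connection_id].append(reply)


def echo_upper(connection_id: str, message: str) -> Optional[str]:
    """Example handler playing the role of the function; it holds no state."""
    return message.upper()
```

An idle connection costs the gateway something (it stays in `outboxes`) but never touches the handler, which is the point about billing by time on one side and by execution on the other.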

I think Azure has everything already in place to allow that... It is just a matter of putting the things together...

Looking forward to seeing that happen...

galvesribeiro commented 5 years ago

For the record, the current binding on Azure Functions for SignalR Service is Function -> SignalR Service -> Clients. What (I think) is being discussed here and what I'm suggesting is the other way around...

anthonychu commented 5 years ago

Clients -> SignalR Service -> Event Grid -> Functions is being considered. Would this help with your scenario @galvesribeiro?

galvesribeiro commented 5 years ago

Yes and no.

Yes if you don't care about when the message is processed.

No if you want it to be fast. EventGrid is not fast and lacks some guarantees.

Also, with EventGrid you wouldn't be able to support request/response scenarios on Hubs or propagate method invocation exceptions back up. Only one-way pushes up or down the whole stack would be possible.

I'm not saying you shouldn't have support for EventGrid. I'm just saying that direct access to functions would be the best for the majority of cases.

galvesribeiro commented 5 years ago

Actually, not just functions... Any HTTP endpoint (even outside Azure) could be used. Functions would just be much more convenient to configure at the SignalR Service portal, and also allow people to use Bindings to map the method parameters properly.

For example, in a function method, you could have something like this:

public async Task<IHttpActionResult> MyHubMethod([SignalRService]SignalRServiceContext context, string param1, int param2, string paramX)

Where the context holds information about the SignalRService that sent the message so you can just push back the response in case you need it.

miaojiang commented 5 years ago

@galvesribeiro @StephenCleary I am from the API Management team and we are considering supporting Websockets. I'd love to get your feedback. Let me know if you'd like to jump on a quick call to discuss your scenarios.

galvesribeiro commented 5 years ago

@miaojiang yeah, I'll be back to work next week so we can have a call then.

I was just wondering why we would need API management for that if the SignalR service itself could be plugged into Azure functions...

What we need is a serverless backend for SignalR Service Hub methods. And that is a 1:1 parity with Azure Functions and HTTP Trigger IMHO... I wonder where API Management would fit into it.

I understand that it would have some benefits to having API Management as one frontend option of SignalR Services, but I don't understand why it would be mandatory...

Thanks!

anthonychu commented 5 years ago

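I think here are the gaps identified in this thread on what SignalR Service is missing today:

  • Clients need a way to invoke Azure Functions via the SignalR connection (and get a response)
  • Want to use raw WebSockets without the SignalR protocol
  • SignalR SDKs need to support more platforms
  • SignalR Service is billed by pre-provisioned units and does not autoscale, API Gateway has consumption pricing (connection-minutes).

All of these have been discussed. Perhaps @sffamily @chenkennt @bradygaster @davidfowl from the SignalR and SignalR Service teams can comment.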

bradygaster commented 5 years ago

Teasing apart the items in the thread, and thanks for adding us, @anthonychu:

  • Clients need a way to invoke Azure Functions via the SignalR connection (and get a response)

This is interesting but a methodology I've not considered. I briefly chatted with @mattchenderson about this today to get a little more clarity on the idea but we'll need to sync up to go deeper. This feels like the important item in this thread, a whole issue unto itself. Is the goal here to trigger a Function's execution from within a Hub? Or to trigger a Function's execution when a SignalR event fires? Almost as if the handler for the SignalR hub event is the Function?

  • Want to use raw WebSockets without the SignalR protocol

This is very interesting. It is also something @davidfowl has investigated with the Azure SignalR Service team.

  • SignalR SDKs need to support more platforms

I'd be interested in which additional platforms and what you mean by "supporting more platforms." Could you be more specific?

  • SignalR Service is billed by pre-provisioned units and does not autoscale, API Gateway has consumption pricing (connection-minutes).

This feels like an orthogonal topic for a different time. w.r.t. the API Gateway/API Management discussion, I also think that's a third, also-orthogonal discussion. Whilst I see merit to a marriage of API Management and WebSockets, I don't think APIM + WebSockets would be a solution here. I have other ideas on an API Management and SignalR marriage, but that's even more orthogonal (think "discovery language" like OAS, but that's way, way out and maybe not even valid).

galvesribeiro commented 5 years ago

> This is interesting but a methodology I've not considered. I briefly chatted with @mattchenderson about this today to get a little more clarity on the idea but we'll need to sync up to go deeper. This feels like the important item in this thread, a whole issue unto itself. Is the goal here to trigger a Function's execution from within a Hub? Or to trigger a Function's execution when a SignalR event fires? Almost as if the handler for the SignalR hub event is the Function?

This is precisely what we envision. Instead of having to code and host the Hub class ourselves, we would like to just "describe" a Hub on the Azure SignalR Service. In this description I could just say: method SayHello is implemented by FunctionApp1.FunctionMethodA(). Everything related to authentication/authorisation, for example the auth token, would be forwarded to the Function so it can deal with it. Basically, it is as if the pseudo-hub described on the SignalR Service just invokes the target function with a POST and expects the response, which it returns to the caller.

The other scenarios like Stream APIs and Server -> Client messages are already supported today, so no changes are required.

> Want to use raw WebSockets without the SignalR protocol

I don't know how it would be implemented but, at the end of the day, you will have to dispatch the processing of the buffer/message received on the WebSocket to something else, so I expect the previous solution of having a backend function would work here as well, the same way as when a function wants to send messages to the client.

> SignalR SDKs need to support more platforms

I think this is off topic for this particular issue, but the more platforms you support, the more people will be able to use the service. However, the Hub protocol is documented on GitHub, and I was able, for example, to create an Unreal Engine client in C++ (with @davidfowl's help) that works pretty well. The problem I see is that the protocol needs to be better (or more fully) documented and exposed in a simpler way on docs.microsoft.com, so people looking to write their own clients can easily find it without relying on Microsoft's own implementations.

W.r.t. billing, I think that is also out of scope here as it involves other products, as you said (API Gateway and API Management), and I've already shown an example from AWS of how WebSockets (i.e. any persistent connection) work and are billed, so similar behaviour could be adopted.

anthonychu commented 5 years ago

> This is precisely what we envision it. Instead of having us to code and host the Hub class, we would like to just "Describe" on the Azure SignalR Service a Hub. Where on this description I can just say Method SayHello Implemented by FunctionApp1.FunctionMethodA(). Everything related to authentication/authorisation like for example Auth Token, etc, would be forwarded to to the Function so it can deal with it. Basically It is like if the pseudo-hub described on the SignalR Service, just invoke with a POST the target function and expect the response to return to the caller.

We've already discussed a few ways to do this with @chenkennt and team. This could be a webhook from the service. There are implications for how much load this adds to the service and how to configure them.

Another way we've talked about handling this is via Event Grid (much like how we can listen to connect/disconnect events today). The benefits are that it's more predictable from a server load perspective, as slow responses from arbitrary webhooks are less of an issue, and the auth and filtering capabilities of EG allow the subscriber to filter which hub and method name they want to listen to. The downside is more latency and no way to respond to an invocation.

Currently, clients have to invoke functions directly via HTTP. While this could be considered inefficient in some scenarios, it's likely to be the lowest latency option and allows functions to respond.

> I'd be interested in which additional platforms and what you mean by "supporting more platforms." Could you be more specific?

That item was specific to @StephenCleary's comment on iOS.

davidfowl commented 5 years ago

I'm all for supporting something like this and am currently working on something similar, but I have a few questions because I'm somewhat unclear on the scenarios:

bzbetty commented 4 years ago

@davidfowl not the OP, but I can explain my usecase.

Essentially I'm looking for a way of streaming microphone audio from a browser to an Azure Function so it can put it through Cognitive Services (plus other things). I don't strictly need SignalR (although I am using it for Azure Function -> Browser communication) but the Cognitive Services portion does require some kind of stream.

I could probably fall back to the JS cognitive services API and just push the results to a webapi, but I could see a use for Azure Functions to accept a stream somehow.

davidfowl commented 4 years ago

@bzbetty if each video frame was an HTTP request would that meet your needs?

bzbetty commented 4 years ago

> @bzbetty if each video frame was an HTTP request would that meet your needs?

We're only doing audio, but I assume with a given bitrate there'd be something equivalent to a frame.

We looked at using an HTTP Trigger, but that didn't make it possible to pass the byte stream through to cog services speech to text.

But keeping a connection to cog services open between frames/request is stateful which really does break the Azure Functions paradigm, so we may be going down the wrong path with this (I just really wanted the azure functions scaling ability)

It would likely be much simpler to call cog services from the browser and just post the result + related audio bytes to a function afterwards. Given that my end result would work perfectly fine if I could split the audio sentence by sentence, it does seem like a sensible approach.

galvesribeiro commented 4 years ago

@davidfowl

> For incoming messages do you want the bytes forwarded from the client directly?

I think receiving the raw bytes would not really work. I mean, if the intent is to let a function process a message and the byte[] sent to it is only part of the message, chances are that the next websocket frame will be sent to another function instance, and even if you have just one instance, the state may be lost either way, as functions are stateless by definition.

I think this scenario is only applicable for JSON-RPC like message formats or pre-defined formats like protobuf or string-delimiters. Another option is to just forward a full method invocation on a SignalR Hub.

In other words, the point would be to have the SignalR Service as a full stateful connection with its clients and only forward messages to the function when they are complete and ready to be processed/replied to. Adding EventGrid may work for some cases, but it is too slow IMHO and would block RPC scenarios.

davidfowl commented 4 years ago

Help me understand why being stateless is a problem. I understand the performance angle but what are the other things people are trying to solve by using functions vs an app service instance?

> I think this scenario is only applicable for JSON-RPC like message formats or pre-defined formats like protobuf or string-delimiters. Another option is to just forward a full method invocation on a SignalR Hub

The message would be an entire websocket frame, or in the case of signalr an invocation (signalr hubs are also stateless)

> In other words, the point would be to have the SignalR Service as a full stateful connection with its clients and only forward messages to the function when they are somehow completed and ready to be processed/replied. Add EventGrid for some cases may work, but it is too slow IMHO and would block RPC scenarios.

Yes we’re not talking about sending partial messages, just entire messages (either websocket or signalr invocations in the case of signalr). I want to understand if that's OK for the scenarios people are thinking about. It’s turning your websocket message into a stateless http request.

galvesribeiro commented 4 years ago

> Help me understand why being stateless is a problem. I understand the performance angle but what are the other things people are trying to solve by using functions vs an app service instance?

Sure, imagine that you have a message that is 100 bytes long. In the pure WebSockets scenario without SignalR, if the function is stateless, it would eventually receive partial data in a frame, meaning that a single message could be split across multiple frames (e.g. 2 frames of 50 bytes each). That would require the function to either (1) stay alive until all the chunks of a single message arrive (by keeping an in-memory buffer) or (2) share the chunk with other function instances so they can "continue" reading the chunks and process the whole message. If either the function dies or the next chunk is handled by another function instance, you have incomplete data. Not to mention the time taken by function activations. To make the function fully stateless, the SignalR service would need to know the message protocol so it can deliver the right number of bytes in a single function call and the function can process the whole message.

There are ways to keep state across function executions, like what Durable Functions does, where you could send multiple chunks to different function instances and "complete" the message processing in a durable way, but that not only increases latency a lot and reduces throughput, the costs may also skyrocket. Durable Functions was not designed for that.
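
To make the buffering concern concrete, here is a minimal sketch of what the stateful side would have to do before handing anything to a stateless function. The framing is purely an assumption for illustration (a 4-byte big-endian length prefix before each message), and `MessageAssembler` is a hypothetical name; the thread does not specify any wire format.

```python
import struct
from typing import Dict, List

# Assumed framing for this sketch only: each message is preceded by a
# 4-byte big-endian unsigned length.
HEADER = struct.Struct(">I")


class MessageAssembler:
    """Keeps a per-connection buffer and emits only complete messages."""

    def __init__(self) -> None:
        self._buffers: Dict[str, bytearray] = {}

    def feed(self, connection_id: str, chunk: bytes) -> List[bytes]:
        """Add a chunk for a connection and return every message it completes."""
        buf = self._buffers.setdefault(connection_id, bytearray())
        buf.extend(chunk)
        complete: List[bytes] = []
        while len(buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(buf, 0)
            end = HEADER.size + length
            if len(buf) < end:
                break  # message still incomplete; wait for more chunks
            complete.append(bytes(buf[HEADER.size:end]))
            del buf[:end]  # drop the consumed message from the buffer
        return complete
```

With the 100-byte example above split into two chunks, the first call to `feed` returns nothing and the second returns the whole message. The buffer lives with whoever holds the connection, so the function only ever sees complete messages.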

> The message would be an entire websocket frame, or in the case of signalr and invocation (signalr hubs are also stateless)

Yes. In the case of the SignalR Hub protocol, like I said, SignalR Service knows the message protocol, so it can forward each Hub method invocation as a single function invocation and return the response to the caller in an RPC fashion. That IMHO is the scenario this issue should be directed at. In the future, other RPC-like services such as gRPC could be added to the service the same way as SignalR, since they would work exactly the same way, just with a different wire/message protocol and transport but with a Function as the invocation backend.

> Yes we’re not talking about sending partial messages, just entire messages (either websocket or signalr invocations in the case of signalr). I want to understand if that OK for the scenarios people are thinking about. It’s turning your websocket message into a stateless http request.

Yes, that is the case I wanted to have enabled on SignalR Service + Az Functions. I mentioned partial data just so that other people commenting here about sending byte[]s with the intent of "real streaming" would be aware of this limitation.

But yes, you are right.

So, to summarise:

  1. SignalR Service accepts a client connection
  2. The configured OnConnected() handler, a function in this case, is invoked with the connection context as a parameter
  3. SignalR Service receives a hub invocation request
  4. The configured handler (either a generic handler or one specified per method), a function in this case, receives the context and the forwarded method parameters
  5. If the invocation has a return type, the function's return value is sent back to the client

The registered handler can be anything the service decides to support. I would originally design it to support Azure Functions and EventGrid. EventGrid would cover the other use cases like Service Bus, Event Hubs, Data Factory, etc.
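
The five steps summarised above can be sketched as a small in-memory simulation. Everything here is hypothetical and only illustrates the proposed flow: `HubDescription`, `ServiceSimulator`, and `ConnectionContext` are invented names, plain Python callables stand in for functions, and nothing corresponds to an actual SignalR Service or Azure Functions API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class ConnectionContext:
    """Information the service would pass to every handler call."""

    connection_id: str
    user_id: Optional[str] = None


@dataclass
class HubDescription:
    """Declarative hub: which callable implements which hub method."""

    on_connected: Optional[Callable[[ConnectionContext], None]] = None
    methods: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    # Generic handler used when no per-method handler is registered.
    fallback: Optional[Callable[[ConnectionContext, str, list], Any]] = None


class ServiceSimulator:
    """Stands in for the service that owns the client connections."""

    def __init__(self, hub: HubDescription) -> None:
        self.hub = hub

    def accept(self, connection_id: str, user_id: Optional[str] = None) -> ConnectionContext:
        # Steps 1-2: accept the connection, then call the OnConnected handler.
        ctx = ConnectionContext(connection_id, user_id)
        if self.hub.on_connected is not None:
            self.hub.on_connected(ctx)
        return ctx

    def invoke(self, ctx: ConnectionContext, method: str, *args: Any) -> Dict[str, Any]:
        # Steps 3-5: route the invocation and send the result (or error) back.
        handler = self.hub.methods.get(method)
        try:
            if handler is not None:
                result = handler(ctx, *args)
            elif self.hub.fallback is not None:
                result = self.hub.fallback(ctx, method, list(args))
            else:
                return {"error": f"unknown method {method}"}
        except Exception as exc:  # propagate handler failures to the caller
            return {"error": str(exc)}
        return {"result": result}
```

Returning an error payload to the caller is one way to model the request/response and exception propagation that a one-way Event Grid delivery could not offer.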

galvesribeiro commented 4 years ago

As I was mentioning to @davidfowl yesterday, I was tasked with migrating some GraphQL APIs, originally written in TypeScript with the Apollo framework, to Azure Functions with .Net Core/C# using graphql-dotnet.

The overall migration works just fine except when it comes to subscriptions. The GraphQL Subscriptions spec shows how it should work, and it would be great if we could do it with SignalR Service.

However, unlike what we discussed before, the default client implementations expect the transport to be websockets and not something like the SignalR Hub Protocol.

For that particular scenario, having SignalR Service support basic websockets would be good.

Today it is possible to do that using AWS API Gateway and Lambdas. It would be great to do the same with SignalR Service + Functions.

I hope this sheds light on one more use case for this feature.

simonzhow commented 4 years ago

Curious to know if there has been any movement on this since the end of last year? Currently running into a road block implementing subscriptions alongside apollo-server-azure-functions.

sffamily commented 4 years ago

@galvesribeiro @simonzhow @bzbetty, and other developers in this thread, sorry, I couldn't find your contact details, so I'm pinging you here. I'm the Product Manager of Azure SignalR Service, and we have a new internal preview for supporting raw WebSocket in serverless scenarios. I thought you might be interested in trying it out and offering us some feedback. If you are interested please ping me at zhshang at microsoft dot com.

galvesribeiro commented 4 years ago

Hello @sffamily thanks for getting back to us!

If you can contact me offline on gutemberg 4T outlook D0T com, I can give you the subscription IDs that I would like to test on the private preview.

I poked @davidfowl a few weeks ago about it and am really looking forward to trying it out!

Thanks!

Odonno commented 4 years ago

Hi @galvesribeiro

Do you have any blog post on your Apollo Server + GraphQL in .NET? I don't think there is any resource about that for the moment.

exajai commented 4 years ago

Hey guys, are there any further updates on supporting web sockets for Azure Functions?

paulbatum commented 4 years ago

@exajai no updates on the functions side, but you might want to take a look at the raw websocket support that is being added to the signalr service, mentioned above.

davidskuza commented 4 years ago

So for now the only option besides SignalR is to host your own WebSocket handler on a virtual machine and have it trigger Azure Functions through HTTP requests, emulating AWS API Gateway, right?

paulbatum commented 4 years ago

App Service has websocket support, so you could use a similar approach to what you described, but use webapps instead of virtual machines (no need to manage OS updates, etc).

djpirra commented 4 years ago

Hi Paul, can you describe that a little bit more in detail?

I am currently designing a .Net Core console app that connects to a streaming WSS source and sends the data to CosmosDB.

What are the pros and cons of using a VM or a WebApp and what would change in terms of design/code? Also, what are the risks in terms of availability and failure recovery?

Thanks

paulbatum commented 4 years ago

This is a complicated topic, I can't really go into detail here.

This document might help you reason about the different types of compute provided by Azure: https://docs.microsoft.com/en-us/azure/architecture/guide/technology-choices/compute-decision-tree

This is the overview for app service: https://docs.microsoft.com/en-us/azure/app-service/overview

You can follow the documentation links from there to go into more detail on availability, recovery, etc.

SeaDude commented 4 years ago

Excellent doc reference @paulbatum , thank you.

m-sterspace commented 3 years ago

I'm relatively new to websockets, and was just looking to implement a multi-file upload via websockets for the first time, but am I correct in understanding from this thread that I can't use Azure Functions for that and have to use SignalR? It doesn't appear on that compute decision tree...

Also, am I correct in understanding that SignalR is only available in C#/.NET, not in Javascript/node.js?