Badgerati / Pode

Pode is a Cross-Platform PowerShell web framework for creating REST APIs, Web Sites, and TCP/SMTP servers
https://badgerati.github.io/Pode
MIT License

Async Route feature with Callback and Server-Sent Events (SSE) support #1349

Closed mdaneri closed 1 month ago

mdaneri commented 3 months ago

This commit introduces the Async Route feature for Pode, including Callback and Server-Sent Events (SSE) support for improved asynchronous communication.

Benefits:

New Features:

New Functions:

Features

Independent Runspace Pools:

Security:

Callback Support:

SSE Support:

Tests:

Example Usage:

Add-PodeAsyncGetRoute:

Add-PodeAsyncGetRoute -Path '/task' -ResponseContentType 'application/json', 'application/yaml' -In Path -Authentication 'MergedAuth' -Access 'MergedAccess' -Group 'Software' -TaskIdName 'myTaskId'

Add-PodeAsyncStopRoute:

Add-PodeAsyncStopRoute -Path '/task' -ResponseContentType 'application/json', 'application/yaml' -In Query -Authentication 'MergedAuth' -Access 'MergedAccess' -Group 'Software' -TaskIdName 'myTaskId'

Add-PodeAsyncQueryRoute:

Add-PodeAsyncQueryRoute -Path '/task' -ResponseContentType 'application/json', 'application/yaml' -In Query -Authentication 'MergedAuth' -Access 'MergedAccess' -Group 'Software' -TaskIdName 'myTaskId'

Set-PodeAsyncRoute:

Add-PodeRoute -PassThru -Method Put -Path '/auth/asyncUsing' -Authentication 'MergedAuth' -Access 'MergedAccess' -Group 'Software' -ScriptBlock {
    return @{ InnerValue = 'something' }
} | Set-PodeAsyncRoute -ResponseContentType 'application/json', 'application/yaml' -Callback -PassThru -CallbackSendResult -Timeout 300 | Set-PodeOARequest -RequestBody (
    New-PodeOARequestBody -Content @{ 'application/json' = (New-PodeOAStringProperty -Name 'callbackUrl' -Format Uri -Object -Example 'http://localhost:8080/receive/callback') }
)

Other functions:

These are internal functions equivalent to the route operations. The only difference is that no security is involved. Their main purpose is to manipulate the internal state of the async routes.

Badgerati commented 3 months ago

Hey! I managed to get time to review 😄

I would recommend raising an Issue first for larger feature work, so it can be discussed before diving into the solution 😜

If I'm right, this is a wrapper for Routes, which sets the Route logic as an async task with optional routes for info retrieval?

If so, I'd recommend the following:

mdaneri commented 3 months ago

Regarding the Get-PodeData, Get-PodeQuery, etc., I agree with your suggestion. I'll create a PR for this enhancement.

For the function names, I agree that your suggested names are better. I hadn't spent much time on naming, so this feedback is helpful.

However, I have a question about the need for a separate Remove-PodeAsyncRoute. Remove-PodeRoute seems to cover the functionality since you cannot remove an async route without removing the route itself.

On the topic of merging the functionality with PodeTask, I have some reservations. PodeTask serves a very specific purpose that doesn't align perfectly with async REST calls. The latest commit includes an option to specify the maximum number of threads that each route can execute, which is a crucial feature. Some routes cannot be run concurrently or must have a limited number of concurrent executions due to the heaviness of the process.

mdaneri commented 3 months ago

Regarding the webhook. In this context, a callback is the appropriate outbound method. I'm going to extend the callback already in place to allow a complete interpretation of the callback semantics https://swagger.io/docs/specification/callbacks/
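
As a rough illustration of the outbound callback idea, the sketch below builds the parameters for a POST to the caller-supplied `callbackUrl` (the property name used in the Set-PodeAsyncRoute example earlier in this thread). `New-CallbackRequest` is a hypothetical helper written for this illustration, not a Pode function, and the JSON payload shape is an assumption.

```powershell
# Hypothetical helper, for illustration only; not part of Pode.
function New-CallbackRequest {
    param(
        [Parameter(Mandatory)] [hashtable] $RequestBody,  # body of the original async request
        [Parameter(Mandatory)] [hashtable] $TaskResult    # value returned by the route's script block
    )

    # Validate the caller-supplied callback URL before using it.
    $uri = $null
    $isAbsolute = [uri]::TryCreate([string] $RequestBody['callbackUrl'], [urikind]::Absolute, [ref] $uri)
    if (-not $isAbsolute -or $uri.Scheme -notin 'http', 'https') {
        throw 'callbackUrl must be an absolute http(s) URL.'
    }

    # Return a hashtable suitable for splatting into Invoke-RestMethod.
    return @{
        Uri         = $uri
        Method      = 'Post'
        ContentType = 'application/json'
        Body        = ($TaskResult | ConvertTo-Json -Depth 5)
    }
}

$callback = New-CallbackRequest `
    -RequestBody @{ callbackUrl = 'http://localhost:8080/receive/callback' } `
    -TaskResult  @{ InnerValue = 'something' }

# Sending is left commented out so the sketch runs without a listener:
# Invoke-RestMethod @callback
```

Keeping the request construction separate from the send makes it easy to inspect or log what would be posted before any network call is made.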

mdaneri commented 3 months ago

Another part that I still need to implement is security. So far, anyone can see everything. I need to use the roles and groups to limit access to the async results.

Badgerati commented 3 months ago

specify the maximum number of threads that each route can execute, which is a crucial feature. Some routes cannot be run concurrently or must have a limited number of concurrent executions due to the heaviness of the process

I feel this is one that could also be achieved with Tasks, on top of #1037. There's a need there to specify that certain tasks should only be run sequentially, and here we have a need to limit the number of instances of a Task running concurrently, even sequentially at times. The two could probably be solved with the same solution, enabling the requirement here and enhancing Tasks at the same time:

Then 2 potential options:

  1. We introduce a new -Isolated switch on Add-PodeTask (and Set-PodeAsyncRoute), this enables a ParameterSet with advanced functionality to control threading - in this case likely just -MaxThreads for now, and setting this to 1 forces sequential processing only. If MaxThreads isn't supplied then the internal default of $PodeContext.Threads.Tasks is used.
    • Tasks with -Isolated create a separate Runspace Pool. Having this switch makes it safer so that people don't accidentally create a mass amount of Runspace Pools.

For example:

# global
Add-PodeRoute ... | Set-PodeAsyncRoute -ResponseContentType Json

# isolated and sequential
Add-PodeRoute ... | Set-PodeAsyncRoute -ResponseContentType Json -Isolated -MaxThreads 1

or,

  2. This is one I've thought about in the past. We have an Add-PodeRunspacePool public function which lets people specify a -Name and a -MaxThreads.
    • On Add-PodeTask and Set-PodeAsyncRoute there's a new -RunspacePoolName. If this is supplied then the Tasks run on the specified pool, if not passed then they run on the global Task pool. (It'll likely need protection to stop people running Tasks on other internal pools, hah)
    • This would also allow for an isolated pool for multiple select Tasks/Route Tasks to run on - rather than 1 to 1.

For example:

# global
Add-PodeRoute ... | Set-PodeAsyncRoute -ResponseContentType Json

# isolated and sequential
Add-PodeRunspacePool -Name 'CustomPool' -MaxThreads 1
Add-PodeRoute ... | Set-PodeAsyncRoute -ResponseContentType Json -RunspacePoolName 'CustomPool'

This way we don't have duplicated logic, and improve Tasks all around.
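
For readers unfamiliar with how a thread cap on a runspace pool behaves, here is a minimal sketch using only the standard .NET runspace APIs. It is not Pode's implementation, and neither -Isolated nor Add-PodeRunspacePool is involved; it only shows that a pool opened with a maximum of 1 runspace queues work items and runs them one at a time, which is the "sequential" behaviour discussed above.

```powershell
# Sketch only: plain System.Management.Automation APIs, not Pode internals.
$maxThreads = 1   # 1 means queued work items run one after another

$pool = [runspacefactory]::CreateRunspacePool(1, $maxThreads)
$pool.Open()

# Queue three work items on the shared pool.
$jobs = foreach ($i in 1..3) {
    $ps = [powershell]::Create()
    $ps.RunspacePool = $pool
    $null = $ps.AddScript({
        param($Id)
        Start-Sleep -Milliseconds 100
        "task $Id done"
    }).AddArgument($i)

    [pscustomobject]@{ Shell = $ps; Handle = $ps.BeginInvoke() }
}

# Wait for each item and collect its output.
$results = foreach ($job in $jobs) {
    $job.Shell.EndInvoke($job.Handle)
    $job.Shell.Dispose()
}

$pool.Close()
$pool.Dispose()
```

Raising `$maxThreads` lets that many items run at once; everything beyond the cap waits in the pool's queue.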

mdaneri commented 3 months ago

I like the idea of the isolated parameter. But to be honest, I don't see a problem with having 1000 runspaces. A runspace uses no resources other than a small quantity of memory.

In a Pode project, I'm not expecting to see 1000 async routes; even if that were the case, I doubt that all of them would be used simultaneously. In the end, the number of running threads is the only thing that matters.

At the moment, the way it works is like this:

Add-PodeRoute ... | Set-PodeAsyncRoute -ResponseContentType Json  -MaxThreads 2

As for the idea of using the same code for Task and Async, I'm only partially convinced it's feasible without compromising the compatibility with the current API.

ConvertTo-PodeEnhancedScriptBlock does all the magic by injecting the user code inside the "async" envelope, which is completely different from how the PodeTasks are managed. The only similar thing is Start-PodeAsyncRoutesHousekeeper, but I want to find a way to remove it. I was thinking of using an individual scheduler to clean up each async process.

mdaneri commented 3 months ago

The callback implementation is complete. The only parts still missing are security and the -Isolated switch.

mdaneri commented 3 months ago

I'm looking at how to integrate SSE. It's a very useful feature when you're using an async call from a browser.
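
For context, an SSE stream is plain text sent over a response with the `text/event-stream` content type: each message is a set of `field: value` lines terminated by a blank line. The sketch below formats one such message. `Format-SseMessage` is a hypothetical helper for illustration and is not part of Pode's SSE support.

```powershell
# Hypothetical helper, for illustration only; not part of Pode.
function Format-SseMessage {
    param(
        [Parameter(Mandatory)] [string] $Data,
        [string] $EventName,
        [string] $Id
    )

    $lines = @()
    if ($EventName) { $lines += "event: $EventName" }
    if ($Id)        { $lines += "id: $Id" }

    # Multi-line payloads are sent as one "data:" line per source line.
    foreach ($line in ($Data -split '\r?\n')) {
        $lines += "data: $line"
    }

    # A blank line marks the end of the message.
    return (($lines -join "`n") + "`n`n")
}

$message = Format-SseMessage -EventName 'progress' -Data "step 1`nstep 2"
```

A browser's `EventSource` would receive this as a single `progress` event whose data is the two lines joined by a newline.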

mdaneri commented 2 months ago

Documentation is done; the only parts missing are the SSE documentation and some minor fixes to the OpenAPI definition.

Badgerati commented 2 months ago

I'm back from holiday, so I'll begin reviewing this one and the Runspace one as soon as I can :)

mdaneri commented 2 months ago

The Runspace one is simple. There are just 2 functions to make debugging easier and a small document that explains them.

Badgerati commented 2 months ago

Hey @mdaneri, I'm gradually getting through the review, just a slow one atm! Please try not to commit anything to the PR while I go through, as it'll confuse the ongoing review 😄 I'm hoping to finish the rest of the review this week.

While going through the work, the ContentType parameters reminded me of a feature I was toying with a few months back which might actually help out a lot here. I've been mapping the idea to the work here, and so far it seems like a good match. When I get a chance I'll write it up, but in short it's an alternative to the Write-PodeXResponse functions and the way ContentTypes are figured out, and it respects the Accept header more, similar to what you have here.

mdaneri commented 2 months ago

I was thinking of making a small change, but I can postpone it to the next release. In this implementation, when you query for async tasks, there is no limit to the number of objects you can get back. I was thinking of adding a configurable limit, with a default of 100.
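
A minimal sketch of what such a cap could look like, assuming the query results are already available as an array. `Select-LimitedResult` and its `-Limit` parameter are hypothetical names used only for illustration; they are not part of this PR.

```powershell
# Hypothetical helper, for illustration only; not part of Pode.
function Select-LimitedResult {
    param(
        [object[]] $Items = @(),
        [int] $Limit = 100   # default cap, overridable by the caller
    )

    if ($Limit -lt 1) {
        throw 'Limit must be at least 1.'
    }

    # Return at most $Limit items, preserving their original order.
    return @($Items | Select-Object -First $Limit)
}

$allTasks = 1..250 | ForEach-Object { @{ Id = $_ } }
$page     = @(Select-LimitedResult -Items $allTasks)
$smaller  = @(Select-LimitedResult -Items $allTasks -Limit 10)
```

Wrapping the calls in `@( )` keeps the result an array even when only a single item comes back.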