UPDATE: There was an Azure Marketplace solution for Unreal Engine Pixel Streaming, which simplified the deployment process and added things like lifecycle management, a custom metrics dashboard and more: https://azuremarketplace.microsoft.com/marketplace/apps/epicgames.unreal-pixel-streaming?tab=Overview; however, this solution has been taken down from the Marketplace and Epic Games has open sourced it here: https://github.com/ue4plugins/UnrealPixelStreamingOnAzure
Important: Before cloning this repo you must install the Git LFS extension from https://git-lfs.github.com/, then open a Git or console command window and run git lfs install to initialize Git LFS. After cloning, run git lfs install again inside your cloned folder. The repo contains large binaries, so Git Large File Storage had to be enabled. Also, due to licensing we are unable to include the \Engine\Binaries\ThirdParty DLLs exported from Unreal for your app in this repo, so you'll need to copy your own Binaries\ folder into the repo and check it in before the PixelStreamingDemo.exe app will run locally and remotely. See the Unreal 3D App section for details of this and other important steps.
Important: The main branch of this repo supports 3D applications targeting Unreal Engine 4.27. If your application uses the previous 4.26 version of Unreal Engine, please change the branch to ue-4.26 or use the v4.26 release tag.
This document gives an overview of how to deploy Unreal Engine's Pixel Streaming technology in Azure at scale. Pixel Streaming is a technology that Epic Games provides in Unreal Engine to stream remotely deployed interactive 3D applications through a browser (i.e., computer/mobile) without the need for the connecting client to have GPU hardware. Additionally, this document describes the customizations Azure Engineering has built on top of the existing Pixel Streaming solution to provide additional resiliency, logging/metrics and autoscaling specifically for production workloads in Azure. The additions built for Azure are released here on GitHub and consist of an end-to-end solution deployed via Terraform, which spins up a multi-region deployment with only a few Terraform commands. The deployment has many configuration options to tailor it to your requirements, such as which Azure region(s) to deploy to, the SKUs for each VM/GPU, the size of the deployment, HTTP/HTTPS and autoscaling policies (node count and percentage based).
For a detailed overview of Unreal Pixel Streaming and architectures in Azure, see our documentation here. For a more simplified quick-start for the process on manually deploying to a single VM with Matchmaker and Signaling Server in Azure, see the Microsoft documentation here. To jump directly to the documented steps for deploying this solution in Azure, click here.
Microsoft has worked with Epic to customize Pixel Streaming for the cloud using Microsoft Azure, which has resulted in many key additions to deploy and monitor a Pixel Streaming solution at scale (some can be found here: GitHub pull request #7698). Below are the notable additions that have been incorporated into a fork of Unreal Engine on GitHub:
Let's walk through the general flow of what is shown in the architecture diagram above when a user connects to the service:
Below are the recommended compute SKUs for general usage of Pixel Streaming in Azure:
Important: It is recommended to first deploy your Pixel Streaming executable and run it on your desired GPU SKU to see the performance characteristics around CPU/Memory/GPU usage to ensure no resources are being pegged and frame rates are acceptable. Consider changing resolution and frames per second of the UE4 app to achieve acceptable quality per your requirements. Additionally, consider the IOPS / latency requirements for the 3D app when choosing a disk, as SSDs and/or striping disks will be key to gaining the best disk speed (some GPU SKUs might not support Premium SSDs so also consider disk striping for adding IOPS).
Be sure to check out the Pixel Streaming in Azure Overview documentation to learn more about optimizing for Azure VM SKUs, performance and pricing optimizations.
The current customized solution in GitHub has many additions that make deploying Pixel Streaming in Azure at scale easier, and below are even more improvements on those customizations which would make it even better:
Below are notable configurations to consider when deploying the Pixel Streaming solution in Azure.
There was a tremendous amount of work that went into building out the Terraform deployment for Pixel Streaming; however, unless you plan on making major modifications you can focus just on the following 3 files:
iac\terraform.tfvars: This stores the global deployment_regions variable, which specifies which Azure region(s) will be used (default is "eastus") and their Virtual Network ranges:
iac\region\variables.tf: This is the most important file to be familiar with, as it has the configs for the gitpath (change to your Git fork), pixel_stream_application_name (change to your UE4 app name), along with other notable parameters such as desired FPS (default 60), resolution (default 1080p), starting instance count (default 1), instances per node (default 1), and Azure VM SKUs for the MM (default Standard_F4s_v2) and the SS GPU nodes (default Standard_NV6).
iac\variables.tf: This global variables file can mostly be ignored, unless needing to change the global resource group's name (base_resource_group_name), location (global_region, default: eastus), Traffic Manager port or storage account settings (tier/type).
See the Terraform section to learn more about the deployment files.
The Git location referenced in the deployment is stored in the iac\region\variables.tf file. Important: You must have read access with a Personal Access Token (PAT) to the specified repository for the deployment to work, since when the VMs are created there is a git clone used to deploy the code to the VMs. Also, you'll want to validate if your organization needs to have Enterprise SSO enabled for your PAT.
Below are the configurations available to the Matchmaker. A config.json file was added to the existing Matchmaker code to reduce hard coding in the Matchmaker.js file:
{
// The port clients connect to the Matchmaking service over HTTP
"httpPort": 80,
// The Matchmaking port the Signaling Service connects to the matchmaker over sockets
"matchmakerPort": 9999,
// Instances deployed per node, to be used in the autoscale policy (i.e., 1 unreal app running per GPU VM) – not yet supported
"instancesPerNode": 1,
// Amount of available Signaling Service / App instances to be available before we must scale up (0 will ignore)
"instanceCountBuffer": 5,
// Percentage amount of available Signaling Service / App instances to be available before we must scale up (0 will ignore)
"percentBuffer": 25,
// The number of minutes without scale-up activity before we check whether we should scale down (i.e., after hours, to reduce costs)
"idleMinutes": 60,
// % of active connections to total instances that we want to trigger a scale down if idleMinutes passes with no scaleup
"connectionIdleRatio": 25,
// Min number of available app instances we want to scale down to during an idle period (idleMinutes passed with no scaleup)
"minIdleInstanceCount": 0,
// The total amount of VMSS nodes that we will approve scaling up to
"maxInstanceScaleCount": 500,
// The Azure subscription used for autoscaling policy (set by Terraform)
"subscriptionId": "",
// The Azure Resource Group where the Azure VMSS is located, used for autoscaling (set by Terraform)
"resourceGroup": "",
// The Azure VMSS name used for scaling the Signaling Service / Unreal App compute (set by Terraform)
"virtualMachineScaleSet": "",
// Azure App Insights ID for logging and metrics (set by Terraform)
"appInsightsId": ""
}
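The `//` annotations in the listing above are explanatory; standard JSON has no comment syntax, so `JSON.parse` would reject the listing exactly as printed. As a minimal sketch, assuming every annotation sits on its own line, a helper like the hypothetical `parseAnnotatedConfig` below (not part of the Matchmaker code) could strip them before parsing:

```javascript
// Hypothetical helper (not part of the Matchmaker code): drops whole-line
// "//" annotations so the remaining text can be handed to JSON.parse.
// Assumption: comments never share a line with a key/value pair.
function parseAnnotatedConfig(text) {
  const jsonOnly = text
    .split(/\r?\n/)
    .filter((line) => !line.trim().startsWith('//'))
    .join('\n');
  return JSON.parse(jsonOnly);
}

// A shortened copy of the Matchmaker listing, used only for illustration.
const sample = `{
  // The port clients connect to the Matchmaking service over HTTP
  "httpPort": 80,
  // The Matchmaking port the Signaling Service connects to
  "matchmakerPort": 9999,
  "instanceCountBuffer": 5,
  "percentBuffer": 25
}`;

const config = parseAnnotatedConfig(sample);
console.log(config.httpPort, config.matchmakerPort);
```

If the config.json checked into your fork contains no comments, plain `JSON.parse` is enough and this helper is unnecessary.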
Below are configs available to the Signaling Server in their config, some added by Microsoft for Azure:
{
"UseFrontend": false,
"UseMatchmaker": true, // Set to true if using Matchmaker.
"UseHTTPS": false,
"UseAuthentication": false,
"LogToFile": true,
"HomepageFile": "player.htm",
"AdditionalRoutes": {},
"EnableWebserver": true,
"matchmakerAddress": "",
"matchmakerPort": "9999", // The web socket port used to talk to the MM.
"publicIp": "localhost", // The Public IP of the VM -- set by Terraform.
"subscriptionId": "", // The Azure subscription -- set by Terraform.
"resourceGroup": "", // Azure RG -- set by Terraform.
"virtualMachineScaleSet": "", // Azure VMSS -- set by Terraform.
"appInsightsId": "" // Azure App Insights ID for logging/metrics -- set by Terraform.
}
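Because the Matchmaker and Signaling Server configs are separate files, it is easy for them to drift apart (note that matchmakerPort is a number in one listing and a string in the other). The following is a minimal sketch of a pre-deployment sanity check; the function name `checkMatchmakerLink` is hypothetical, and only the field names are taken from the two listings above:

```javascript
// Hypothetical pre-deployment check, not part of the repo. It only compares
// fields that appear in the two config listings above.
function checkMatchmakerLink(matchmakerConfig, signalingConfig) {
  const problems = [];
  if (signalingConfig.UseMatchmaker !== true) {
    problems.push('UseMatchmaker should be true when a Matchmaker is used');
  }
  if (!signalingConfig.matchmakerAddress) {
    problems.push('matchmakerAddress is empty');
  }
  // The Matchmaker listing stores the port as a number and the Signaling
  // Server listing as a string, so normalize both before comparing.
  if (Number(signalingConfig.matchmakerPort) !== Number(matchmakerConfig.matchmakerPort)) {
    problems.push('matchmakerPort differs between the two configs');
  }
  return problems;
}

const mm = { matchmakerPort: 9999 };
// 10.0.0.4 is a placeholder address used only for this example.
const ss = { UseMatchmaker: true, matchmakerAddress: '10.0.0.4', matchmakerPort: '9999' };
console.log(checkMatchmakerLink(mm, ss)); // []
```

An empty array means the two configs agree on the fields checked; each string in a non-empty result names one mismatch.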
In some cases, you might need a STUN / TURN server in between the UE4 app and the browser to help identify public IPs (STUN) or get around certain NAT'ing/mobile carrier settings (TURN) that might not support WebRTC. Please refer to Unreal Engine's documentation for details about these options; however, for most users a STUN server should be sufficient. Inside of the SignallingWebServer\ folder there are PowerShell scripts used to spin up the Cirrus.js service, which communicates between the user and the UE4 app over WebRTC; Start_Azure_SignallingServer.ps1 or Start_Azure_WithTURN_SignallingServer.ps1 are used to launch with STUN / TURN options. Currently the Start_Azure_SignallingServer.ps1 file points to a public Google STUN server (stun.l.google.com:19302), but it's highly recommended to deploy your own for production. You can find many other public options online as well. Unreal Engine exports stunserver.exe and turnserver.exe when packaging up the Pixel Streaming 3D app, which you can set up on your own servers (not included in repo):
\Engine\Source\ThirdParty\WebRTC\rev.23789\programs\Win64\VS2017\release\
Start_Azure_SignallingServer.ps1 is called by runAzure.bat when deploying the Terraform solution, so if a TURN server is needed, runAzure.bat can be changed to call Start_Azure_WithTURN_SignallingServer.ps1 with the right TURN server credentials updated in the PS file.
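To make the STUN / TURN choice concrete, here is a minimal sketch of the ICE server list a WebRTC peer connection consumes. The `iceServers` shape (`urls`, `username`, `credential`) is the standard WebRTC `RTCConfiguration` format; the function name and the TURN host and credentials are placeholders, and how the PowerShell scripts actually pass these values to Cirrus.js is not shown here:

```javascript
// Hypothetical helper: builds a JSON string in the standard WebRTC
// RTCConfiguration shape. Host names and credentials are placeholders.
function buildPeerConnectionOptions(stunHost, turn) {
  const iceServers = [{ urls: [`stun:${stunHost}`] }];
  if (turn) {
    // TURN relays media when a direct path cannot be negotiated,
    // so it needs credentials in addition to a host.
    iceServers.push({
      urls: [`turn:${turn.host}`],
      username: turn.username,
      credential: turn.credential,
    });
  }
  return JSON.stringify({ iceServers });
}

const stunOnly = buildPeerConnectionOptions('stun.l.google.com:19302');
console.log(stunOnly);
// {"iceServers":[{"urls":["stun:stun.l.google.com:19302"]}]}

const withTurn = buildPeerConnectionOptions('stun.l.google.com:19302', {
  host: 'turn.example.com:3478', // placeholder TURN server
  username: 'example-user',      // placeholder credentials
  credential: 'example-secret',
});
console.log(withTurn);
```

For production, replace the public STUN host with your own server, as recommended above.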
The Unreal 3D app and dependencies reside in GitHub (Git-LFS enabled) under the Unreal\ folder. The Unreal\ folder structure aligns with what is exported out of Unreal Engine, and below are the specific files\folders you will want to copy over the existing files provided in the example GitHub repository:
<ProjectName>.exe should replace Unreal\PixelStreamingDemo.exe
The <ProjectName>\ folder associated with the <ProjectName>.exe should replace the Unreal\PixelStreaming\ folder.
Your own engine binaries (\Engine\Binaries\), as the third-party DLLs and versions contained in the \Engine\Binaries\ThirdParty folder are specific to what was used in your 3D application. Due to licensing we are not able to include the DLLs in this repo, so it's important that you add them yourself. The app will not run locally or remotely until they are checked in.
The Unreal application has some key parameters that are passed in upon startup, which the Terraform deployment and PowerShell script (startVMSS.ps1) handle for you:
<PixelStreamingApp>.exe -AudioMixer -PixelStreamingIP=localhost -PixelStreamingPort=8888 -WinX=0 -WinY=0 -ResX=1920 -ResY=1080 -Windowed -RenderOffScreen -ForceRes
Notable app arguments to elaborate on for your understanding (see Unreal docs for others):
-ForceRes: It is important to use this argument to force the Azure VM's display adapter to use the specified resolution (i.e., ResX/ResY).
-RenderOffScreen: This renders the app in the background of the VM, so it won't be seen if RDP'ing into the box. This ensures that a window can't be minimized and stop streaming back to the user.
-Windowed: If this flag isn't used, the resolution parameters will be ignored (i.e., ResX/ResY).
-PixelStreamingPort: This needs to be the same port specified in the Signaling Server, which is the port on the VM used to communicate with the 3D Unreal app over web sockets.
Microsoft has added the ability to autoscale the 3D stream instances up and down. This is done by new logic in the Matchmaker, which evaluates a desired scaling policy and then scales the Virtual Machine Scale Set compute accordingly. This requires that the Matchmaker VM has a System Assigned Managed Service Identity (MSI) with permissions to scale the assigned VMSS resource, which is already set up for you in the Terraform deployment. This eliminates the need to pass special credentials such as a Service Principal to the Matchmaker. The MSI is given Contributor access to the region's Resource Group that was created in the deployment; please adjust as needed per your security requirements.
Here are the key parameters in the Matchmaker config.json that configure autoscaling for the Signaling Server and 3D app (VMSS nodes). Important: Be sure to check any config changes back into your forked repo, as the Terraform deployment pulls from GitHub on your deployment and not from your local resources.
instanceCountBuffer : Min number of available streams before triggering a scale up (0 will ignore this). For instance, if you set 5, a scale up is only triggered when 4 or fewer streams are available.
percentBuffer : % of available streams before triggering a scale up (0 will ignore this). For instance, if you have 25 it will trigger a scale up if less than 25% of total connected Signaling Servers are available to stream.
idleMinutes : How many minutes of no new scale operations before considering a scale down (e.g., scale down after hours)
connectionIdleRatio : % of active streams to total instances that we want to trigger a scale down after idleMinutes passes.
minIdleInstanceCount : The number of VMSS nodes we want during an idle period (e.g., never go below 10 nodes)
maxInstanceScaleCount : The max number of VMSS nodes to scale out to (e.g., never scale above 250 VMs)
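To show how these parameters could interact, here is a minimal sketch of a scaling decision. It is not the Matchmaker's actual implementation: the function name `evaluateScaling`, the shape of the `state` object, and the exact comparisons (strict "less than", and treating minIdleInstanceCount as a floor on total instances) are assumptions made for illustration.

```javascript
// Illustrative sketch only; the real Matchmaker logic may differ.
// "state" is a hypothetical snapshot: total instances, how many are
// available (not streaming), and minutes since the last scale-up.
function evaluateScaling(config, state) {
  const { totalInstances, availableInstances, minutesSinceLastScaleUp } = state;
  const activeInstances = totalInstances - availableInstances;
  const availablePercent = totalInstances > 0 ? (availableInstances / totalInstances) * 100 : 0;
  const activePercent = totalInstances > 0 ? (activeInstances / totalInstances) * 100 : 0;

  // A value of 0 disables the corresponding buffer check.
  const belowCountBuffer =
    config.instanceCountBuffer > 0 && availableInstances < config.instanceCountBuffer;
  const belowPercentBuffer =
    config.percentBuffer > 0 && availablePercent < config.percentBuffer;

  // Scale up when either buffer is exhausted, but never past the cap.
  if ((belowCountBuffer || belowPercentBuffer) && totalInstances < config.maxInstanceScaleCount) {
    return 'scale-up';
  }

  // Consider scaling down only after a quiet period with low utilization.
  const idleLongEnough = minutesSinceLastScaleUp >= config.idleMinutes;
  if (
    idleLongEnough &&
    activePercent < config.connectionIdleRatio &&
    totalInstances > config.minIdleInstanceCount
  ) {
    return 'scale-down';
  }
  return 'none';
}

// Default values from the Matchmaker config listing.
const defaults = {
  instanceCountBuffer: 5,
  percentBuffer: 25,
  idleMinutes: 60,
  connectionIdleRatio: 25,
  minIdleInstanceCount: 0,
  maxInstanceScaleCount: 500,
};

console.log(evaluateScaling(defaults, { totalInstances: 10, availableInstances: 2, minutesSinceLastScaleUp: 5 }));
console.log(evaluateScaling(defaults, { totalInstances: 20, availableInstances: 18, minutesSinceLastScaleUp: 90 }));
```

With the defaults, the first call has only 2 of 10 streams available (below the buffer of 5), and the second has been idle for 90 minutes with 10% of instances active. A sketch like this can be useful for trying out buffer values before checking them into your fork.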
When Unreal Pixel Streaming is packaged from Unreal Engine, the solution contains a \Engine\Source\Programs\PixelStreaming\WebServers\SignallingWebServer\player.htm file to customize the experience, along with the ability to customize JavaScript functions to send custom events between the browser and the 3D Unreal application. Please see Epic's robust documentation on how to make these extra customizations.
This section walks through all the steps necessary to deploy this solution in Azure. Currently the deployment expects a Windows OS as it references powershell.exe directly, though a simple symlink of pwsh to powershell.exe on Linux apparently works (will be added in a future release). Important: Be sure to first follow the guidance in the Configurations section to set up the Git repo location.
To deploy the solution, use the steps here:
git clone --depth 1 https://github.com/Azure/Unreal-Pixel-Streaming.git
az login
az account set --subscription "SUBSCRIPTION_NAME_HERE"
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope LocalMachine
Delete the terraform.tfstate file in the iac\ folder if one exists from a previous deployment.
terraform init
terraform validate
terraform apply -var 'git-pat=PUT_GIT_PAT_HERE' --auto-approve
This process can take 15-30 minutes to deploy, depending on the resources deployed (i.e., UE4 app size, regions chosen, etc.). The PowerShell window should finish with a successfully completed message. The deployment creates 2 resource groups in Azure:
<random_prefix>-global-unreal-rg: This stores all global resources such as the Traffic Manager, Key Vault and Application Insights.
<random_prefix>-<region>-unreal-rg: This stores the Virtual Machine Scale Set (VMSS) for the GPU nodes that have the 3D app and Signaling Server, the Matchmaker VM and Virtual Network resources.
Testing the Deployment: Open up a web browser and paste in the DNS name from the Traffic Manager in the global Resource Group (e.g., http://<random_prefix>.trafficmanager.net) to be redirected to an available stream. The DNS name can be found under "DNS name: <link>" in the Overview page of the Traffic Manager resource in the Azure Portal. If you've deployed to multiple regions, you will be redirected to the closest Azure region.
After the deployment, Terraform runs processes on the following solution components in each region upon startup of each VM:
startMMS.ps1
Node.exe
StartMMS
startVMSS.ps1
Node.exe
<PixelStreamingApp>.exe
StartVMSS
The easiest way to redeploy the solution is to do the following for each piece:
Delete the terraform.tfstate file in the iac\ folder.
terraform init
terraform validate
terraform apply -var 'git-pat=PUT_GIT_PAT_HERE' --auto-approve
If we need to shut down the solution and start it up later, see below for the process. This is just shutting down the compute for the Matchmaker and the Signaling Servers, which are the costlier resources (especially the SS GPU VMs) vs. deleting all the resources and requiring a time-consuming redeployment.
Matchmakers
Signaling Servers / 3D app
Matchmakers
StartMMS setup on Windows.
Signaling Servers / 3D app
StartVMSS setup on Windows.
Currently, automated Azure dashboards aren't built when deploying the solution; however, outside of regular host metrics like CPU/Memory, some key metrics will be important to monitor in Azure Monitor/Application Insights, such as:
SSPlayerConnected – The key metric for knowing when a user has connected (use Count)
SSPlayerDisconnected – When a user disconnects from the Signaling Server (use Count)
AvailableConnections – The number of available Signaling Servers not being used (use Avg)
TotalConnectedClients – The number of Signaling Servers connected to the Matchmaker (use Avg)
TotalInstances – The total number of VMSS instances
PercentUtilized – The percentage of Signaling Servers (streams) in use (use Avg)
MatchmakerErrors – The number of Matchmaker errors (use Count)
View a tutorial on creating a dashboard in Azure Monitor here.
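As a rough illustration of how these metrics relate, the sketch below derives a utilization percentage from TotalConnectedClients and AvailableConnections. The formula is an assumption based on the metric descriptions above, not a confirmed description of how the Matchmaker computes PercentUtilized, and the function name is hypothetical.

```javascript
// Assumed relationship, inferred from the metric descriptions:
// utilization = (connected Signaling Servers - available ones) / connected.
function percentUtilized(totalConnectedClients, availableConnections) {
  if (totalConnectedClients <= 0) return 0; // nothing connected yet
  const inUse = totalConnectedClients - availableConnections;
  return (inUse / totalConnectedClients) * 100;
}

console.log(percentUtilized(20, 5)); // 75
```

A value that stays near 100 suggests the scale-up buffers are too small for the incoming load, while a value near 0 for long periods suggests the scale-down settings could be more aggressive.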
In supporting the deployed solution, it is recommended to do a few key things:
Below are the key files in the Terraform setup to understand when altering the code and tweaking the parameters.
\iac is the root of all infrastructure for the solution.
\iac\region is the folder with the files to deploy a region.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.
© 2021, Microsoft Corporation. All rights reserved