Closed tristanbarcelon closed 5 years ago
@KalyanChanumolu-MSFT, sorry for the duplicate post 33047. I must have pressed the Enter key twice. Is there a way to ensure the CosmosDB emulator is kept running once started?
Hi @David-Noble-at-work , I found a previously closed issue 7990 that mirrors very closely what I am trying to do. Would starting the emulator using Start-CosmosDBEmulator cmdlet from a scheduled task after boot keep the emulator running until a reboot and allow multiple builds to start accessing the emulator using different database names concurrently?
Would invoking Start-CosmosDBEmulator from a powershell task in a build definition somehow cause the emulator to stop running at anytime during the build or cause the emulator to not be available for the next build job on the same server?
I've also been checking whether a valid certificate already exists, using Get-CosmosDbEmulatorCertificate. If not, I invoke the New-CosmosDbEmulatorCertificate cmdlet before calling Start-CosmosDBEmulator with the following parameters set: DefaultPartitionCount of 100, a custom DataPath, NoUi, NoTelemetry, and FailOnSslCertificateNameMismatch.
While invoking the cmdlet this way from PowerShell worked for me last week, it doesn't seem to be working this week even though the certificates are valid. Start-CosmosDBEmulator just times out when the emulator fails to reach Running status after 4 minutes.
@tristanbarcelon Thanks for the feedback. We are actively investigating and will get back to you soon.
@KalyanChanumolu-MSFT , have you found anything so far? We are still getting intermittent connectivity issues from time to time when attempting to run unit tests that rely on the emulator.
@tristanbarcelon I tried to mimic your set up and reproduce the issue. I started an emulator instance and started parallel console apps accessing multiple databases (like multiple build jobs executing unit tests) and depending on the intensity of the operation, the emulator does slow down. The emulator doesn't have the same level of performance as the actual cosmos instance itself.
I believe you should try the Cosmos DB Emulator Build task so that you get multiple isolated emulator instances.
Is it feasible to spin up an actual instance of CosmosDB on Azure, run your unit tests and destroy it after the build completes?
Here are the findings from the investigation to date:
I’ve done some initial comparisons between successful and failing builds. Here are some things I noticed.
In successful instances: • It takes about 13 mins 30 secs from the start of the container pull until the container is running. The container usually takes about 3 minutes to switch from the “started” to the “running” state.
In the failure instances: • It takes about 15 to 16 mins and 40 seconds from the start of the container pull until the container fails to run. The time between the container reporting “started” and the task failing is usually around 4 mins and 20 secs.
This means that the time taken just to pull the container is pretty similar between the success and failure cases, at around 10 to 11 mins. The time taken to switch the container from “started” to “running” is what fluctuates, which makes me suspect that something is wrong with running containers or with the container itself.
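The subtraction behind that comparison can be checked with a short PowerShell sketch. It assumes the failure range above reads as 15:00 to 16:40 in total; the variable names are illustrative only.

```powershell
# Durations taken from the comparison above.
# Assumption: the failure range is read as 15:00 to 16:40.
$successTotal   = [TimeSpan]::new(0, 13, 30)   # pull start -> container running
$successStartup = [TimeSpan]::new(0, 3, 0)     # "started" -> "running"

$failureTotalMin = [TimeSpan]::new(0, 15, 0)   # pull start -> task failure (low end)
$failureTotalMax = [TimeSpan]::new(0, 16, 40)  # pull start -> task failure (high end)
$failureStartup  = [TimeSpan]::new(0, 4, 20)   # "started" -> task failure

# Pull time = total time minus the time spent after the container reported "started".
$successPull    = $successTotal - $successStartup
$failurePullMin = $failureTotalMin - $failureStartup
$failurePullMax = $failureTotalMax - $failureStartup

'{0:mm\:ss} pull (success)' -f $successPull
'{0:mm\:ss} to {1:mm\:ss} pull (failure)' -f $failurePullMin, $failurePullMax
```

Under this reading the pull takes 10:30 in the success case and 10:40 to 12:20 in the failure case, so most of the gap between the two totals comes from the phase after “started”.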
So, what might be the difference if you run the Cosmos Emulator locally (installed) versus having the Cosmos Emulator as part of the container and run that locally as part of the pipeline?
Thanks Mike for pitching in. What I believe is happening here is there are 12 - 16 builds trying to access the same Emulator instance with a database of its own and executing different tests. I doubt if the Emulator is built to service so many concurrent requests across several databases. With the container however, there is flexibility to spin up more than one emulator instance vs having just one on a VM.
Thanks for your investigation. Here's the customized "Start-MyCosmosDBEmulator" function which I use in an Azure Devops task group right before the unit test task. I wrote this instead of using Start-CosmosDBEmulator because I wanted it to first find an installed instance of CosmosDB emulator, search for valid certs, etc. so when it does fail, it will fail with an intelligible error. Also, installing CosmosDB emulator does not place its PS modules in the PSModulePath.
```powershell
function Start-MyCosmosDBEmulator
{
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $false, HelpMessage = 'Port to listen on')]
        [ValidateRange(1024, 49151)]
        [uint16] $Port = 8081,

        [Parameter(Mandatory = $false, HelpMessage = 'Database persistence path')]
        [string] $DataPath = "$($ENV:LOCALAPPDATA)\CosmosDBEmulator",

        [Parameter(Mandatory = $false, HelpMessage = 'Default partition count per container')]
        [ValidateRange(25, 200)]
        [uint16] $DefaultPartitionCount = 150
    )
    try
    {
        # Get-InstalledWindowsProgram is not a built-in cmdlet; it is a helper defined elsewhere and not shown here.
        $CosmosDBInstallation = Get-InstalledWindowsProgram -Name 'Azure Cosmos DB Emulator' -Verbose
        if (($null -ne $CosmosDBInstallation) -and ($CosmosDBInstallation.Count -gt 0))
        {
            if ([string]::IsNullOrEmpty($CosmosDBInstallation.InstallLocation))
            {
                throw 'Unable to find Cosmos DB InstallLocation property in the registry'
            }
            if (Test-Path -Path $CosmosDBInstallation.InstallLocation -PathType Container)
            {
                [string] $CosmosDBEmulatorExe = Get-ChildItem -Path $CosmosDBInstallation.InstallLocation -Filter 'CosmosDB.Emulator.exe' | Select-Object -ExpandProperty FullName
                if ([string]::IsNullOrEmpty($CosmosDBEmulatorExe))
                {
                    # Message now names the same file the filter above searches for.
                    throw "Unable to find CosmosDB.Emulator.exe in path $($CosmosDBInstallation.InstallLocation)"
                }
                if (-not(Test-Path -Path $DataPath -PathType Container))
                {
                    New-Item -Path $DataPath -ItemType Directory | Out-Null
                }
                # Emulator certificates that are currently within their validity period
                $LocalCosmosDBCertificates = Get-ChildItem -Path 'Cert:\LocalMachine\My' | Where-Object { @('DocumentDbEmulatorCertificate', 'CosmosEmulatorSecrets') -icontains $_.FriendlyName `
                    -and (( $_.NotBefore -lt [System.DateTime]::Now ) -and ( [System.DateTime]::Now -lt $_.NotAfter )) }
                [string] $EmulatorPSModulePath = Join-Path -Path $CosmosDBInstallation.InstallLocation -ChildPath 'PSModules\Microsoft.Azure.CosmosDB.Emulator'
                if (-not (Test-Path -Path $EmulatorPSModulePath -PathType Container))
                {
                    # Branch 1: no PowerShell module found, so drive the executable directly
                    Write-Warning 'Unable to find PSModules\Microsoft.Azure.CosmosDB.Emulator path. Invoking emulator executable directly'
                    if ($null -eq (Get-Process -Name '*Cosmos*.Emulator' -ErrorAction SilentlyContinue))
                    {
                        if ($null -eq $LocalCosmosDBCertificates)
                        {
                            Start-Process -FilePath $CosmosDBEmulatorExe -ArgumentList @('/NoUI', '/GenCert') -Wait -NoNewWindow
                            # Fixed: the original message referenced the undefined variable $CosmosDBEmulator
                            Write-Verbose "Generated a self-signed SSL certificate using $CosmosDBEmulatorExe /GenCert"
                        }
                        if ($PSBoundParameters.ContainsKey('DataPath') -and ($DataPath -ine "$($ENV:LOCALAPPDATA)\CosmosDBEmulator"))
                        {
                            Start-Process -FilePath $CosmosDBEmulatorExe -ArgumentList @('/NoUI', '/NoExplorer',
                                "/Port=$($Port)", "/DataPath=`"$($DataPath)`"", "/DefaultPartitionCount=$($DefaultPartitionCount)") -Wait -NoNewWindow
                        }
                        else
                        {
                            Start-Process -FilePath $CosmosDBEmulatorExe -ArgumentList @('/NoUI', '/NoExplorer',
                                "/Port=$($Port)", "/DefaultPartitionCount=$($DefaultPartitionCount)") -Wait -NoNewWindow
                        }
                        Write-Verbose "Started Azure Cosmos DB emulator from $CosmosDBEmulatorExe using Port: $($Port) and DataPath: $($DataPath)"
                    }
                    else
                    {
                        Write-Verbose 'There is already an Azure Cosmos DB emulator process running'
                    }
                }
                else
                {
                    # Branch 2: use the PowerShell module that ships with the emulator
                    Import-Module $EmulatorPSModulePath -Verbose:$false
                    if ($null -eq $LocalCosmosDBCertificates)
                    {
                        #Why is New-CosmosDBEmulatorCertificate buggy and always returning this error
                        #New-CosmosDbEmulatorCertificate : Certificate generation failed with exit code At line:1 char:1
                        Write-Verbose 'Generating localhost certificate using New-CosmosDbEmulatorCertificate'
                        New-CosmosDbEmulatorCertificate -ErrorAction SilentlyContinue -Verbose
                    }
                    if (-not ((Get-CosmosDbEmulatorStatus) -ieq 'Running'))
                    {
                        if ($PSBoundParameters.ContainsKey('DataPath') -and ($DataPath -ine "$($ENV:LOCALAPPDATA)\CosmosDBEmulator"))
                        {
                            Start-CosmosDBEmulator -Port $Port -DataPath $DataPath -DefaultPartitionCount $DefaultPartitionCount -NoUi -NoTelemetry -FailOnSslCertificateNameMismatch -Verbose
                        }
                        else
                        {
                            Start-CosmosDBEmulator -Port $Port -DefaultPartitionCount $DefaultPartitionCount -NoUi -NoTelemetry -FailOnSslCertificateNameMismatch -Verbose
                        }
                        Write-Verbose "Started Azure Cosmos DB emulator with PSModule Microsoft.Azure.CosmosDB.Emulator using Port: $($Port) and DataPath: $($DataPath)"
                    }
                    else
                    {
                        Write-Verbose 'Azure Cosmos DB emulator process is already running according to Get-CosmosDbEmulatorStatus'
                    }
                }
            }
            else
            {
                throw "Cosmos DB appears to be installed but InstallLocation $($CosmosDBInstallation.InstallLocation) could not be found"
            }
        }
        else
        {
            throw "Cosmos DB emulator is not installed"
        }
    }
    catch
    {
        throw $_
    }
}
```
I was expecting the emulator to stay up and running after a task group from a build started it, so that it is available for subsequent builds. Based on the build logs I've seen, it appears to "terminate" from time to time and I do not know why. Each build uses its own db name by appending the Agent.Id variable, so as far as I can tell multiple builds are not colliding with each other, and the test fixture is coded to delete the db upon completion.
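A minimal sketch of the per-agent database naming described above. The function name, the base name, and the separator are hypothetical; it assumes the Agent.Id pipeline variable is visible to the script as the AGENT_ID environment variable, which is how Azure Pipelines exposes predefined variables.

```powershell
function Get-BuildDatabaseName
{
    param(
        [Parameter(Mandatory = $true)]
        [string] $BaseName,

        # Assumption: Agent.Id is available to scripts as AGENT_ID.
        [string] $AgentId = $env:AGENT_ID
    )
    if ([string]::IsNullOrWhiteSpace($AgentId))
    {
        throw 'Agent id is not set; cannot build a unique database name'
    }
    # One database per agent, so concurrent builds on the same server do not share data.
    return "$BaseName-$AgentId"
}
```

For example, `Get-BuildDatabaseName -BaseName 'UnitTests' -AgentId 7` returns `UnitTests-7`.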
When I do get build errors, it's more or less along these lines
[xUnit.net 00:04:15.33] System.AggregateException : One or more errors occurred. (One or more errors occurred. (No connection could be made because the target machine actively refused it)) (The following constructor parameters did not have matching fixture data: SettingsFixture fixture)
[xUnit.net 00:04:15.33] ---- System.AggregateException : One or more errors occurred. (No connection could be made because the target machine actively refused it)
as if the emulator failed to start, even though the task group log for Start-MyCosmosDbEmulator shows that it started. Our temporary workaround for now is to requeue the failing build when this type of exception appears and it passes next time around.
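One way to narrow this down before the tests run is to probe the emulator port from the build step itself. The sketch below is an assumption-laden helper, not part of the emulator module: `Test-TcpPort` and `Wait-TcpPort` are hypothetical names, and a successful TCP connect only shows that something is listening on the port, not that the emulator is fully ready to serve requests.

```powershell
function Test-TcpPort
{
    param(
        [string] $HostName = 'localhost',
        [Parameter(Mandatory = $true)] [int] $Port,
        [int] $TimeoutMs = 1000
    )
    $client = New-Object System.Net.Sockets.TcpClient
    try
    {
        $connect = $client.ConnectAsync($HostName, $Port)
        # Wait returns $false on timeout; a refused connection surfaces as an exception.
        return ($connect.Wait($TimeoutMs) -and $client.Connected)
    }
    catch
    {
        return $false
    }
    finally
    {
        $client.Close()
    }
}

function Wait-TcpPort
{
    param(
        [string] $HostName = 'localhost',
        [Parameter(Mandatory = $true)] [int] $Port,
        [int] $TimeoutSeconds = 60,
        [int] $PollIntervalMs = 500
    )
    # Poll until the port accepts a connection or the deadline passes.
    $deadline = [DateTime]::UtcNow.AddSeconds($TimeoutSeconds)
    while ([DateTime]::UtcNow -lt $deadline)
    {
        if (Test-TcpPort -HostName $HostName -Port $Port) { return $true }
        Start-Sleep -Milliseconds $PollIntervalMs
    }
    return $false
}
```

A build step could call `Wait-TcpPort -Port 8081` right before the unit test task and fail early with a clear message when it returns `$false`, which would separate "emulator not listening" from failures inside the tests.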
I have dockerEE running on the same build server and could certainly try the CosmosDB build emulator task. Will this work when running multiple instances of the same container image? Will the port number be different for each running instance of CosmosDB emulator container and how can I obtain the associated port for a specific container instance? Currently the xunit test projects produce an app.config and appsettings.config files during build. The test fixture is coded to look for 2 configuration keys in it and I only patch the databasename. I'm assuming that we'll keep the code as is instead of using runsettings as long as I can patch the endpoint with the right port.
After reading the CosmosDB emulator build task more thoroughly, it looks like I should be able to access the endpoint just by reading the env variable CosmosDbEmulator.Endpoint. I'll give it a shot and see what happens.
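A hedged sketch of reading that endpoint in a PowerShell step. It assumes the task's CosmosDbEmulator.Endpoint variable reaches later steps as the environment variable COSMOSDBEMULATOR_ENDPOINT (Azure Pipelines upper-cases variable names and replaces dots with underscores); the function name and the fallback to https://localhost:8081 are illustrative choices.

```powershell
function Resolve-EmulatorEndpoint
{
    param(
        # Assumption: CosmosDbEmulator.Endpoint is exposed as COSMOSDBEMULATOR_ENDPOINT.
        [string] $FromEnvironment = $env:COSMOSDBEMULATOR_ENDPOINT,
        [string] $Default = 'https://localhost:8081'
    )
    # Fall back to the locally installed emulator when the task did not set the variable.
    $candidate = if ([string]::IsNullOrWhiteSpace($FromEnvironment)) { $Default } else { $FromEnvironment }

    $uri = $null
    if (-not [Uri]::TryCreate($candidate, [UriKind]::Absolute, [ref] $uri))
    {
        throw "Emulator endpoint '$candidate' is not an absolute URI"
    }
    return $uri
}
```

The returned `[Uri]` exposes `.Port` and `.AbsoluteUri`, which is what a config-patching step would need when only the endpoint key in the test configuration changes.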
Hi @Mike-Ubezzi-MSFT and @KalyanChanumolu-MSFT , is there detailed documentation for the CosmosDB emulator build task so I can determine which task parameters I need to change? I think the instructions for the build task assume one Azure Pipelines agent instance per machine/VM. Unfortunately, that is not our situation: we have only a handful of bare metal servers with multiple pipeline agent instances installed per server. The closest info about cosmosdb/docker I can find is here. If we're only using the documentdb endpoint on 8081, it seems to me that I'd also have to change the port parameters here using a build-specific variable to avoid port collisions? How about the container name? Is that parameter used to identify the image to pull or the unique container name when running the image? Would the host directory be created if it does not already exist? If we're only using document db, what should the api value be?
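To illustrate the build-specific port idea, here is a small sketch that derives one gateway port per agent. Everything in it is hypothetical: the function name, the notion of a zero-based agent index per server, and the stride of 10. It only covers the 8081-style endpoint port; the emulator's other ports would need the same treatment to avoid collisions.

```powershell
function Get-AgentEmulatorPort
{
    param(
        # Hypothetical zero-based position of the agent on this server.
        [Parameter(Mandatory = $true)]
        [ValidateRange(0, 1000)]
        [int] $AgentIndex,

        [ValidateRange(1024, 49151)]
        [int] $BasePort = 8081,

        # Gap between consecutive agents' ports (illustrative value).
        [ValidateRange(1, 100)]
        [int] $Stride = 10
    )
    $port = $BasePort + ($AgentIndex * $Stride)
    if ($port -gt 49151)
    {
        throw "Computed port $port is outside the registered port range"
    }
    return $port
}
```

With the defaults, `Get-AgentEmulatorPort -AgentIndex 3` returns `8111`, and agents 0 through 11 on a 12-agent server each get a distinct port.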
@tristanbarcelon Thank you for the additional details and providing the script example. I am providing some specifics to make sure you are aware of this information. The Cosmos DB Emulator doc is loaded with information so I want to point out a couple of things. The Emulator has three means for controlling the service based upon either a local installation or a Docker container.
It is also worthwhile to detail the differences between the Emulator and the production service (link).
You should be able to have a single instance of the Emulator, either a local install or a Docker instance, unless the total number of containers exceeds the limits of a single instance. I do not have any data on the total number of connections currently supported, but if you need to expand to a second instance of the Emulator, the port number is simple to control, as you detailed in a previous comment.
// Connect to the Azure Cosmos Emulator running locally
DocumentClient client = new DocumentClient(
    new Uri("https://localhost:8081"),
    "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==");
The command-line syntax for CosmosDB.Emulator.exe has an option to run on a port other than the default of 8081 (link):
CosmosDB.Emulator.exe /Port=<port>
Although the Emulator has no scaling capabilities, you could scale out with multiple instances based upon the available compute resources of the host this is being run from.
And running from Docker, ports can be controlled with the start command:
md %LOCALAPPDATA%\CosmosDBEmulator\bind-mount
docker run --name azure-cosmosdb-emulator --memory 2GB --mount "type=bind,source=%LOCALAPPDATA%\CosmosDBEmulator\bind-mount,destination=C:\CosmosDB.Emulator\bind-mount" --interactive --tty -p 8081:8081 -p 8900:8900 -p 8901:8901 -p 8902:8902 -p 10250:10250 -p 10251:10251 -p 10252:10252 -p 10253:10253 -p 10254:10254 -p 10255:10255 -p 10256:10256 -p 10350:10350 microsoft/azure-cosmosdb-emulator
How many containers are required at any given time to run all your build tasks, and what is the consistency level across all those collections? The number of containers should determine the number of Emulator instances you need. The consistency level could also be the issue here, but I am unsure what your solution requires.
@tristanbarcelon It's been a while since we heard from you. We will proceed to close this issue now. If there are further questions regarding this matter, please comment and we will gladly continue the discussion.
Hi @KalyanChanumolu-MSFT and @Mike-Ubezzi-MSFT . Can we continue this debug process? I happened to be on vacation so did not see the replies. Based on the Powershell script I provided, it is using the set up step in the block. $EmulatorPSModulePath is assigned via:
[string] $EmulatorPSModulePath = Join-Path -Path $CosmosDBInstallation.InstallLocation -ChildPath 'PSModules\Microsoft.Azure.CosmosDB.Emulator'
and it contains the same value from set up step $env:ProgramFiles\Azure Cosmos DB Emulator\PSModules\Microsoft.Azure.CosmosDB.Emulator
At build time, this is exactly the branch of code which executes since I have a slightly different verbose message if emulator was controlled via built-in PSModule or by Start-Process.
else
{
Import-Module $EmulatorPSModulePath -Verbose:$false
We have 12 build agent instances on this server and, at most, only 2 of them will be concurrently building and accessing the emulator since 2 builds are simultaneously queued upon completion of a pull request.
I just examined recent builds of this repo, which uses the CosmosDB emulator. Out of the roughly 59 builds since June, 16 have failed. A large majority of those failures produce a System.Net.Sockets.SocketException with the message "No connection could be made because the target machine actively refused it" from the unit tests. Sometimes this error is encountered immediately upon running the first unit test, while at other times it occurs further down the list of unit tests. In all of these failing cases, CosmosDB managed to start successfully. Would it be possible to meet via Teams so we can show you what we're seeing during the build or share our unit test fixture code?
As far as container/partitioncount is concerned, I am setting it to 100 and it seems to be within limits defined here. Per documentation, we should have been receiving a ServiceUnavailable exception if tests exceeded the partition count.
Is it possible to use the Azure CosmosDB Emulator task in our on-premise build server with multiple build agent instances? Is the task able to dynamically set the port in $(CosmosDBEmulator.Endpoint) variable or is it always fixed to 8081?
@tristanbarcelon You can use the following argument with the Emulator command line to start the emulator on a different port (link) when running localhost or running on a local network (link).
Option | Description | Command |
---|---|---|
DirectPorts | Specifies the ports to use for direct connectivity. Defaults are 10251,10252,10253,10254. | CosmosDB.Emulator.exe /DirectPorts: |
When using DevOps agent, the Emulator runs on Docker for Windows (the agent downloads the container and runs on Docker) and the following command is passed (link).
md %LOCALAPPDATA%\CosmosDBEmulator\bind-mount
docker run --name azure-cosmosdb-emulator --memory 2GB --mount "type=bind,source=%LOCALAPPDATA%\CosmosDBEmulator\bind-mount,destination=C:\CosmosDB.Emulator\bind-mount" --interactive --tty -p 8081:8081 -p 8900:8900 -p 8901:8901 -p 8902:8902 -p 10250:10250 -p 10251:10251 -p 10252:10252 -p 10253:10253 -p 10254:10254 -p 10255:10255 -p 10256:10256 -p 10350:10350 microsoft/azure-cosmosdb-emulator
I highly suggest you reach out to the Cosmos DB team (AskCosmosDB) and bring this to their attention as both a product request and a limitation of the current service. For technical support, please open a Support Request. If you don't have an Azure Support Plan (link), please send me your Azure Subscription ID (AzCommunity) and I will send you instructions to have this handled through the correct channel.
Please let us know if you require additional assistance with the documentation as this channel is designed to address these types of issues, as well as direct you to the appropriate resource if not.
P.S. You are not the first person having issues with the DevOps build agents. My suggestion is to run the Emulator on a local network.
Inside one of our Azure Devops build definitions, we have the following steps:
We use on-premise bare metal build servers and have installed Azure CosmosDB emulator on it. There are 12-16 instances of Azure Devops build agent running per server and this is by design to minimize the number of software installations we have to maintain.
Expected behavior
Once an instance of the Azure CosmosDB emulator is running, we expect it to stay up across multiple build jobs on the same build server.
Thus, if build 12345 job is queued first and starts the emulator, another build 12346 that is queued should be able to find it in a running state instead of in a stopped state.
Likewise, I expected the build which started the emulator instance to finish with the emulator still running and available for subsequent builds.
Actual behavior
From time to time, the unit test dlls will fail connecting to the local cosmosdb emulator. Sometimes, it will fail near the end of the tests by exhibiting slow operations. The build definition is not checking the emulator status in step 3. Instead, it is just calling Start-CosmosDBEmulator.
Is there another way to run cosmosdb in a persistent way similar to a windows service? There is an Azure Devops task for running CosmosDB emulator as a container and we may try that but the question remains the same. Can we keep the container running consistently and have it be accessible by multiple builds/unit tests concurrently? I prefer not to use that task because I'd need to bind a host folder and install certificates each time the container starts.