HodorNV / ALOps


Concurrency issues when running multiple Self-Hosted Agents in parallel #665

Closed: b-zijlstra closed this issue 12 months ago

b-zijlstra commented 1 year ago

Scenario: I have an Agent Pool containing 6 Self-Hosted Agents. All 6 agents are on the same virtual machine. I keep running into issues when the agents run ALOps concurrently.

The issue: During concurrent runs with the ALOpsAppCompiler@2 task, I get output like this:

...
*** Platform: c:\***artifacts.cache\onprem\21.1.48363.48638\platform
*** Localisation: c:\***artifacts.cache\onprem\21.1.48363.48638\de
*** VSIX Path: C:\***artifacts.cache\onprem\21.1.48363.48638\platform\ModernDev\program files\Microsoft Dynamics NAV\210\AL Development Environment\ALLanguage.vsix
*** ALC Path: C:\***artifacts.cache\onprem\21.1.48363.48638\VSIX\extension\bin\win32\alc.exe
*** Nav.CodeAnalysis Path: C:\***artifacts.cache\onprem\21.1.48363.48638\VSIX\extension\bin\Microsoft.Dynamics.Nav.CodeAnalysis.dll
*** Import Module
*** Get Localised Apps [130]
*** Get Platform Apps [130]
...
##[warning]Exception calling ".ctor" with "2" argument(s): "The process cannot access the file 'C:\***artifacts.cache\onprem\21.1.48363.48638\platform\Applications\BaseApp\Source\Microsoft_Base Application.app' because it is being used by another process.")
...
##[warning]Exception calling ".ctor" with "2" argument(s): "The process cannot access the file 'C:\***artifacts.cache\onprem\21.1.48363.48638\platform\Applications\BaseApp\Source\Microsoft_Czech language (Czechia).app' because it is being used by another process.")
...
##[warning]Exception calling ".ctor" with "2" argument(s): "The process cannot access the file 'C:\***artifacts.cache\onprem\21.1.48363.48638\platform\Applications\Intrastat\Source\Microsoft_Intrastat Core.app' because it is being used by another process.")
...
##[warning]Exception calling ".ctor" with "2" argument(s): "The process cannot access the file 'C:\***artifacts.cache\onprem\21.1.48363.48638\platform\Applications\microsoftuniversalprint\source\Microsoft_Universal Print Integration.app' because it is being used by another process.")
...
*** Retained Platform Apps [7]
*** Loaded [137] Apps
...

Further down the line the compilation fails, because I am likely using one of these apps as a dependency (e.g. Base App).

Expected behavior: Surely I am not the only one running pipelines in parallel. Am I pushing my luck by placing all my agents on the same VM? I could create 6 separate VMs that each host their own agent, but I'd prefer to maintain only a single VM.

Additional context: Earlier I was having issues with Mutex exceptions during the Download-Artifacts step (https://github.com/microsoft/navcontainerhelper/issues/3164). I resolved that issue by setting all agents to Log On As 'Local System'. Is there any other configuration change I can make to avoid concurrency issues? Can I somehow set the bcartifacts.cache folder to use a different location depending on the $(Agent.Id)?
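On the per-agent cache question, here is a minimal sketch of deriving a separate folder per agent. It assumes the agent exposes Agent.Id to scripts as the AGENT_ID environment variable, which is how Azure Pipelines maps predefined variables. The function name and the base path are hypothetical, and nothing in this thread confirms that ALOps or BcContainerHelper can be pointed at the resulting folder.

```python
import os
from pathlib import PureWindowsPath


def per_agent_cache_dir(base_dir, env=None):
    """Return a cache folder unique to the current agent.

    Hypothetical helper: falls back to a shared "default" folder when
    AGENT_ID is not set (for example, when run outside a pipeline).
    """
    env = os.environ if env is None else env
    agent_id = env.get("AGENT_ID", "").strip() or "default"
    # PureWindowsPath keeps the separators Windows-style on any host OS.
    return str(PureWindowsPath(base_dir) / f"agent-{agent_id}")
```

The trade-off is that every agent would download and store its own copy of the artifacts, so isolation costs disk space and download time.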

waldo1001 commented 1 year ago

Here is your issue:

"I have an Agent Pool containing 6 Self-Hosted Agents. All 6 agents are on the same virtual machine"

You can surely run pipelines in parallel - but you're indeed pushing it by placing all agents on one VM. It's bad practice. Agents will use each other's resources. This causes challenges on:

Since it's virtual anyway, just split the agents across separate virtual machines. Is that possible for you?

That said - we checked, and we think we are able to solve this specific error, as it's not necessary to lock the files. We'll do that, but we also expect you'll run into other issues down the road.
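For readers hitting the same "being used by another process" error in their own scripts, one generic mitigation is to retry the read after a short wait. This is a sketch under stated assumptions, not the fix that ALOps shipped (the thread does not describe it). The function name and the injectable `opener` parameter are hypothetical. On Windows a sharing violation surfaces in Python as PermissionError.

```python
import time


def read_with_retry(path, attempts=5, delay=0.5, opener=open):
    """Read a file as bytes, retrying if another process holds it.

    `opener` is injectable so the retry logic can be exercised
    without a real locked file.
    """
    for attempt in range(1, attempts + 1):
        try:
            with opener(path, "rb") as handle:
                return handle.read()
        except PermissionError:
            if attempt == attempts:
                raise  # give up after the last attempt
            time.sleep(delay * attempt)  # linear backoff between attempts
```

Retrying only hides short-lived contention. If the other process holds the file for a long time, the call still fails after the final attempt.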

b-zijlstra commented 1 year ago

Thank you, understood. I've created separate VMs with one agent each and that works well. It creates a little extra overhead, but that's manageable for us. I'd argue that multiple agents on one VM should run fine as long as you can contain all file (write) operations within the agent work folders. Obviously that's not easy when working with Docker containers.

To prevent these issues down the road we'll keep the VMs separate.

waldo1001 commented 1 year ago

We try to speed up pipelines with caching, so definitely not everything is isolated in its own working folder. Artifact caching is the best example 🤷‍♂️
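When a cache is shared between agents like this, one common way to keep writers from colliding is a lock file that is created atomically. The sketch below illustrates that general technique only. It is not how ALOps or BcContainerHelper guard their cache, and `cache_lock` and its parameters are hypothetical names.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def cache_lock(lock_path, timeout=60.0, poll=0.2):
    """Hold an exclusive lock file while touching a shared cache.

    os.O_CREAT | os.O_EXCL makes creation atomic: exactly one process
    succeeds, the others poll until the file disappears or they time out.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)  # release the lock for the next process
```

A caveat of this approach is that a process that crashes while holding the lock leaves the file behind, so a real setup would also need some form of stale-lock cleanup.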

waldo1001 commented 1 year ago

So - looking into the actual error - that happens to be one we can fix. So we'll do that, although we still don't recommend running multiple agents on one VM.

AdminHodor commented 12 months ago

Dear @b-zijlstra ,

Please try our latest release, v1.459. It includes a fix that should improve concurrency.

Kind regards

b-zijlstra commented 12 months ago

Confirmed. I was able to run 6 ALOps App Compiler tasks in parallel on a single VM without issues (all using the same BC artifacts).

waldo1001 commented 12 months ago

For now ;-).

You can report issues about that - we'll see if we can do anything about it - but in general .. there might be issues 🤷‍♂️.