Epinova / epinova-dxp-deployment

DXP deployment with Azure DevOps release tasks
MIT License

StackOverflowException in "Deploy nuget package (Optimizely DXP)" #231

Closed caleb-brilliance closed 1 year ago

caleb-brilliance commented 2 years ago

As the title implies, we're running into an error while trying to deploy and the log isn't particularly helpful:

2022-09-01T20:24:31.3512717Z PackageLocation: https://xxxxxxxxxxxx.blob.core.windows.net/deploymentpackages?xxxxxxxxxxxxxxxxxxxx
2022-09-01T20:24:31.3513540Z
2022-09-01T20:24:31.3513699Z
2022-09-01T20:24:31.3613508Z Loaded cms package: xx.cms.app.436.nupkg
2022-09-01T20:24:31.3615417Z
2022-09-01T20:24:31.3615779Z
2022-09-01T20:24:31.3806034Z cms package 'xx.cms.app.436.nupkg' start upload...
2022-09-01T20:24:31.3808929Z
2022-09-01T20:24:31.3809275Z
2022-09-01T20:24:31.4885996Z ##[error]
2022-09-01T20:24:31.4890159Z ##[error]
2022-09-01T20:24:31.4896128Z ##[error]Process is terminated due to StackOverflowException.
2022-09-01T20:24:31.4899849Z ##[error]Process is terminated due to StackOverflowException.
2022-09-01T20:24:34.3409285Z Script finished
2022-09-01T20:24:34.3478366Z ##[section]Finishing: Deploy NuGet Package to Integration

Looking for help either in the form of suggestions for what may be wrong or if there's any way to increase logging so we can potentially debug better on our own.
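The timestamps in the excerpt above already give one small clue: how quickly the process dies after the upload starts. The following is only a sketch of that arithmetic in Python; the helper name `parse_ts` is made up for illustration, and the two timestamps are copied from the log lines above.

```python
from datetime import datetime


def parse_ts(stamp: str) -> datetime:
    """Parse a log timestamp such as 2022-09-01T20:24:31.3806034Z."""
    # The log carries 7 fractional digits; datetime supports 6, so drop the rest.
    date_part, fraction = stamp.rstrip("Z").split(".")
    return datetime.strptime(f"{date_part}.{fraction[:6]}", "%Y-%m-%dT%H:%M:%S.%f")


# Timestamps taken from the "start upload..." line and the first ##[error] line.
upload_started = parse_ts("2022-09-01T20:24:31.3806034Z")
first_error = parse_ts("2022-09-01T20:24:31.4885996Z")

elapsed_ms = (first_error - upload_started).total_seconds() * 1000
print(f"{elapsed_ms:.0f} ms between 'start upload...' and the first error")
```

With these values the gap comes out at roughly 108 ms, which suggests the process terminates almost immediately after the upload step begins rather than partway through a long transfer.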

ovelartelius commented 2 years ago

Hi @caleb-brilliance. Hmm, I've never seen that error message before. Is what you posted everything from the log? I'm missing the information at the beginning about the version etc. Regards, Ove

caleb-brilliance commented 2 years ago

@ovelartelius Here's the rest of the log:

2022-09-01T20:24:27.1041275Z ##[section]Starting: Deploy NuGet Package to Integration
2022-09-01T20:24:27.1268320Z ==============================================================================
2022-09-01T20:24:27.1269209Z Task : Deploy nuget package (Optimizely DXP)
2022-09-01T20:24:27.1269993Z Description : Start a deploy of a nuget package to target environment for your DXP project. (Optimizely DXP, former Episerver DXC)
2022-09-01T20:24:27.1270453Z Version : 2.2.20
2022-09-01T20:24:27.1270730Z Author : Ove Lartelius
2022-09-01T20:24:27.1271214Z Help : https://github.com/Epinova/epinova-dxp-deployment/blob/master/documentation/DeployNugetPackage.md
2022-09-01T20:24:27.1271726Z ==============================================================================
2022-09-01T20:24:27.6432638Z Using executable 'powershell.exe'
2022-09-01T20:24:27.6437577Z powershell.exe C:\azagent\A2_work_tasks\DxpDeployNuGetPackage_2bc993e5-c27c-4a24-aeaf-0fc403debc8d\2.2.20\DeployNuGetPackage.ps1 -ClientKey xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx -ClientSecret * -ProjectId xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx -TargetEnvironment Integration -SourceApp cms -DirectDeploy true -UseMaintenancePage false -DropPath C:\azagent\A2_work\r2\a\drop -Timeout 1800
2022-09-01T20:24:28.5402774Z Inputs:
2022-09-01T20:24:28.5435847Z ClientKey: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
2022-09-01T20:24:28.5445730Z ClientSecret: ** (it is a secret...)
2022-09-01T20:24:28.5459287Z ProjectId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
2022-09-01T20:24:28.5468630Z TargetEnvironment: Integration
2022-09-01T20:24:28.5481354Z SourceApp: cms
2022-09-01T20:24:28.5504588Z DirectDeploy: True
2022-09-01T20:24:28.5523498Z Warm-up URL:
2022-09-01T20:24:28.5531718Z UseMaintenancePage: False
2022-09-01T20:24:28.5542412Z DropPath: C:\azagent\A2_work\r2\a\drop
2022-09-01T20:24:28.5562985Z Timeout: 1800
2022-09-01T20:24:28.5571357Z ZeroDowntimeMode:
2022-09-01T20:24:29.1956498Z Added C:\azagent\A2_work_tasks\DxpDeployNuGetPackage_xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx\2.2.20\ps_modules to env:PSModulePath
2022-09-01T20:24:29.3667266Z EpiCloud [@{Version=1.1.0}]
2022-09-01T20:24:29.3793301Z Name                      Value
2022-09-01T20:24:29.3797629Z ----                      -----
2022-09-01T20:24:29.3800342Z PSVersion                 5.1.17763.1490
2022-09-01T20:24:29.3803518Z PSEdition                 Desktop
2022-09-01T20:24:29.3806996Z PSCompatibleVersions      {1.0, 2.0, 3.0, 4.0...}
2022-09-01T20:24:29.3811611Z BuildVersion              10.0.17763.1490
2022-09-01T20:24:29.3812864Z CLRVersion                4.0.30319.42000
2022-09-01T20:24:29.3815555Z WSManStackVersion         3.0
2022-09-01T20:24:29.3818475Z PSRemotingProtocolVersion 2.3
2022-09-01T20:24:29.3821943Z SerializationVersion      1.1.0.1
2022-09-01T20:24:30.5585632Z ProjectId : xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
2022-09-01T20:24:30.5587372Z ClientKey : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
2022-09-01T20:24:30.5587965Z AuthenticationVerified : True


ovelartelius commented 2 years ago

Hi @caleb-brilliance. I can see that it executes the latest version of the task, but for some reason it uses EpiCloud v1.1. It should use v1.2... but that should, on the other hand, not cause this error message. If I'm not wrong, you are using a Windows agent. Is it the latest? Or are you using an old Windows agent version like 2016 or older?
Are you using any other tasks that are not cross-platform compatible? If not, you could try changing the agent to ubuntu-latest. It will execute much faster and may solve your problem. I will now try to run some deploys with the Windows agent and see if I can get the same error.
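The EpiCloud version mentioned above comes from the `EpiCloud [@{Version=1.1.0}]` line in the log. As a rough sketch only, this is how that check could be automated in Python when scanning saved logs; the function name `epicloud_version` is hypothetical, and treating "v1.2" as a minimum of 1.2.0 is an assumption based on the comment above.

```python
import re

# Matches the module line as it appears in the posted log, e.g.
# "EpiCloud [@{Version=1.1.0}]".
VERSION_LINE = re.compile(r"EpiCloud \[@\{Version=(\d+)\.(\d+)\.(\d+)\}\]")


def epicloud_version(log_text: str):
    """Return the logged EpiCloud version as a tuple of ints, or None if absent."""
    match = VERSION_LINE.search(log_text)
    return tuple(int(part) for part in match.groups()) if match else None


log_line = "2022-09-01T20:24:29.3667266Z EpiCloud [@{Version=1.1.0}]"
version = epicloud_version(log_line)

# Assumption: "It should use v1.2" is read here as "at least 1.2.0".
EXPECTED_MINIMUM = (1, 2, 0)
print(version, "older than expected:", version < EXPECTED_MINIMUM)
```

Comparing tuples of integers avoids the string-comparison trap where "1.10.0" would sort before "1.2.0".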

caleb-brilliance commented 2 years ago

You're correct that we're using a Windows agent, currently v2.193.0, which appears to be about a year old; v2.210.0 would be the latest non-prerelease. Unfortunately we can't change to the ubuntu agent at this time without numerous changes elsewhere. For more context in case it helps, we're using this deploy step in the Releases section in DevOps as opposed to the Pipelines section. We also have the same release steps in use in another DevOps project, though that one isn't yet upgraded to a .NET Core version of Epi, whereas the project at hand is currently in the process of that upgrade.

ovelartelius commented 2 years ago

Hi @caleb-brilliance. Which "Agent specification" are you using? [image]

caleb-brilliance commented 2 years ago

We're using "Run on deployment group" instead of "Run on agent" so the settings are different than in your picture. As stated before though, our windows agent is currently on v2.193.0. It's on our client's server but we might be able to look into updating it if you think that would resolve the issue.

On another note: Yesterday one of our releases in this same pipeline surprisingly managed to succeed without the StackOverflowException. Unfortunately, the next release attempt was back to failing with the same error. No difference in configuration between the previous few failures, the success, and the following failures though.

ovelartelius commented 2 years ago

Hmmm. Which Windows Server version(s) are used when you use "Run on deployment group"? Is there more than one server running in the group? Are you running any special tasks that must run on a deployment group server agent? Or can you try "Run on agent", using one of the agents in Azure DevOps?

I tried to run on an agent via a customer's deployment group and I could not get the error:
...
022-09-09T15:22:18.8539021Z Loaded cms package: Website.cms.app.20220627.1.nupkg
2022-09-09T15:22:18.8539312Z
2022-09-09T15:22:18.8539927Z
2022-09-09T15:22:18.8676838Z cms package 'Website.cms.app.20220627.1.nupkg' start upload...
2022-09-09T15:22:18.8677179Z
2022-09-09T15:22:18.8677698Z
2022-09-09T15:22:21.0537518Z cms package 'Website.cms.app.20220627.1.nupkg' is uploaded.
...

The agent was v2.174.2 on Windows Server 2022, so it was older than yours.
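To make the comparison above concrete, here is a small Python sketch that orders the two agent versions named in this thread. The helper `parse_agent_version` is invented for illustration; the version strings are the ones quoted in the comments.

```python
def parse_agent_version(text: str) -> tuple:
    """Turn a version string like 'v2.193.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


failing_agent = parse_agent_version("v2.193.0")  # reporter's deployment group agent
working_agent = parse_agent_version("v2.174.2")  # agent used in the test run above

# The agent that uploaded successfully is the older of the two.
print("working agent is older:", working_agent < failing_agent)
```

Since the older agent completed the upload, the agent version on its own does not look like a sufficient explanation for the exception.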

caleb-brilliance commented 2 years ago

We're using Windows Server 2019 Standard. I managed to recreate our steps using an azure agent as you suggested and that seems to be working for now, but it does change some things. I'll have to check with the client if there's any reason to be opposed to keeping it switched over. That said, even so, if you manage to recreate the issue for Windows and figure it out, I'd still like to know. The fact that it succeeded once yesterday means we must have it configured right so it's a matter of determining what's causing the StackOverflowException most of the time. Thanks for the help thus far.

ovelartelius commented 2 years ago

Hi @caleb-brilliance. How big is the NuGet package file? When an agent fails and gets this StackOverflowException, can you see in the PaaS portal that a deploy has started? If so, do you have any logging from the PaaS portal that could shed some more light?

caleb-brilliance commented 2 years ago

The most recent package was ~160 MB. When the StackOverflowException occurs, it is before anything gets started in the PaaS portal, so unfortunately there's no additional logging there either.

ovelartelius commented 2 years ago

Thanks @caleb-brilliance. OK, I have added support for running verbose. [image] Is it possible for you to rerun the deploy task that gets this StackOverflowException with RunVerbose = true? It would be interesting to see what the EpiCloud logic returns in its verbose logging.

caleb-brilliance commented 2 years ago

@ovelartelius Thanks for that. The ubuntu agent version is working for now and we're a little busy so I can't get to it immediately, but I'll definitely let you know if we switch back or if I get time to test it again as an aside.