d365collaborative / d365fo.tools

Tools used for Dynamics 365 Finance and Operations
MIT License
246 stars 101 forks source link

Start-D365Environment does not throw exception on failed start of service(s) #606

Closed VincentVerweij closed 2 years ago

VincentVerweij commented 2 years ago

Hi,

I was testing something out here. I did add the command Enable-D365Exception. Then I used Start-D365Environment -ShowOriginalProgress I received the following output:

WARNING: Waiting for service 'Microsoft Dynamics 365 Unified Operations: Batch Management Service (DynamicsAxBatch)' to start...
WARNING: Waiting for service 'Microsoft Dynamics 365 Unified Operations: Batch Management Service (DynamicsAxBatch)' to start...
WARNING: Waiting for service 'Microsoft Dynamics 365 Unified Operations: Batch Management Service (DynamicsAxBatch)' to start...
WARNING: Waiting for service 'Microsoft Dynamics 365 Unified Operations: Batch Management Service (DynamicsAxBatch)' to start...
WARNING: Waiting for service 'Microsoft Dynamics 365 Unified Operations: Batch Management Service (DynamicsAxBatch)' to start...
WARNING: Waiting for service 'Microsoft Dynamics 365 Unified Operations: Batch Management Service (DynamicsAxBatch)' to start...
WARNING: Waiting for service 'Microsoft Dynamics 365 Unified Operations: Batch Management Service (DynamicsAxBatch)' to start...
WARNING: Waiting for service 'Microsoft Dynamics 365 Unified Operations: Batch Management Service (DynamicsAxBatch)' to start...
WARNING: Waiting for service 'Management Reporter 2012 Process Service (MR2012ProcessService)' to start...
WARNING: Waiting for service 'Management Reporter 2012 Process Service (MR2012ProcessService)' to start...
WARNING: Waiting for service 'Management Reporter 2012 Process Service (MR2012ProcessService)' to start...
WARNING: Waiting for service 'Management Reporter 2012 Process Service (MR2012ProcessService)' to start...
WARNING: Waiting for service 'Management Reporter 2012 Process Service (MR2012ProcessService)' to start...
WARNING: Waiting for service 'Management Reporter 2012 Process Service (MR2012ProcessService)' to start...

Server             DisplayName                                                  Status     StartType  Name
------             -----------                                                  ------     ---------  ----
MyComputer         Microsoft Dynamics 365 Unified Operations: Batch Manageme... Stopped    Automatic  DynamicsAxBatch
MyComputer         Microsoft Dynamics 365 Unified Operations: Data Import Ex... Running    Automatic  Microsoft.Dynamics....
MyComputer         Management Reporter 2012 Process Service                     Running    Automatic  MR2012ProcessService
MyComputer         World Wide Web Publishing Service                            Running    Automatic  w3svc

As can be seen the batch service is at status Stopped. I would expect that it throws some kind of error if it cannot start the service. Like when you start the service yourself, you get a warning:

image

I am resolving the error why the batch service cannot be started (due to an ISV and duplicate metadata). But that is a different issue than the start command not showing me any error. In terms of using this in a DevOps pipeline, it will simply report a successful execution, although it actually failed.

VincentVerweij commented 2 years ago

I checked out the code that is used to start the services. So it calls this: $temp = Get-Service -ComputerName $server -Name $Services -ErrorAction SilentlyContinue (line 134) I did replace $Services with DynamicsAxBatch for testing purposes. After getting the appropriate service, it pipes this command (where for me the issue lies): $temp | Start-Service -ErrorAction SilentlyContinue -WarningAction $warningActionValue (line 140)

If I remove the -ErrorAction SilentlyContinue then I get the following error:

WARNING: Waiting for service 'Microsoft Dynamics 365 Unified Operations: Batch Management Service (DynamicsAxBatch)' to start...
Start-Service : Failed to start service 'Microsoft Dynamics 365 Unified Operations: Batch Management Service (DynamicsAxBatch)'.
At line:1 char:9
+ $temp | Start-Service
+         ~~~~~~~~~~~~~
    + CategoryInfo          : OpenError: (System.ServiceProcess.ServiceController:ServiceController) [Start-Service], ServiceCommandException
    + FullyQualifiedErrorId : StartServiceFailed,Microsoft.PowerShell.Commands.StartServiceCommand

Now it would be wonderful if this could be thrown if Enable-D365Exception is set for this session. But I have no clue if that's a good idea, and secondly I am not a wizard in PowerShell 😅 So I would love to contribute something here, but I will probably mess things up.

Splaxi commented 2 years ago

Hi,

Thanks for taking the time to report this.

You have struck a little nerve around the services and how we handle them. The cmdlets were some of the first to be made, but the Enable-D365Expection was added several years later.

You are the first to consider having the Start-D365Environment supporting the Enable-D365Expection.

Could we step back a bit?

Why is your batch service failing to start in the first place? Is it an expected behavior that you are trying to mitigate / catch?

I'm a bit split in whether or not we should implement support for the Enable-D365Expection. The current implementation of the cmdlet(s) are already complex - getting them to work like they have for several years, while support Enable-D365Expection can prove pretty difficult. Does the effort match the gains?

Could a simplified cmdlet, with support for Enable-D365Expection be a better option for you? What about you using the Get-D365Environment -Batch and testing the status? And failing yourself?

VincentVerweij commented 2 years ago

The customer I am working with is very security minded. So package deployments cannot be done via LCS (on devboxes, Tier2 is another story as they are self-service). Ok that alone is already enough reason to NOT adjust the tools. But what I am trying to achieve here is a lot of automation.

They have numerous ISVs which need to be updated regularly. Now it is me who installs this manually. I already created something that uses a pipeline in Azure DevOps which is triggered after they upload a deployable package into a drop folder. The pipeline will unzip the deployable package, and will take the models in it and upload them into the Metadata folder (linked to PackagesLocalDirectory). Now sometimes we have duplicate objects in there, due to customizations that were made by the customer in AX2012, but are now moved to an ISV. When we start the services, the batch service will fail to start. Then we know that we have to look into the Event Viewer in the Metadata folder to see which objects are supposed to be removed from the customizations.

So that being said, I would like to automate that last step too. By that I mean, after the package has been loaded into the Metadata folder, I would start the services. If I use the tools for this, the pipeline would not fail as the tools do not throw an error.

I can imagine that it will be a huge effort to include the exception handling into this function, especially if you say that it was one of the first cmdlets. I would also understand, based on my reasoning above that you would say, sorry that's not how you should be working with this in the first place. But then again, wouldn't it be great to let people know if one of the services doesn't start? I can't imagine that no one ever came up with this way of thinking. It's not unreasonable to think that it would fail the pipeline when being used.

Splaxi commented 2 years ago

Nobody ever complained, be that because they don't need it or be it that they found other ways to achieve the same 🤷‍♂️

With all of that said - I get your point, it would make sense to have the cmdlet fail if a service isn't starting as expected. What we could do:

And if people feel the v2 version is superior, we can deprecate the old cmdlet with 6-12 months notice and then make an alias for the v2.

VincentVerweij commented 2 years ago

As I said I am not a wizard in PowerShell like you are, but I'll be glad to contribute. Could you point me out where to check for the Exception thing that is handled by the cmdlet Enable-D365Expection?

The best would obviously be that they don't see any difference, but if they have the Enable-D365Expection cmdlet and it suddenly fails for whatsoever reason than it's a breaking change right? Just asking this to know for sure whether bumping to v2 is required or not.

Splaxi commented 2 years ago

What Enable-D365Expection does it so force configure the default value for all cmdlets to have the -EnableException parameter configured.

It is born from this: https://github.com/d365collaborative/d365fo.tools/issues/265

A small example:

function Test-Call {
    param([switch] $EnableException) 
    $messageString = "Something went wrong while <c='em'>removing the Database."
    Write-PSFMessage -Level Host -Message $messageString 
    Stop-PSFFunction -Message "Stopping because of errors." -Exception $([System.Exception]::new($($messageString -replace '<[^>]+>', ''))) -EnableException $EnableException
}

As Enable-D365Expection fixes the "implicit inheritance" - you don't need to handle the -EnableException $EnableException part of Stop-PSFFunction.

Regarding the v2 or not - that is in fact the risk. I want to avoid people having breaking changes for cmdlet as old and "core" as the Start-D365Environment.

Looking at the code - if you did some testing with

Start-D365Environment -All:$false

vs

Enable-D365Expection
Start-D365Environment -All:$false

I can see that the cmdlet in fact supports the exception logic of the Stop-PSFFunction - but only for validation of the parameters.

So the learning is:

Do a rough draft and let us look at it. We have a lot of automation in place to handle creating documentation (md) and tests. So you do a ps1 file with comment based help inside - and we'll fix the rest.

VincentVerweij commented 2 years ago

The fix looks as easy as surrounding Start-Service with a try-catch like this:

try {
    $temp | Start-Service -WarningAction $warningActionValue
}
catch {
    Write-PSFMessage -Level Host -Message "Something went wrong while starting the service(s) on computer '$($server)'" -Exception $PSItem.Exception
    Stop-PSFFunction -Message "Stopping because of errors"
    return
}

But I am questioning myself because you are saying:

  • You should avoid handling the ComputerName parameter from the current cmdlet
    • This would make your task easier

Are you saying this because it could throw an error for each computer in the foreach loop and that I then need to capture all of them and log a proper output later on? Or is it because we're handling this inside a loop, and you don't want to stop intermediately? Or maybe something else?

Just to make sure I am doing this right: should I make a new cmdlet with name Start-D365EnvironmentV2.ps1 next to the current cmdlet which is a copy with my try-catch in there?

VincentVerweij commented 2 years ago

I was just thinking further for the loop that I mentioned above. By keeping track of each message and the computer/server that threw the exception, we can log it at the end, like this:

... skipped existing code

$exceptionMessagesFromStartServices = New-Object -TypeName "System.Collections.ArrayList"
$didServiceStartThrow = $false

$Results = foreach ($server in $ComputerName) {

    ... skipped existing code

    try {
        $temp | Start-Service -WarningAction $warningActionValue
    }
    catch {
        $didServiceStartThrow = $true
        $exceptionMessagesFromStartServices.Add(@($server, $PSItem.Exception))
    }

    ... skipped existing code
}

... after service listing result

if ($didServiceStartThrow) {
    foreach($exceptionMessage in $exceptionMessagesFromStartServices) {
        $server = $exceptionMessage[0]
        $exception = $exceptionMessage[1]
        Write-PSFMessage -Level Host -Message "Something went wrong while starting the service(s) on computer '$($server)'" -Exception $exception
    }

    Stop-PSFFunction -Message "Stopping because of errors"
    return
}

If that is what you meant, if not I am happy to hear what you did mean 😄

Splaxi commented 2 years ago

The reason I say you should avoid the ComputerName paramter:

It doesn't add any value these days.

Back in the days we could RDP into Tier2 environments, where the local network could talk with all AOS nodes.

But now the scripts are mostly run directly from the VHD / onebox - or via some DevOps agents, that is essential the same.

It should make the cmdlet less complex, which in favor makes it easier to for us to maintain over time.

The overall idea from your code seems solid.

Yes - you should create a entire new ps1 file, name v2. Having it side-by-side with the original one.

Splaxi commented 2 years ago

This is solved with the upcoming release, where all changes done by @VincentVerweij is compiled into a v2 version of the cmdlet.