microsoftgraph / microsoft-graph-comms-samples

Microsoft Graph Communications Samples
MIT License

EchoBot Instance VM, service FILES DELETE #654

Closed jamil-z closed 1 year ago

jamil-z commented 1 year ago
Issue with EchoBot Instance VM, service FILES DELETE: It's about the EchoBot service running on the instances, where files are being deleted.

Describe the issue: We are running EchoBot as a service on VM instances. When the project is built, 175 files are added for the service to run. However, after some time, or when an instance is restarted, the number of files drops to 82. We want to resolve this issue, where more than half of the files are being deleted.

Expected behavior: The service should keep running and no files should be deleted, either over time or when the instances are restarted.

Additional context: We suspect an error in the scripts: upon detecting that the service is not running, they attempt to rebuild or copy the files but fail to complete the action. When the files already exist, the scripts should not copy anything and should simply start the service.

brwilkinson commented 1 year ago

Hi @jamil-z

The sample uses Desired State Configuration (DSC) for all setup on the Virtual Machine Scale Sets (VMSS).

The DSC configuration is deployed via the DSC extension in the Bicep Template.

The raw configuration is linked below; it includes a specific custom DSC resource that copies the app files.

https://github.com/microsoftgraph/microsoft-graph-comms-samples/blob/master/Samples/PublicSamples/EchoBot/ADF/ext-DSC/DSC-BotServers.ps1#L393C1-L410C10

That custom DSC Resource is here for deploying the App Files from the Build pipeline and it allows for continuous integration for new builds.

https://github.com/brwilkinson/AppReleaseDSC

All code is in this single file.

https://github.com/brwilkinson/AppReleaseDSC/blob/main/AppReleaseDSC.psm1

DSC always runs the first time. After that, a configuration setting controls whether it runs again later to autocorrect.

That specific setting is here:

https://github.com/microsoftgraph/microsoft-graph-comms-samples/blob/master/Samples/PublicSamples/EchoBot/ADF/ext-CD/BOT-ConfigurationData.psd1#L20

            # given this is for a lab and load test, just always pull down the latest App config
            DSCConfigurationMode        = 'ApplyAndAutoCorrect'

That value is then used in the DSC script to set the ConfigurationMode:

https://github.com/microsoftgraph/microsoft-graph-comms-samples/blob/master/Samples/PublicSamples/EchoBot/ADF/ext-DSC/DSC-BotServers.ps1#L95C1-L101C10

        LocalConfigurationManager
        {
            ActionAfterReboot    = 'ContinueConfiguration'
            ConfigurationMode    = iif $node.DSCConfigurationMode $node.DSCConfigurationMode 'ApplyAndMonitor'
            RebootNodeIfNeeded   = $True
            AllowModuleOverWrite = $true
        }

You can comment out that setting in the ConfigurationData and deploy again; the mode will then be ApplyAndMonitor instead of ApplyAndAutoCorrect. Alternatively, change the setting yourself to whatever you prefer.
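As a rough sketch of those two options, here is how the relevant line of BOT-ConfigurationData.psd1 might look. Only the DSCConfigurationMode key comes from the file quoted above; the surrounding keys are omitted, so treat this as a fragment rather than a complete file.

```powershell
# Fragment of BOT-ConfigurationData.psd1 (surrounding keys omitted).

# Option 1: comment the setting out, so the LocalConfigurationManager block
# falls back to its default of 'ApplyAndMonitor'.
# DSCConfigurationMode        = 'ApplyAndAutoCorrect'

# Option 2: set the mode explicitly.
DSCConfigurationMode        = 'ApplyAndMonitor'
```

Use one option or the other, not both, and redeploy afterwards so the new mode reaches the instances.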

It was set to autocorrect so that, when you rerun the build pipeline, the VM automatically pulls down the latest build (basically CI/CD).

It's hard to say exactly what you are experiencing. However, the custom DSC resource above uses AzCopy for all copies of the app files, and everything is logged to the AzCopy logs on the file system.

One thing: since you are stopping the service, when DSC runs (by default every 15 minutes) it will see that the machine is not in the desired state and will run ALL of the DSC configuration again. So if you plan to stop the service, definitely switch to ApplyAndMonitor; otherwise DSC is just going to restart the service anyway.

However, with ApplyAndMonitor you won't get new builds pulled down. Without that change, as mentioned above, DSC will restart the service every 15 minutes anyway, because the service is part of the DSC configuration.
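To confirm which mode and interval an instance is actually using, the Local Configuration Manager settings can be read directly. A minimal sketch, assuming an elevated Windows PowerShell session on one of the VM instances:

```powershell
# Show the effective LCM settings on this node.
# ConfigurationModeFrequencyMins is how often the LCM re-checks the current
# configuration (and re-applies it when the mode is ApplyAndAutoCorrect).
Get-DscLocalConfigurationManager |
    Select-Object ConfigurationMode, ConfigurationModeFrequencyMins, RefreshMode, RebootNodeIfNeeded
```

If ConfigurationMode still reads ApplyAndAutoCorrect after a redeploy, the change to the configuration data has not reached that instance yet.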

Let me know how you go. The answer could be in the AzCopy logs; however, I think it's best to tweak DSC for your use case.

It could be something unrelated, but this appears to be the most likely scenario.
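To narrow down which files disappear (the drop from 175 to 82 described in the issue), one option is to snapshot the service folder right after a build and compare it again after a restart or a DSC run. This is only a sketch: Get-FileSnapshot and Get-MissingFile are hypothetical helper names, not part of the sample.

```powershell
# Hypothetical helper: relative paths of every file under a folder, sorted.
function Get-FileSnapshot {
    param([Parameter(Mandatory)][string]$Path)
    $root = (Get-Item -LiteralPath $Path).FullName
    Get-ChildItem -LiteralPath $root -Recurse -File |
        ForEach-Object { $_.FullName.Substring($root.Length).TrimStart('\', '/') } |
        Sort-Object
}

# Hypothetical helper: entries present in the first snapshot but not in the second.
function Get-MissingFile {
    param([string[]]$Before, [string[]]$After)
    $Before | Where-Object { $_ -notin $After }
}
```

Typical use would be to save `Get-FileSnapshot -Path <service folder>` to a text file after the build, take a second snapshot once the file count has dropped, and pass both lists to Get-MissingFile. The names of the missing files can then be matched against the AzCopy logs.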

brwilkinson commented 1 year ago

One more thing: in DSC you can run the following to see whether the machine is in the desired state:

Test-DscConfiguration -Detailed

With -Detailed, it lists the resources that are in the desired state and those that are not. If a resource is not in the desired state, you need to figure out why, or what changed to make it that way. Again, if it's an issue with copying files, the answer will be in the AzCopy logs.
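As a sketch of how to pull out just the resources that have drifted, assuming an elevated Windows PowerShell session on the VM instance:

```powershell
# -Detailed returns an object instead of a bare True/False.
$result = Test-DscConfiguration -Detailed

# Overall verdict for the node.
$result.InDesiredState

# Only the resources that are NOT in the desired state.
$result.ResourcesNotInDesiredState |
    Format-List ResourceId, ModuleName
```

If the custom resource that copies the app files shows up in that list, that is a strong hint to look at the AzCopy logs next.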

brwilkinson commented 1 year ago

You can also see the history of ALL DSC attempts to run.

Get-DscConfigurationStatus -All

The output shows whether each run ended in success or failure.
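For example, a minimal sketch that lists the runs in chronological order and then filters for failed ones (again assuming an elevated Windows PowerShell session on the instance):

```powershell
# All recorded DSC runs, oldest first, with the most useful columns.
$runs = Get-DscConfigurationStatus -All | Sort-Object StartDate
$runs | Format-Table Status, StartDate, Type, Mode, RebootRequested

# Only the failed runs, with the resources that were out of the desired state.
$runs | Where-Object Status -eq 'Failure' |
    Select-Object StartDate, Type, ResourcesNotInDesiredState
```

Comparing the StartDate values with the time the files went missing can show whether a DSC run coincided with the deletion.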

brwilkinson commented 1 year ago

@jamil-z, please let us know whether you were able to resolve this issue.

jamil-z commented 1 year ago

@brwilkinson Thank you for the answers. Yes, we were able to solve the problem, but we took a different approach due to the required changes. We are adding the file generated by default, CurrentBuild.txt, to avoid the files being deleted when making the copy.