microsoft / service-fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
https://docs.microsoft.com/en-us/azure/service-fabric/
MIT License
3.02k stars, 399 forks

[BUG] - Unable to create a cluster with common name and AKV Extension for Windows (only thumbprint works) #1423

Open jrmcdona opened 1 year ago

jrmcdona commented 1 year ago

Describe the bug Using Area/Component: Please mention area or component in Service Fabric where issue was found. Ex: security, monitoring, placement or resource governance, Reliable Services, Actors, programming models, SDK, etc.

To Reproduce Steps to reproduce the behavior:

  1. Set up the AKV VM Extension: https://learn.microsoft.com/en-us/azure/virtual-machines/extensions/key-vault-windows?tabs=version3
  2. Reference the cluster certificate by common name. Cluster creation then fails with a 401.
  3. Below are two snippets from the template: one using the common name, which fails, and one using the thumbprint, which succeeds.
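
For readers unfamiliar with step 1, the Key Vault VM extension is declared as one entry in the scale set's extensionProfile. The fragment below is a minimal sketch following the version 3 schema from the linked documentation. The extension name, vault name, certificate name, and polling interval are placeholders and assumptions, not values taken from this issue.

```json
{
  "name": "KVVMExtensionForWindows",
  "properties": {
    "publisher": "Microsoft.Azure.KeyVault",
    "type": "KeyVaultForWindows",
    "typeHandlerVersion": "3.0",
    "autoUpgradeMinorVersion": true,
    "settings": {
      "secretsManagementSettings": {
        "pollingIntervalInS": "3600",
        "linkOnRenewal": false,
        "requireInitialSync": true,
        "observedCertificates": [
          {
            "url": "https://<your-vault>.vault.azure.net/secrets/<your-certificate>",
            "certificateStoreName": "MY",
            "certificateStoreLocation": "LocalMachine"
          }
        ]
      }
    }
  }
}
```

The store name and location here would need to match the store the cluster definition points at through `x509StoreName`.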

Fails:

 {
      "apiVersion": "[variables('serviceFabricApiVersion')]",
      "type": "Microsoft.ServiceFabric/clusters",
      "name": "[parameters('clusterName')]",
      "location": "[parameters('clusterLocation')]",
      "dependsOn": [
        "[concat('Microsoft.Storage/storageAccounts/', variables('supportLogStorageAccountName'))]"
      ],
      "properties": {
        "certificateCommonNames": {
          "commonNames": [
            {
              "certificateCommonName": "[parameters('certificateCommonName')]",
              "certificateIssuerThumbprint": "[parameters('certificateIssuerThumbprint')]"
            }
          ],
          "x509StoreName": "[parameters('certificateStoreValue')]"
        },

Succeeds:

{
      "apiVersion": "[variables('serviceFabricApiVersion')]",
      "type": "Microsoft.ServiceFabric/clusters",
      "name": "[parameters('clusterName')]",
      "location": "[parameters('clusterLocation')]",
      "dependsOn": [
        "[concat('Microsoft.Storage/storageAccounts/', variables('supportLogStorageAccountName'))]"
      ],
      "properties": {
        "certificate": {
          "thumbprint": "[parameters('certificateIssuerThumbprint')]",
          "x509StoreName": "[parameters('certificateStoreValue')]"
        },
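
Both snippets above show only the Microsoft.ServiceFabric/clusters resource. The common-name tutorial linked later in this thread also changes the certificate block in the ServiceFabricNode extension settings of each scale set, so that part of the template may be worth comparing as well. A sketch of that block, assuming the same parameter names as the snippets above (this issue does not show whether the reporter's template already contains it):

```json
{
  "certificate": {
    "commonNames": [
      "[parameters('certificateCommonName')]"
    ],
    "x509StoreName": "[parameters('certificateStoreValue')]"
  }
}
```

In the thumbprint form, the same block carries a `thumbprint` value instead of the `commonNames` array.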

Expected behavior: The cluster is created successfully when the certificate is referenced by common name.

Observed behavior:

Resource Operation 1:
        Name: nt1vm
        Type: Microsoft.Compute/virtualMachineScaleSets
        Mode: Incremental
        StartTime: 02/13/2023 12:19:00
        EndTime: 02/13/2023 12:22:12
        State: Failed
        Operation: Create
        StatusMessage:

Status: Failed
Error:
Code: VMExtensionProvisioningError
Message: VM has reported a failure when processing extension 'ServiceFabricNodeVmExt-nt1vm'. Error message: "Enable Failed. Exception System.Exception: System.Net.WebException: The remote server returned an error: (401) Unauthorized.
   at System.Net.HttpWebRequest.GetResponse()
   at Microsoft.Azure.ServiceFabric.Extension.Core.RestClient.Invoke(Uri requestUri, String method, String requestBody, X509Certificate2 clientCertificate, Int32 timeoutInMs)
   at Microsoft.Azure.ServiceFabric.Extension.Core.RestClient.Invoke(Uri requestUri, String method, String requestBody, List`1 clientCertificates, Int32 timeoutInMs)

   at Microsoft.Azure.ServiceFabric.Extension.Core.RestClient.Invoke(Uri requestUri, String method, String requestBody, List`1 clientCertificates, Int32 timeoutInMs)
   at Microsoft.Azure.ServiceFabric.Extension.Core.WrpTopologyService.GetVmExtensionPollResponse(String machineName, VmExtensionPollRequest request, Int32 timeoutInMs)
   at Microsoft.Azure.ServiceFabric.Extension.Handler.VMExtensionHandler.GetAgentZipPackages(ITopologyService topologyService, HandlerSettings handlerSettings, String& bootstrapAgentZipFilePath, String& upgradeAgentZipFilePath) in X:\bt\1243498\repo\src\HandlerExe\VMExtensionHandler.cs:line 196
   at Microsoft.Azure.ServiceFabric.Extension.Handler.VMExtensionHandler.InstallServiceWithRetry(ITopologyService topologyService, HandlerSettings handlerSettings) in X:\bt\1243498\repo\src\HandlerExe\VMExtensionHandler.cs:line 317
   at Microsoft.Azure.ServiceFabric.Extension.Handler.VMExtensionHandler.Enable() in X:\bt\1243498\repo\src\HandlerExe\VMExtensionHandler.cs:line 112 - Machine: _nt1vm_0"

More information on troubleshooting is available at https://aka.ms/vmextensionwindowstroubleshoot
Target: 0
StatusCode: Conflict
OperationId: A23F635DAB25BA6C
HelpLink: https://aka.ms/ev2/errors/troubleshooting

Screenshots If applicable, add screenshots to help explain your problem.

Service Fabric Runtime Version: ex: 7.1., 7.2.

Environment: Azure, Windows Server 2019 SKU

If this is a regression, which version did it regress from?

Additional context: I am setting up automatic certificate rotation using the Key Vault VM extension: https://learn.microsoft.com/en-us/azure/virtual-machines/extensions/key-vault-windows?tabs=version3
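
When the Key Vault VM extension is responsible for installing the certificate, the ServiceFabricNode extension needs that certificate to be present before it starts. The Key Vault extension documentation describes ordering extensions with `provisionAfterExtensions`, combined with `requireInitialSync` on the Key Vault extension. The fragment below is a hedged sketch of that mechanism only: the name `KVVMExtensionForWindows` is an assumption, the `settings` block is omitted, and nothing in this issue confirms that ordering is the cause of the 401.

```json
{
  "name": "ServiceFabricNodeVmExt-nt1vm",
  "properties": {
    "publisher": "Microsoft.Azure.ServiceFabric",
    "type": "ServiceFabricNode",
    "typeHandlerVersion": "1.1",
    "autoUpgradeMinorVersion": true,
    "provisionAfterExtensions": [
      "KVVMExtensionForWindows"
    ]
  }
}
```

The value in `provisionAfterExtensions` has to match the `name` given to the Key Vault extension in the same extensionProfile.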


Assignees: /cc @microsoft/service-fabric-triage

negberts commented 1 year ago

We are running into the same problem.

Status Message: VM has reported a failure when processing extension 'pimsfpt_ServiceFabricNode' (publisher 'Microsoft.Azure.ServiceFabric' and type 'ServiceFabricNode'). Error message: "Enable Failed. Exception System.Exception: System.Net.WebException: The remote server returned an error: (401) Unauthorized.
   at System.Net.HttpWebRequest.GetResponse()
   at Microsoft.Azure.ServiceFabric.Extension.Core.RestClient.Invoke(Uri requestUri, String method, String requestBody, X509Certificate2 clientCertificate, Int32 timeoutInMs)
   at Microsoft.Azure.ServiceFabric.Extension.Core.RestClient.Invoke(Uri requestUri, String method, String requestBody, List`1 clientCertificates, Int32 timeoutInMs)

   at Microsoft.Azure.ServiceFabric.Extension.Core.RestClient.Invoke(Uri requestUri, String method, String requestBody, List`1 clientCertificates, Int32 timeoutInMs)
   at Microsoft.Azure.ServiceFabric.Extension.Core.WrpTopologyService.GetVmExtensionPollResponse(String machineName, VmExtensionPollRequest request, Int32 timeoutInMs)
   at Microsoft.Azure.ServiceFabric.Extension.Handler.VMExtensionHandler.GetAgentZipPackages(ITopologyService topologyService, HandlerSettings handlerSettings, String& bootstrapAgentZipFilePath, String& upgradeAgentZipFilePath) in X:\bt\1246626\repo\src\HandlerExe\VMExtensionHandler.cs:line 196
   at Microsoft.Azure.ServiceFabric.Extension.Handler.VMExtensionHandler.InstallServiceWithRetry(ITopologyService topologyService, HandlerSettings handlerSettings) in X:\bt\1246626\repo\src\HandlerExe\VMExtensionHandler.cs:line 317
   at Microsoft.Azure.ServiceFabric.Extension.Handler.VMExtensionHandler.Enable() in X:\bt\1246626\repo\src\HandlerExe\VMExtensionHandler.cs:line 112 - Machine: _pimsfpt_0".
More information on troubleshooting is available at https://aka.ms/vmextensionwindowstroubleshoot. (Code:VMExtensionProvisioningError)

negberts commented 1 year ago

I followed every step in both of these tutorials:

https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-cluster-change-cert-thumbprint-to-cn
https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-create-cluster-using-cert-cn

tomas0620 commented 6 months ago

Same problem for me :(