microsoft / service-fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
https://docs.microsoft.com/en-us/azure/service-fabric/
MIT License

[BUG] - [10.0.1816.9590] on-prem cluster install always fails with "NodeType0.Certificates may not be null in this context." #1466

Open MonDeveloper opened 9 months ago

MonDeveloper commented 9 months ago

Describe the bug With the package Microsoft.Azure.ServiceFabric.WindowsServer.10.0.1816.9590, every install attempt fails with the error "NodeType0.Certificates may not be null in this context."

On a fresh set of 5 Windows Server 2022 machines, no installation with that package succeeds, not even the simplest unsecured one. On the same fresh set of servers, the same installation succeeds with the package Microsoft.Azure.ServiceFabric.WindowsServer.9.1.1833.9590. In both cases TestConfiguration.ps1 returns true for all checks.

Comparing the config files provided with each package, the only relevant change we found is the apiVersion: v10 has "apiVersion": "11-2022" while v9 has "apiVersion": "05-2020". We did not find any documentation describing that property ("apiVersion"), its impact on validation of the Cluster Config file, or the schema of the Cluster Config file.
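For anyone who wants to repeat the config comparison described above, here is a minimal Python sketch that walks two parsed JSON documents and reports every leaf that differs. It is not part of the Service Fabric package; the names `diff_json`, `diff_files`, and `MISSING` are made up for illustration, and the two inline dictionaries contain only the apiVersion values quoted in this report plus a hypothetical "name" field.

```python
import json

# Placeholder shown when a key exists on one side only (hypothetical name).
MISSING = "<missing>"


def diff_json(left, right, path=""):
    """Yield (path, left_value, right_value) for every value that differs."""
    if isinstance(left, dict) and isinstance(right, dict):
        # Compare the union of keys so additions and removals both show up.
        for key in sorted(set(left) | set(right)):
            yield from diff_json(
                left.get(key, MISSING), right.get(key, MISSING), f"{path}/{key}"
            )
    elif isinstance(left, list) and isinstance(right, list) and len(left) == len(right):
        # Same-length lists are compared element by element.
        for index, (l_item, r_item) in enumerate(zip(left, right)):
            yield from diff_json(l_item, r_item, f"{path}[{index}]")
    elif left != right:
        # Differing leaves, or containers of different type or length.
        yield path, left, right


def diff_files(path_a, path_b):
    """Load two JSON files from disk and return their differences as a list."""
    with open(path_a, encoding="utf-8") as file_a, open(path_b, encoding="utf-8") as file_b:
        return list(diff_json(json.load(file_a), json.load(file_b)))


# Inline stand-ins for the two config files; "name" is a hypothetical field.
v9_config = {"name": "SampleCluster", "apiVersion": "05-2020"}
v10_config = {"name": "SampleCluster", "apiVersion": "11-2022"}

differences = list(diff_json(v9_config, v10_config))
for location, old, new in differences:
    print(f"{location}: {old!r} -> {new!r}")
```

To run it against the real files, `diff_files` could be called with the paths to the ClusterConfig.Unsecure.MultiMachine.json file from each extracted package; the extraction directories depend on your setup.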

Area/Component: Service Fabric on prem Installation

To Reproduce Steps to reproduce the behavior:

  1. Download the official v9 and v10 packages
  2. Prepare a set of Windows Server 2022 machines with all preconditions checked
  3. Run the CreateServiceFabricCluster.ps1 found in the official package using the ClusterConfig.Unsecure.MultiMachine.json Cluster Config file provided within the same official package
  4. See error

Expected behavior Installation succeeds on both v9 and v10.

Observed behavior: Installation succeeds on v9. Installation fails on v10, with no useful logging information in either the trace files or the console.

Screenshots The only relevant line in the trace file is: 2023/10/03-14:46:40.843,Error,6372,ImageBuilder.FabricDeployer,NodeType0.Certificates may not be null in this context.
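Since the trace files contain very little of use, a small filter can help confirm that no other error lines were missed. The Python sketch below assumes every trace line has the same comma-separated layout as the one quoted above (timestamp, level, a numeric id, source, message). That layout is inferred from this single line, not taken from documentation, and the meaning of the third field is unknown, so the field name `numeric_id` is a guess.

```python
from typing import List, NamedTuple, Optional


class TraceRecord(NamedTuple):
    timestamp: str
    level: str
    numeric_id: str  # meaning not documented in this report (assumption)
    source: str
    message: str


def parse_trace_line(line: str) -> Optional[TraceRecord]:
    """Split one trace line into five fields; return None if it does not fit."""
    # Split on the first four commas only, so commas in the message survive.
    parts = line.rstrip("\r\n").split(",", 4)
    if len(parts) != 5:
        return None
    return TraceRecord(*parts)


def error_records(lines) -> List[TraceRecord]:
    """Keep only the records whose level field is exactly 'Error'."""
    parsed = (parse_trace_line(line) for line in lines)
    return [record for record in parsed if record is not None and record.level == "Error"]


# The single line quoted in this report, used as sample input.
sample = [
    "2023/10/03-14:46:40.843,Error,6372,ImageBuilder.FabricDeployer,"
    "NodeType0.Certificates may not be null in this context.",
]

found = error_records(sample)
for record in found:
    print(f"{record.timestamp} [{record.source}] {record.message}")
```

Passing an open trace file object to `error_records` instead of `sample` would apply the same filter to a whole file, provided its lines follow the assumed layout.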

Service Fabric Runtime Version: ex: 10.*

Environment:

If this is a regression, which version did it regress from? v9 (and also v8, v7, v6)


Assignees: /cc @microsoft/service-fabric-triage

flower7434 commented 8 months ago

You may encounter this error if an actor fails to start or is missing. Ensure that the NuGet packages and the .NET version are compatible with the installed SF version. Additionally, this error can occur if the OutputType is mistakenly set to 'Library' in an actor project.

aarondoss commented 7 months ago

Is there any update for this bug?

My development team is having the same issue trying to update to SF 10.0.1949. Reverting to a previous version (9) works fine. We have upgraded versions several times in the past with no issues. We have verified that all our NuGet packages are the correct version for 10.0.1949.