microsoft / service-fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
https://docs.microsoft.com/en-us/azure/service-fabric/
MIT License
3.03k stars 399 forks

Event Log: Failed to open store '' at LocalMachine: E_INVALIDARG #681

Open nsoderberg opened 6 years ago

nsoderberg commented 6 years ago

Directly after installing the Service Fabric SDK and running DevClusterSetup.ps1 to set up a one-node secure cluster, I keep getting this event log entry (the message in the issue title) once a minute.

Does this mean SF is trying to open a certificate/key store whose name is an empty string? I can understand why that fails, but why is it happening and what can I do to fix it?

If I look in FabricHostSettings.xml, it says: <Parameter Name="ClientAuthX509StoreName" Value="My" /> which leads me to believe it should try to use the "My" store. This whole certificate thing is not my main area of expertise, though, so that setting might not have anything to do with the error.
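One way to check whether any store-name setting is actually empty is to scan the settings file for it. The following is only a minimal sketch, assuming the file consists of Section elements that contain Parameter elements with Name and Value attributes, as the line quoted above suggests. The sample XML, its namespace, the "Security" section name, and the "ClusterX509StoreName" entry are made up for illustration and are not copied from a real FabricHostSettings.xml.

```python
import xml.etree.ElementTree as ET


def find_store_name_parameters(xml_text):
    """Return (section, name, value) for each Parameter whose Name ends in 'StoreName'."""
    root = ET.fromstring(xml_text)
    results = []
    for section in root.iter():
        # Match on the tag suffix so the check works with or without an XML namespace.
        if not section.tag.endswith("Section"):
            continue
        for param in section:
            if not param.tag.endswith("Parameter"):
                continue
            name = param.get("Name", "")
            if name.endswith("StoreName"):
                results.append((section.get("Name", ""), name, param.get("Value", "")))
    return results


def empty_store_names(xml_text):
    """Return (section, name) for store-name parameters with an empty or blank Value."""
    return [(s, n) for s, n, v in find_store_name_parameters(xml_text) if v.strip() == ""]


# Hypothetical sample; a real settings file will contain other sections and values.
SAMPLE = """<Settings xmlns="http://schemas.microsoft.com/2011/01/fabric">
  <Section Name="Security">
    <Parameter Name="ClientAuthX509StoreName" Value="My" />
    <Parameter Name="ClusterX509StoreName" Value="" />
  </Section>
</Settings>"""

if __name__ == "__main__":
    for section, name in empty_store_names(SAMPLE):
        print(f"{section}/{name} has an empty store name")
```

To run it against a real file, read the file contents into a string and pass that to empty_store_names instead of SAMPLE. An empty result only means no store-name parameter in that particular file is blank; it says nothing about other configuration files.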

Any pointers or ideas appreciated!

guibirow commented 6 years ago

These docs show all the steps required to set up a secure cluster:

https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-windows-cluster-x509-security#optional-create-a-self-signed-certificate

Looks like you either:

nsoderberg commented 6 years ago

Thanks for your suggestions. However, the certificate creation and its usage are all handled by the DevClusterSetup.ps1 PowerShell script in the Service Fabric SDK. The certificate is created and looks fine; I can find it with the correct thumbprint in the cert store. Do you know specifically which parameters specify the store to use?

masnider commented 6 years ago

It'd be a new manifestation, but can you see if any of the description or mitigations in https://github.com/Azure/service-fabric-issues/issues/1056 work out for you? This feels similar in that the local store isn't getting set up correctly or something else is wrong.

dragav commented 6 years ago

@nsoderberg you are correct, the certificate is generated as part of setting up the cluster. It appears that in your case, the certificate creation failed silently. We haven't been able to reproduce this, but the suspicion is that the access to the private key store on your machine is restricted. The mitigations pointed to by @masnider in microsoft/service-fabric-issues#1056 may apply - specifically those dealing with permissions.

nsoderberg commented 6 years ago

Thanks for your help! However, I have already tried the suggestions in microsoft/service-fabric-issues#1056 and have already set the Everyone permissions. I have reproduced this behaviour several times now: I have reinstalled my machine from scratch three times in the last two weeks (with both Swedish and English Windows 10), and each time I get the same behaviour. Some of my colleagues see the exact same log messages, while others do not.

I am no C++ programmer, but while searching for the problem I stumbled upon the source code for Service Fabric :-) To me it looks like the logging originates here: https://github.com/Microsoft/service-fabric/blob/93b7d25697bf0352b49f1cb998f5d1983e11ceef/src/prod/src/Common/CryptoUtility.cpp#L1193. If that is the case, certificateStoreName seems to be an empty string, and that just seems wrong to me. But perhaps that could also be a symptom of some permissions being wrong, or of something else entirely going wrong earlier in the call chain.

A couple of us at the office are having a bunch of certificate-related issues when running our application locally, and my suspicion is that this might have something to do with it. But I might be on a wild goose chase as well :)

dragav commented 5 years ago

@nsoderberg thank you for the details; indeed, that is where the failure is thrown, but the reason the store name is empty is that our script to generate a cluster certificate fails - silently. The output of that script is invalid and that results in an invalid cluster configuration.

I have seen another instance of this issue, but it's not clear what differentiates the successful installations from failed ones - especially on clean OS images.

I am sorry for the hassle; we're investigating this with high priority.

nsoderberg commented 5 years ago

@dragav Good to hear you are looking into it. Let us know if you want logs or anything else to assist your investigation!