Open johncrim opened 5 years ago
On Windows cluster I have had no problems with ECC certs. Don't have experience on Linux clusters though.
That's great @juho-hanhimaki - thanks for the info. Since much of the code is the same on Linux and Windows, it should be close to working.
Based on your comment I'll try to provide more evidence that it either works or doesn't work on SF Linux, and where the issue is.
Just to clarify this is the type of certificate in use on my windows cluster:
Don't know if the signature algorithm makes a difference (I assume not) but as I understand SHA256withRSA is commonly used even for certificates with ECC key.
We're using sha256ECDSA as the signature algorithm - again on Linux, I wasn't able to get an ECC cert signed by an ECC cert to work. I am however able to get an RSA cert to work when signed by an ECC cert:
Note that this only works when the cert is referred to by thumbprint, e.g. this ARM template snippet works for SF cluster configuration in Azure with the above certificate:
"resources": [
  {
    "apiVersion": "2018-02-01",
    "type": "Microsoft.ServiceFabric/clusters",
    "name": "[parameters('clusterName')]",
    "location": "[parameters('clusterLocation')]",
    "properties": {
      "certificate": {
        "thumbprint": "[parameters('serviceFabricCertThumbprint')]",
        "x509StoreName": "My"
      },
Presumably b/c a thumbprint reference doesn't require validating the signature. However, referencing the same cert by common name and issuer thumbprint doesn't work:
"resources": [
  {
    "apiVersion": "2018-02-01",
    "type": "Microsoft.ServiceFabric/clusters",
    "name": "[parameters('clusterName')]",
    "location": "[parameters('clusterLocation')]",
    "properties": {
      "certificateCommonNames": {
        "commonNames": [
          {
            "certificateCommonName": "[variables('serviceFabricCertCommonName')]",
            "certificateIssuerThumbprint": "[parameters('caCertThumbprint')]"
          }
        ],
        "x509StoreName": "My"
      },
So perhaps it's a problem with signature validation. I initially opened this bug b/c I was not able to get an ECC cert signed by an ECC cert to work.
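To illustrate why a thumbprint reference might sidestep signature validation: a certificate thumbprint is simply the SHA-1 hash of the DER-encoded certificate bytes, so matching by thumbprint needs no knowledge of the key type or signature algorithm. A minimal Python sketch follows. The function names are hypothetical, and whether Service Fabric matches certs exactly this way internally is an assumption.

```python
import hashlib
import ssl


def thumbprint_from_der(der_bytes: bytes) -> str:
    """Return the SHA-1 thumbprint (uppercase hex) of a DER-encoded certificate.

    Only the raw bytes are hashed; the signature is never parsed or verified.
    """
    return hashlib.sha1(der_bytes).hexdigest().upper()


def thumbprint_from_pem(pem_text: str) -> str:
    """Convert a PEM certificate to DER with the stdlib helper, then hash it."""
    return thumbprint_from_der(ssl.PEM_cert_to_DER_cert(pem_text))
```

A common name plus issuer thumbprint lookup, by contrast, has to establish who issued the cert, which plausibly involves the signature check that fails for ECDSA-signed certs here.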
I ran into this when trying to take a new certificate into use with a Windows cluster.
E.g. running Add-AzServiceFabricClusterCertificate with an elliptic curve certificate starts filling the logs with errors. I was unable to update a test cluster with the new certificate, tried with a couple of clusters. In some cases (at least when creating a new cluster) it seems that the operation eventually is marked as succeeded, but at least the logs keep filling up with errors even in those cases, so I expect it is only partially functional.
Based on the error message, the Service Fabric implementation is calling a method which does not support elliptic curve certificates.
The first error in Service Fabric Admin Event Log:
AclCert error with FindByThumbprint,
Documentation about the method: https://docs.microsoft.com/en-us/dotnet/api/system.security.cryptography.x509certificates.x509certificate2.privatekey?view=netframework-4.8 "Currently this property supports only RSA or DSA keys"
Later errors (repeating once per minute): CryptAcquireCertificatePrivateKey failed. Error:0x80090014
Failed to get the Certificate's private key. Thumbprint:
Failed to get private key file. x509FindValue:
Certificate info: Signature algorithm: sha384ECDSA; Signature hash algorithm: sha384; Public key: ECC (256 bits); Public key parameters: ECDSA_P256
Cluster version: 6.5.676.9590; Cluster VMs: "2016-Datacenter-with-Containers"
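For anyone else hitting the repeating 0x80090014 error: it is a standard Windows HRESULT, and decoding it gives a hint about the cause. The sketch below splits the value into its severity, facility, and code fields. The value corresponds to NTE_BAD_PROV_TYPE ("invalid provider type specified"), which is commonly reported when a key held by a CNG provider is accessed through the legacy CryptoAPI. The lookup table is deliberately tiny and the helper names are made up for illustration.

```python
from typing import NamedTuple


class HResult(NamedTuple):
    failure: bool   # severity bit (bit 31)
    facility: int   # 11-bit facility field (bits 16-26)
    code: int       # 16-bit error code (bits 0-15)


# Deliberately incomplete: only the code seen in the logs above.
KNOWN_CODES = {
    0x80090014: "NTE_BAD_PROV_TYPE (invalid provider type specified)",
}


def decode_hresult(value: int) -> HResult:
    """Split a 32-bit HRESULT into its severity, facility, and code fields."""
    value &= 0xFFFFFFFF
    return HResult(
        failure=bool(value >> 31),
        facility=(value >> 16) & 0x7FF,
        code=value & 0xFFFF,
    )


def describe(value: int) -> str:
    """Format an HRESULT with its decoded fields and a name, if known."""
    value &= 0xFFFFFFFF
    h = decode_hresult(value)
    name = KNOWN_CODES.get(value, "unknown")
    return (
        f"0x{value:08X}: failure={h.failure} "
        f"facility={h.facility} code=0x{h.code:04X} ({name})"
    )


if __name__ == "__main__":
    print(describe(0x80090014))
```

Facility 9 is the security (SSPI) facility, which fits an error raised while acquiring a certificate's private key.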
Thanks @nokjuh. That looks like a bug in System.Fabric.FabricDeployer.CreateorUpdateOperation.AclCert. The implementation should be changed to call ECDsaCertificateExtensions.GetECDsaPrivateKey when the cert has an EC private key.
This is probably not the only error, however. It would be great if the SF team added EC certs to their test cases, since EC certs are widely regarded as better, arguably more secure, and more efficient than RSA certs.
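The suggested fix amounts to picking the private-key accessor based on the certificate's public key algorithm instead of always assuming RSA or DSA. Here is a language-neutral sketch of that dispatch, written in Python for brevity. The OIDs are the standard ones for each key type and the .NET method names are real extension methods, but the table, the function name, and the idea that AclCert could be restructured this way are assumptions rather than the actual Service Fabric code.

```python
# Standard public key algorithm OIDs.
RSA_OID = "1.2.840.113549.1.1.1"  # rsaEncryption
DSA_OID = "1.2.840.10040.4.1"     # id-dsa
EC_OID = "1.2.840.10045.2.1"      # id-ecPublicKey

# Which .NET accessor would apply to each key type (illustrative mapping).
ACCESSOR_BY_OID = {
    RSA_OID: "RSACertificateExtensions.GetRSAPrivateKey",
    DSA_OID: "DSACertificateExtensions.GetDSAPrivateKey",
    EC_OID: "ECDsaCertificateExtensions.GetECDsaPrivateKey",
}


def private_key_accessor(public_key_oid: str) -> str:
    """Return the name of the accessor suited to the given key algorithm OID.

    Raises ValueError for unrecognized algorithms, so the caller gets a clear
    error instead of a confusing failure later on.
    """
    try:
        return ACCESSOR_BY_OID[public_key_oid]
    except KeyError:
        raise ValueError(
            f"unsupported public key algorithm: {public_key_oid}"
        ) from None
```

Failing loudly on an unknown algorithm would also address the "no clear error message" part of this issue.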
The (managed) SF runtime is currently on .net 4.5.1. There are plans to move forward, but as you can expect, this is a big change. We are aware of this (and other issues), and the fixes require a newer runtime version.
As for ECC vs RSA, the requests for the former just aren't that frequent. (And ahem; yes, it's patched now, but the 'more secure' argument lost a bit of weight.)
Thanks for the reply @dragav - we're using .NET Core 3.1.0 on Linux. I get that you have backward compat requirements, but I don't think that should limit SF users from using newer libraries.
ECC is used for https by many of the biggest domains (eg google and wikipedia). The bug you point out was a legit issue, but it was in the Windows implementation, not an issue with ECC itself. The main advantage of ECC is much smaller keys for the same level of security, which allows both ends of the connection to encrypt and decrypt more efficiently.
I'm not trying to be antagonistic, but there are many articles by people more qualified than I pointing out the rationale for choosing ECC over RSA.
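To put rough numbers on the key-size point, the sketch below encodes the comparable-strength table from NIST SP 800-57 Part 1 and looks up the RSA modulus size that roughly matches a given ECC key size. The function name is hypothetical and the table is a coarse guideline, not a precise equivalence.

```python
# (security strength in bits, RSA modulus bits, minimum ECC key bits),
# following the comparable-strengths table in NIST SP 800-57 Part 1.
COMPARABLE_STRENGTHS = [
    (80, 1024, 160),
    (112, 2048, 224),
    (128, 3072, 256),
    (192, 7680, 384),
    (256, 15360, 512),
]


def rsa_bits_comparable_to_ecc(ecc_bits: int) -> int:
    """Return the RSA modulus size roughly comparable to an ECC key size."""
    best = None
    # Rows are sorted ascending, so the last row that fits is the strongest.
    for _strength, rsa_bits, min_ecc_bits in COMPARABLE_STRENGTHS:
        if ecc_bits >= min_ecc_bits:
            best = rsa_bits
    if best is None:
        raise ValueError(f"ECC key of {ecc_bits} bits is below the table range")
    return best


if __name__ == "__main__":
    for curve_bits in (256, 384, 521):
        print(curve_bits, "-bit ECC ~", rsa_bits_comparable_to_ecc(curve_bits), "-bit RSA")
```

So a P-256 key like the one in the cert above sits at about the same strength as a 3072-bit RSA key.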
Thank you, @johncrim; my point (conveyed poorly) was that the SF product itself is quite a bit behind in terms of .net runtime. Since that's where the certificate management code lies, the fix requires upgrading our own product first. SF users are, of course, free to use the runtime of their choice, and not depend on SF's. If this is an application certificate, you can work around the limitations of the runtime by ACLing EC certificates in a SetupEntryPoint (running elevated.)
With regard to RSA vs ECC, I wanted to highlight that a crypto solution is as secure as its weakest point, and that typically implementations which stood the test of time are favored over unproven/relatively recent ones. I was not trying to contest the (inarguable) advantages of EC keys/certificates.
Thanks @dragav - good points. We are in fact using EC certs for our services, and I agree that the RSA limitation for cluster certs is not critical. It was however difficult to figure out that EC certs don't work (particularly on Linux).
I wasted a bunch of time when I initially tried to get EC certs working, without any clear error messages or log messages (that I could find). So I opened this issue both as a request to add support for EC certs, and to provide some documentation for others.
At least it would be very useful to have the requirements for the certificate listed at https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-cluster-security#x509-certificates-and-service-fabric and similar locations.
Otherwise I'm basically telling our IT team "this certificate you ordered did not work, can you order a different one" without being able to specify exactly what kind. I guess it needs to have RSA signature algorithm, but maybe the public key can be ECC? And that only after trying the certificate and running into weird problems where the certificates seem to almost work but not quite, without getting clear errors.
Thank you both, @johncrim and @nokjuh, for sharing your findings and for the suggestions. While support for ECC certs will take a while, we do acknowledge our documentation needs improvement. You can expect clarifications to be published soon.
Hello. Would you be able to give any updates on the ECDSA cluster certificate support?
The ECC certificates seem to be increasingly recommended as time goes by, and are defaults at least where our IT orders our certificates.
Was the documentation clarified? I did not notice anything about RSA vs. ECC at least in https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-cluster-security#x509-certificates-and-service-fabric .
Supporting ECC certificates requires us to move to a newer .net runtime version. It's a sizeable task with a relatively low priority in the context of the rest of our backlog.
We've since expanded public documentation around certificate-based authentication and certificate management, respectively. We still don't have a clear list of requirements/supported scenarios for certificates; this troubleshooting guide obliquely mentions that SF only supports CAPI1 certificates. This implicitly rules out ECC certs.
I would also like to add that importing an ECDSA certificate from Azure Key Vault does not work for Azure Web apps; the same certificate signed from the same machine using RSA, however, does.
The strange thing is that the same ECDSA-signed certificate from Azure Key Vault works for Azure's Application Gateway.
Why would the same certificate work on Azure's Application Gateway and not Azure Web apps?
I see others have reported this back in 2019 - https://github.com/shibayan/appservice-acmebot/issues/114
Based on some testing (on Ubuntu 16.04 in Azure ServiceFabric), it appears that Service Fabric doesn't support using ECDSA cluster certificates - it seems to support RSA cluster certificates only. As teams are moving to more ECC certs this would be nice to handle, or at least to provide clear documentation or error messages if the wrong kind of certs are used. I haven't been able to find any documentation on this, and I don't know where to look to read the code RE cert handling.
.NET Core supports ECDSA certs, as does SSL. With openssl, if you use openssl pkey instead of openssl rsa, the operation works on both ECC and RSA certs.
Azure KeyVault "kind of" supports ECC certs today - the API support is there, though the portal UI and PowerShell support aren't there yet. But that's coming, and Service Fabric should be able to deal with those certs.
Obviously if ECC is supported, it needs to be supported across all clients.