Closed: Drakeii closed 1 year ago
Hi @Drakeii
For information: the problem occurs when reading the configuration and decrypting sensitive information. There is no reason to have a different configuration on the servers, unless they are not synchronized.
This is about "System" encryption. The key used is the GUID of the domain in which the ADFS server is registered. The encryption is AES, with special initialization vectors. This encryption therefore does not depend on a specific server. Once retrieved from the domain, the key is cached in the server registry under the HKLM\Software\MFA\MFAID key. You can delete this key in the registry; it will be recreated the next time the MFA service is started.
For information: the TCP flows between the different MFA servers are also encrypted in the same way. This implies that all servers in the ADFS farm must be registered in the same ADDS domain.
What do you mean by "The nodes are geographically far from each other"? MFA services synchronize the configuration over TCP on port 5987, with timeouts of 1 minute. Mutual authentication is "Windows", based on ADFS/SID authorizations (Local System, Local Admins and Delegation Group). You can also check the registry for the delegation group name: HKLM\Software\MFA\DelegatedAdminGroup
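Since the nodes are in different regions, it can help to see not only whether port 5987 is reachable but also how long the TCP connection takes from each node. The following is a minimal sketch in Python, not part of the MFA tooling: the function names and the host names in the usage comment are hypothetical, and it only measures the TCP connect time. It does not exercise the MFA synchronization itself or the Windows authentication.

```python
import socket
import time

MFA_SYNC_PORT = 5987  # sync port mentioned in this thread


def tcp_connect_time(host, port=MFA_SYNC_PORT, timeout=5.0):
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        # covers refused connections, timeouts and name resolution errors
        return None


def check_nodes(hosts, port=MFA_SYNC_PORT, timeout=5.0):
    """Map each host to its connect time in seconds (None means unreachable)."""
    return {host: tcp_connect_time(host, port, timeout) for host in hosts}


# Usage (hypothetical server names, replace with the real farm members):
# for host, elapsed in check_nodes(["adfs-emea-01", "adfs-apj-01"]).items():
#     print(host, "unreachable" if elapsed is None else f"{elapsed * 1000:.0f} ms")
```

Run from each node in turn, this gives a rough picture of which pairs of servers connect slowly or not at all. A successful connect only shows that something is listening on the port.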
As a general rule in an ADFS architecture, multiple servers are deployed for redundancy and for performance (large number of users), but they are centralized. In the case of remote sites such as subsidiaries or partners (Europe, USA, Australia), good practice is to set up "Co-Federation" federation trusts between the different STS. This has the advantage of using the internet instead of dedicated network lines between the different platforms, and each site has its own ADDS platform. Moreover, in this case no modification is needed on the current or future relying parties.
Can you look at the different points indicated? Feel free to delete the registry keys and the config.db and system.db files.
Thanks in advance for your feedback.
redhook
Hi @redhook62
Thanks for the descriptive answer.
I removed the key HKLM\Software\MFA\MFAID and restarted the service. I can see that both keys, MFAID as well as DelegatedAdminGroup, contain the same value on every node. Still, it does not work.
I deleted the files system.db and config.db on the affected nodes and restarted the service; this did not help either.
What I mean by "The nodes are geographically far from each other" is that 3 of the ADFS servers are relatively close to each other, all in the EMEA region, and the 2 new ADFS servers are in the APJ region, which increases the response time between them. All servers are in the same ADDS, no subsidiaries.
Also, I noticed in the event logs one more event while restarting the service (during start) with Event ID 2000:
Error Initializing WebAuthN Metdata Repository : There was no endpoint listening at net.tcp://localhost:5987/WebAdminService that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. ///
Server stack trace:
at System.ServiceModel.Channels.ConnectionUpgradeHelper.DecodeFramingFault(ClientFramingDecoder decoder, IConnection connection, Uri via, String contentType, TimeoutHelper& timeoutHelper)
at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.SendPreamble(IConnection connection, ArraySegment`1 preamble, TimeoutHelper& timeoutHelper)
at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.DuplexConnectionPoolHelper.AcceptPooledConnection(IConnection connection, TimeoutHelper& timeoutHelper)
at System.ServiceModel.Channels.ConnectionPoolHelper.EstablishConnection(TimeSpan timeout)
at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.CallOpenOnce.System.ServiceModel.Channels.ServiceChannel.ICallOnce.Call(ServiceChannel channel, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.CallOnceManager.CallOnce(TimeSpan timeout, CallOnceManager cascade)
at System.ServiceModel.Channels.ServiceChannel.EnsureOpened(TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at Neos.IdentityServer.MultiFactor.IWebAdminServices.HasBLOBPayloadCache()
at Neos.IdentityServer.MultiFactor.Data.WebAdminManagerClient.HasBLOBPayloadCache()
at Neos.IdentityServer.MultiFactor.WebAuthN.Metadata.BaseSystemMetadataRepository.HasBLOBPayloadCache()
at Neos.IdentityServer.MultiFactor.WebAuthN.Metadata.MDSMetadataRepository.
Yes, this is corrected in the May release, due this weekend.
As for the information in the registry, it will no longer exist in version 4, where you can choose your own key... We are working on a refactoring of the configuration management: although the current approach works fine for getting around the access limitations imposed by Microsoft (it's their product, and these are also choices related to code security), thanks to your feedback we have something better.
regards
Hi @redhook62
Thanks for the response and the upcoming news. Do I understand correctly that we should wait for this new release to see whether it fixes the current issue with the new nodes?
Kind regards
In some months...
Hi, @Drakeii
Can you google this? Below are some ideas.
This still seems to come from your new servers.
regards
Hi, we have installed 2 new ADFS servers and joined them to the MFA farm as well. The initial installation did not register them automatically, so we manually ran the command on the new nodes:
Register-MFAComputer -ServerName "servername_to_add" -NoRSAKeyReset
We have also run the commands for Firewall Rules and Private Key ACL on each and every MFA/ADFS node, as stated in the Wiki - by the way, there is a typo in the command for ACL, missing one "c" in "access":
Set-MFAFirewallRules
Update-MFACertificatesAcessControlList
Firewall port is open, as Test-NetConnection -Port 5987 is successful between ADFS servers.
3 of the servers (that were in the farm previously) can retrieve the secret key for the users in MMC. The two new servers, however, show an empty secret key for the users in MMC.
When trying to authenticate, the user GUI returns the error SECURITY ERROR: Invalid Key for User
On the AD FS server, MFA Service logs the following:
Event ID: 0
`(RSAEncryption Decrypt) : Crytographic error for user
The parameter is incorrect.
at System.Security.Cryptography.NCryptNative.DecryptDataOaep(SafeNCryptKeyHandle key, Byte[] data, String hashAlgorithm)
at System.Security.Cryptography.RSACng.Decrypt(Byte[] data, RSAEncryptionPadding padding)
at Neos.IdentityServer.MultiFactor.RSAEncryption.GetDecryptedKey(Byte[] encryptedBytes, String username)`
Event ID: 666
Error decrypting value for Pass Phrase Encryption : Keyset does not exist
(Same for Administrator Pin, Default Users Pin and Mail Provider Account)
What else are we missing here?
Some additional info:
- We are using Active Directory as storage
- All nodes are running Windows Server 2022
- The nodes are geographically far from each other