Closed pki-bot closed 4 years ago
Comment from mharmsen (@mharmsen) at 2016-09-09 03:32:21
Per PKI Bug Council of 09/08/2016: 10.4 ("critical")
Comment from ftweedal (@frasertweedale) at 2016-09-13 14:07:25
The issue is a bit different from 1702 - this time it is the DS restart causing LDAPProfileSubsystem to drop all its profiles and reload. At the time ipa-replica-prepare tries to issue the cert, profiles are still being (re)loaded and caIPAserviceCert hasn't been loaded yet.
Taking a lock when this condition is encountered should be sufficient to avoid the problem.
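To illustrate the locking idea, here is a minimal Python sketch of a profile store that makes readers wait while a reload is in progress, instead of reporting "not found" for a profile that has not arrived yet. It is not Dogtag's actual code (LDAPProfileSubsystem is Java); the class `ProfileRegistry` and all of its method names are hypothetical and exist only for this illustration.

```python
import threading


class ProfileRegistry:
    """Hypothetical stand-in for a profile store that is reloaded from LDAP."""

    def __init__(self):
        self._cond = threading.Condition()
        self._profiles = {}
        self._loading = False

    def begin_reload(self):
        # Assumed to be called when the LDAP connection is re-established:
        # drop all profiles and make readers wait until finish_reload().
        with self._cond:
            self._profiles.clear()
            self._loading = True

    def add_profile(self, profile_id, config):
        # Called once per profile entry as it is read back from LDAP.
        with self._cond:
            self._profiles[profile_id] = config

    def finish_reload(self):
        # Mark the reload as complete and wake up any blocked readers.
        with self._cond:
            self._loading = False
            self._cond.notify_all()

    def get_profile(self, profile_id, timeout=None):
        # Readers block while a reload is in progress; after the reload
        # a missing profile really is missing and None is returned.
        with self._cond:
            if not self._cond.wait_for(lambda: not self._loading, timeout=timeout):
                raise TimeoutError("profile reload did not finish in time")
            return self._profiles.get(profile_id)
```

With this shape, a request that arrives mid-reload is delayed rather than failed. The trade-off is that a slow reload turns into slow requests, so a timeout on the read side is probably worth keeping.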
Comment from ftweedal (@frasertweedale) at 2016-09-13 14:12:53
Moving priority to "minor" - this issue is being hit in CI but would only be hit in uncommon cases in production deployments... unless the user has a very unstable LDAP server, but then they've got bigger problems :)
If you disagree with new priority let's continue discussion here or on pki-devel@.
Comment from alich at 2016-09-13 15:13:28
Please return it back to critical / high. Every test in FreeIPA using replica preparation is affected and broken by this issue :(
Comment from ftweedal (@frasertweedale) at 2016-09-14 02:55:33
Is there not a trivial workaround? (Wait a few seconds between ipa-server-install and ipa-replica-prepare). Let's split the difference and go with "major".
Comment from ftweedal (@frasertweedale) at 2016-09-14 14:15:21
attachment pki-frasertweedale-0134-Block-reads-during-reload-of-LDAP-based-profiles.patch
Comment from mharmsen (@mharmsen) at 2016-09-15 01:29:46
Per PKI Bug Council of 09/14/2016: 10.4.0
Comment from mharmsen (@mharmsen) at 2016-09-15 01:33:16
Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1376226
Comment from ftweedal (@frasertweedale) at 2016-09-23 05:49:24
Pushed to master (ced5cb71c1963d5234c2360d1f2ac11d4a452d9d)
Comment from mbasti (@MartinBasti) at 2016-10-19 16:21:40
We set a 30-second sleep before replica prepare in the tests, and it is not enough. Some tests are still failing randomly (though fewer than before), and sometimes even manual testing needs more than 5 minutes before the replica file can be created. That seems to me like quite a long time just to get entries from LDAP. What is the recommended value for the sleep? I'm afraid that with this patch IPA may start failing in the future with the error "failed to start CA", because we have a limit of "just" 5 minutes there. Even now the dogtag restart is the longest step during IPA installation.
So the current state is not good for automated provisioning or for manual installation.
Comment from mbasti (@MartinBasti) at 2017-02-27 13:58:36
Metadata Update from @MartinBasti:
This issue was migrated from Pagure Issue #2453. Originally filed by mbasti (@MartinBasti) on 2016-09-06 13:44:47:
Please see the related IPA ticket: https://fedorahosted.org/freeipa/ticket/6274. It looks to me like the same issue as the one reported in the past: https://fedorahosted.org/pki/ticket/1702
"caIPAserviceCert" is the default profile for IPA and should always exist.
This is reproducible in our CI test automation. It looks like dogtag reports that it is ready to serve, but when we execute ipa-replica-prepare too early, it fails with the error "Profile caIPAserviceCert Not Found". Manually it works when the delay between the dogtag restart during ipa-server-install and ipa-replica-prepare is longer.
PS: we check dogtag status using http polling.
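As an alternative to a fixed sleep in test automation, the "Not Found" error could be treated as transient and the operation retried until a deadline. The following Python sketch shows that idea only. `retry_until` and `profile_not_found` are hypothetical helpers, not part of FreeIPA or Dogtag, and the 300-second default is an assumption that mirrors the 5-minute limit mentioned in this ticket.

```python
import time


def profile_not_found(exc):
    # Assumption: the failure surfaces as an exception whose message
    # contains the error text quoted in this ticket.
    return "Profile caIPAserviceCert Not Found" in str(exc)


def retry_until(action, is_transient, timeout=300.0, interval=5.0,
                clock=time.monotonic, sleep=time.sleep):
    """Call action() until it succeeds or the timeout elapses.

    Exceptions for which is_transient(exc) is true are retried; any other
    exception, or a transient one raised after the deadline, is re-raised.
    """
    deadline = clock() + timeout
    while True:
        try:
            return action()
        except Exception as exc:
            if not is_transient(exc) or clock() >= deadline:
                raise
            sleep(interval)
```

A caller would pass the step that runs replica preparation as `action`, for example `retry_until(run_replica_prepare, profile_not_found)`, where `run_replica_prepare` is again a placeholder for whatever the test harness uses. Unlike a fixed sleep, this returns as soon as the profile is available and still fails loudly on unrelated errors.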