pki-bot / pki-issues-final

NSS db migration #2653

pki-bot commented 3 years ago

This issue was migrated from Pagure Issue #3104. Originally filed by slev (@stanislavlevin) on 2019-08-13 05:41:24:


During a FreeIPA upgrade from an old version (4.3.3) to a new one (4.7.2), pki-tomcatd@pki-tomcat.service fails with:

pki-tomcatd@pki-tomcat.service - PKI Tomcat Server pki-tomcat
   Loaded: loaded (/lib/systemd/system/pki-tomcatd@.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2019-08-13 08:15:47 MSK; 52min ago
  Process: 2026 ExecStartPre=/usr/sbin/pki-server migrate --instance pki-tomcat (code=exited, status=1/FAILURE)

Aug 13 08:15:46 dc.ipa.test systemd[1]: Starting PKI Tomcat Server pki-tomcat...
Aug 13 08:15:46 dc.ipa.test pki-server[2026]: ERROR: /var/lib/pki/pki-tomcat/alias contains an incomplete NSS database in SQL format
Aug 13 08:15:47 dc.ipa.test systemd[1]: pki-tomcatd@pki-tomcat.service: Control process exited, code=exited, status=1/FAILURE
Aug 13 08:15:47 dc.ipa.test systemd[1]: pki-tomcatd@pki-tomcat.service: Failed with result 'exit-code'.
Aug 13 08:15:47 dc.ipa.test systemd[1]: Failed to start PKI Tomcat Server pki-tomcat.

# LANG=C ls -la /var/lib/pki/pki-tomcat/alias/
-rw------- 1 pkiuser pkiuser 65536 Aug 13 09:12 cert8.db
-rw------- 1 root    root    28672 Aug 13 09:12 cert9.db
-rw------- 1 pkiuser pkiuser 24576 Aug 13 09:12 key3.db
-rw------- 1 root    root    28672 Aug 13 09:12 key4.db
-r-------- 1 pkiuser pkiuser    13 Aug  8 11:53 pwdfile.txt
-rw------- 1 pkiuser pkiuser 16384 Jul 16 16:30 secmod.db

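The NSS db is only partially upgraded: cert9.db and key4.db have been created (owned by root), but there is no pkcs11.txt. As is known (https://fedoraproject.org/wiki/Changes/NSSDefaultFileFormatSql), NSS performs an implicit migration when a database is opened for writing.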
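The state in the listing above can be described with a small check. This is only a minimal sketch: the function name `classify_nssdb` and the two file sets are my assumptions for illustration, not the actual check that `pki-server migrate` performs.

```python
from pathlib import Path

# Files that make up each NSS database format.
# Assumption for illustration: a database counts as "complete"
# only when all three files of its format are present.
DBM_FILES = ("cert8.db", "key3.db", "secmod.db")
SQL_FILES = ("cert9.db", "key4.db", "pkcs11.txt")


def classify_nssdb(directory):
    """Return (dbm_state, sql_state); each is 'absent', 'incomplete' or 'complete'."""
    present = {p.name for p in Path(directory).iterdir() if p.is_file()}

    def state(names):
        found = sum(name in present for name in names)
        if found == 0:
            return "absent"
        return "complete" if found == len(names) else "incomplete"

    return state(DBM_FILES), state(SQL_FILES)
```

For a directory with the files from the listing above (cert9.db and key4.db but no pkcs11.txt) this yields a complete DBM database next to an incomplete SQL one, which matches the wording of the pki-server error.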

During the same RPM upgrade, certmonger was restarted and re-read the tracked certs: https://pagure.io/certmonger/blob/master/f/src/certread-n.c#_103 The root cause of this issue is the NSS_INIT_NOMODDB flag, which certmonger passes to NSS_InitContext. Strictly speaking, certmonger only triggers the issue.

NSS_INIT_NOMODDB - Don't open the security module DB, just initialize the PKCS 11 module.

A very simple pytest reproducer is attached.

Comment from slev (@stanislavlevin) at 2019-08-13 05:42:09

repr.py

Comment from rcritten (@rcritten) at 2019-09-30 09:19:18

Perhaps NSS should not initiate a migration when opened with NSS_INIT_NOMODDB. I'm not sure this is a bug in certmonger.

Comment from slev (@stanislavlevin) at 2019-09-30 09:24:03

I could open a ticket against NSS, but it looks like the migration process is not standardized.

Comment from slev (@stanislavlevin) at 2019-11-06 03:05:09

Mozilla upstream ticket: https://bugzilla.mozilla.org/show_bug.cgi?id=1586192

Comment from slev (@stanislavlevin) at 2019-11-08 06:42:51

With the recent certmonger changes (reopening without NSS_INIT_NOMODDB right after the permissions check, https://pagure.io/certmonger/c/34c120f0259750ff2228def2955de9ad985340e6?branch=master), the first part of my problem is hidden:

LANG=C ls -la /var/lib/pki/pki-tomcat/alias/
total 200
drwxrwx--- 2 pkiuser pkiuser  4096 Nov  8 13:47 .
drwxrwx--- 5 pkiuser pkiuser  4096 Nov  8 13:45 ..
-rw------- 1 pkiuser pkiuser 65536 Nov  8 13:45 cert8.db
-rw------- 1 root    root    40960 Nov  8 13:45 cert9.db
-rw------- 1 pkiuser pkiuser 24576 Nov  8 13:45 key3.db
-rw------- 1 root    root    61440 Nov  8 13:45 key4.db
-rw------- 1 root    root      429 Nov  8 13:45 pkcs11.txt
-r-------- 1 pkiuser pkiuser    13 Nov  8 13:47 pwdfile.txt
-rw------- 1 pkiuser pkiuser 16384 Nov  8 13:04 secmod.db

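Now, as you can see, the remaining problem is ownership: cert9.db, key4.db and pkcs11.txt are created by root with mode 0600, so pkiuser cannot open them and gives up.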
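The ownership mismatch can be spotted with a few lines of Python. This is a rough sketch, and `files_not_owned_by` is a hypothetical helper of mine, not part of pki-server or certmonger:

```python
from pathlib import Path


def files_not_owned_by(directory, uid):
    """Return the sorted names of regular files in `directory` not owned by `uid`."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and entry.stat().st_uid != uid
    )
```

Called on /var/lib/pki/pki-tomcat/alias with the uid of pkiuser (for example from `pwd.getpwnam("pkiuser").pw_uid`), it should report exactly the root-owned files from the listing above.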