Closed 389-ds-bot closed 4 years ago
Comment from mreynolds (@mreynolds389) at 2020-03-05 17:49:33
Metadata Update from @mreynolds389:
Comment from firstyear (@Firstyear) at 2020-04-01 01:13:53
Metadata Update from @Firstyear:
Comment from tbordaz (@tbordaz) at 2020-04-02 17:44:52
Test basic_tests.py on master (dd266dacc) shows
[02/Apr/2020:17:41:37.479944074 +0200] - ERR - dse_read_one_file - The entry cn=schema in file /home/tbordaz/install_master/share/dirsrv/schema/10rfc2307compat.ldif (lineno: 1) is invalid, error code 20 (Type or value exists) - object class nisMap: The name does not match the OID "1.3.6.1.1.1.2.9". Another object class is already using the name or OID.
[02/Apr/2020:17:41:37.486711277 +0200] - ERR - setup_internal_backends - Please edit the file to correct the reported problems and then restart the server.
Comment from tbordaz (@tbordaz) at 2020-04-02 17:45:03
Metadata Update from @tbordaz:
Comment from firstyear (@Firstyear) at 2020-04-03 00:28:52
@tbordaz You may need to make clean, wipe your prefix installed 389 and make install again. This won't be a problem for rpm upgrades though because the file will be moved automatically.
Comment from tbordaz (@tbordaz) at 2020-04-03 10:49:24
@Firstyear, you are right it failed because of missing initial clean up. Sorry for the noise.
At the moment I have a concern regarding schema replication. rfc2307compat may raise conflicts with obsolete schema definitions (for example duplicate OID nisdomain vs memberuid). Schema replication does not enforce OID conflict checks (and there may be conflicts other than OID). The risk is that once the new rfc2307compat definitions are propagated, the conflicts will prevent DS from restarting and break a full topology.
An option is to move rfc2307compat to 'data' as a rapid workaround. It will give us time to evaluate the safety of each rfc2307compat definition and introduce them in 'schema' as we check they are safe.
Comment from frenaud at 2020-04-05 10:52:58
The conflict is preventing ipa server installation, please see logs in IPA PR-CI:
[ipatests.pytest_ipa.integration.host.Host.master.cmd28] [25/44]: restarting directory server
[ipatests.pytest_ipa.integration.host.Host.master.cmd28] Failed to restart the directory server (CalledProcessError(Command ['/bin/systemctl', 'restart', 'dirsrv@IPA-TEST.service'] returned non-zero exit status 1: 'Job for dirsrv@IPA-TEST.service failed because the control process exited with error code.\nSee "systemctl status dirsrv@IPA-TEST.service" and "journalctl -xe" for details.\n')). See the installation log for details.
[ipatests.pytest_ipa.integration.host.Host.master.cmd28] [error] NetworkError: cannot connect to 'ldapi://%2Fvar%2Frun%2Fslapd-IPA-TEST.socket': Connection refused
[ipatests.pytest_ipa.integration.host.Host.master.cmd28] cannot connect to 'ldapi://%2Fvar%2Frun%2Fslapd-IPA-TEST.socket': Connection refused
[ipatests.pytest_ipa.integration.host.Host.master.cmd28] The ipa-server-install command failed. See /var/log/ipaserver-install.log for more information
[ipatests.pytest_ipa.integration.host.Host.master.cmd28] Exit code: 1
With the corresponding error log:
[04/Apr/2020:14:17:43.686664808 +0000] - ERR - dse_read_one_file - The entry cn=schema in file /etc/dirsrv/slapd-IPA-TEST/schema/15rfc2307bis.ldif (lineno: 1) is invalid, error code 20 (Type or value exists) - attribute type nisDomain: Does not match the OID "1.3.6.1.4.1.1.1.1.12". Another attribute type is already using the name or OID.
[04/Apr/2020:14:17:43.688978778 +0000] - ERR - setup_internal_backends - Please edit the file to correct the reported problems and then restart the server.
Comment from firstyear (@Firstyear) at 2020-04-06 01:26:02
@frenaud That looks like the same issue as mark/thierry were hitting, are you building from source? If so you need to remove the 10rfc2307.ldif from your schema dir then rebuild.
Note you'll still have the issue with 60nis.ldif which I will resolve this morning.
Comment from firstyear (@Firstyear) at 2020-04-06 04:34:39
PREFIX/share/dirsrv/data/60nis.ldif to their /etc/dirsrv/slapd-instance/schema/ in their own update process as we have no way to automatically manage this, and custom schema management is the responsibility of the user of that custom schema.
If you have issues with this, you MAY find that in some cases autotools is not rebuilding your schema templates OR you may have the old rfc2307.ldif in place. You should check:
This is because in a make install there is no means to remove an installed file that we no longer install, so artifacts may remain like this in the install prefix.
This will not be an issue with rpm upgrades, as RPM does correctly handle a clean build root and then removing and adding resources that do or do not exist between updates.
Comment from tbordaz (@tbordaz) at 2020-04-06 08:05:24
Just for the record, this triggers a FreeIPA upstream failure: https://bugzilla.redhat.com/show_bug.cgi?id=1820176. Referencing this BZ here rather than in the metadata, as it is a side effect of the PR.
Comment from abbra at 2020-04-06 09:11:51
@Firstyear we do not install 60nis.ldif in FreeIPA, so there is no conflict with 60nis.ldif. The conflict appears because previously only 15rfc2307.ldif from FreeIPA was providing nisDomain attribute, now 10rfc2307compat.ldif from 389-ds provides it.
We will be changing definition of the nisDomain in FreeIPA to be the same as in your updated 10rfc2307compat.ldif but we need to understand how to handle this for existing multi-master environments where both files are already deployed. Could you please help with this?
Comment from tbordaz (@tbordaz) at 2020-04-06 12:27:42
In a replication topology, if the schema definition of an attribute differs between two instances because of its OID, then each instance will fail to send (push) its schema. The mismatching OID itself is not a problem, but other new definitions will not be pushed either. Hopefully the learning mechanism will detect/learn new definitions even if the schema is not pushed.
Master1 ( 1.3.6.1.1.1.1.300.1 NAME 'dummy' DESC 'my dummy test' EQUALITY caseIgnoreIA5Match SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 X-ORIGIN 'user defined' )
[06/Apr/2020:11:42:07.811443170 +0200] conn=2 op=7 MOD dn="cn=schema"
[06/Apr/2020:11:42:07.900949631 +0200] conn=2 op=7 RESULT err=20 tag=103 nentries=0 etime=0.090672552
Master2 ( 1.3.6.1.1.1.1.300.2 NAME 'dummy' DESC 'my dummy test' EQUALITY caseIgnoreIA5Match SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 X-ORIGIN 'user defined' )
[06/Apr/2020:11:42:07.938011051 +0200] - ERR - NSMMReplicationPlugin - agmt="cn=001" (pctbordaz:39001): Schema replication update failed: Type or value exists
[06/Apr/2020:11:42:07.948572696 +0200] - ERR - NSMMReplicationPlugin - agmt="cn=001" (pctbordaz:39001): Warning: unable to replicate schema: rc=1
Comment from tbordaz (@tbordaz) at 2020-04-06 14:47:55
DS accepts an attribute definition if it is new (new OID and new name) or if it replaces an already existing definition (matches OID and matches name). If there is a partial match (matches OID but not the name, or the opposite), the definition is rejected with error 20 (Type or value exists).
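To make the rule above concrete, here is a minimal Python sketch of it. This is not the server code (the server is written in C); the function name `classify_definition` and the `known` mapping are made up for illustration, and treating a partial match as a rejection is taken from the err=20 results shown in the logs in this ticket.

```python
from typing import Dict


def classify_definition(known: Dict[str, str], oid: str, name: str) -> str:
    """Classify a proposed (oid, name) pair against already known definitions.

    `known` maps OID -> lower-cased name (hypothetical in-memory schema).
    Returns "new", "replace" or "conflict".
    """
    name = name.lower()  # schema names are compared case-insensitively
    name_for_oid = known.get(oid)
    oid_for_name = next((o for o, n in known.items() if n == name), None)

    if name_for_oid is None and oid_for_name is None:
        return "new"       # neither the OID nor the name is in use
    if name_for_oid == name and oid_for_name == oid:
        return "replace"   # both match the same existing definition
    return "conflict"      # partial match: same name or same OID only


# The nisDomain case from this ticket: same name, different OID.
known = {"1.3.6.1.4.1.1.1.1.12": "nisdomain"}
print(classify_definition(known, "1.3.6.1.1.1.1.30", "nisDomain"))
```

A "permissive mode" would amount to deciding what to do in the "conflict" branch, which is exactly where there is no rule today.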
It is possible to relax these controls (via a kind of permissive mode) so that schema replication succeeds.
I think this permissive mode should only be enabled for a short period of time, for example during an upgrade. That means a new config parameter :(
But even with a "permissive" mode, there is currently no rule to indicate which OID or attribute name is the good one. For example, with 'nisDomain' being either 1.3.6.1.4.1.1.1.1.12 or 1.3.6.1.1.1.1.30, there is no way to be sure that 1.3.6.1.1.1.1.30 will be the one that is kept.
Comment from firstyear (@Firstyear) at 2020-04-07 05:24:37
@tbordaz But we don't need the schema to replicate here - we just need to wait for all the masters to be updated and then they'll all have the correct rfc2307/compat files in place.
Comment from tbordaz (@tbordaz) at 2020-04-07 10:34:24
If rfc2307/compat definitions are replicated or installed, some customers will still have the problem of obsolete nisDomain definitions (from an obsolete 60nis.ldif or from an obsolete private schema file like #3987#comment-115218). So as soon as one replica is upgraded, all the other replicas must be upgraded and the obsolete definitions removed manually. So this version is incompatible with the previous one. That does not look feasible to me.
What about removing 'nisDomain' from rfc2307compat and leaving/fixing it in 60nis.ldif (in data)? Healthcheck would verify this.
Comment from tbordaz (@tbordaz) at 2020-04-07 17:59:54
@Firstyear, considering the value of rfc2307compat for openldap migration it should stay in shared/schema.
Two definitions are problematic (nisDomain/nisDomainObject): they use different OIDs in 60nis.ldif and rfc2307compat. Instances using those definitions in a mixed-version deployment will fail to exchange their schema (the schema will not be shared and will potentially not match the data). Deploying a new version on all instances in one go is not a common practice, so customers will hit this issue.
To prevent this, would you please move the two definitions back into 60nis.ldif (under data)? They would then be removed from rfc2307compat.ldif (under schema).
As a follow-up we will create a fix, with a config parameter, to relax the schema checking that prevents duplicate OIDs, and add invalid OID checking to the healthcheck tool. Then, in a future release and its release notes, we will explain how to run in mixed versions (healthcheck, relaxed schema checking).
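As a rough idea of what such a healthcheck-style check could look like, here is a Python sketch that scans the text of several schema LDIF files and reports names bound to more than one OID. It is not the existing healthcheck code; `parse_definitions` and `find_conflicts` are hypothetical helpers, the regex only handles the common `( OID NAME ...` layout, and the reverse case (one OID used with several names) is left out for brevity.

```python
import re
from collections import defaultdict

# Matches "attributeTypes: ( <oid> NAME 'x'" or "... NAME ( 'x' 'y' )".
DEF_RE = re.compile(
    r"^(attributeTypes|objectClasses):\s*\(\s*([0-9.]+)\s+NAME\s+"
    r"(?:'([^']+)'|\(([^)]*)\))",
    re.IGNORECASE | re.MULTILINE,
)


def parse_definitions(ldif_text):
    """Yield (kind, oid, name) for each schema definition in an LDIF string."""
    # Undo LDIF line folding: a newline followed by one space continues a line.
    unfolded = re.sub(r"\r?\n ", "", ldif_text)
    for kind, oid, single, multi in DEF_RE.findall(unfolded):
        names = [single] if single else re.findall(r"'([^']+)'", multi)
        for name in names:
            yield kind.lower(), oid, name.lower()


def find_conflicts(files):
    """`files` maps a label (e.g. a file name) to its LDIF text.

    Returns a list of (kind, name, {oid: [labels]}) for every name that is
    defined with more than one OID across the given files.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for label, text in files.items():
        for kind, oid, name in parse_definitions(text):
            seen[(kind, name)][oid].append(label)
    return [
        (kind, name, dict(oids))
        for (kind, name), oids in sorted(seen.items())
        if len(oids) > 1
    ]
```

Running `find_conflicts` over the instance schema directory plus 99user.ldif before an upgrade would flag a case like nisDomain before the server refuses to start.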
Comment from firstyear (@Firstyear) at 2020-04-08 07:22:20
We can't move them back to 60nis, as parts of rfc2307bis/compat rely on them ....
So they have to go into compat here.
Comment from tbordaz (@tbordaz) at 2020-04-08 10:39:06
Why? Those definitions are quite isolated in rfc2307compat. Which definitions rely on them?
attributeTypes: (
1.3.6.1.1.1.1.30 NAME 'nisDomain'
DESC 'NIS domain'
EQUALITY caseIgnoreIA5Match
SYNTAX 1.3.6.1.4.1.1466.115.121.1.26
)
objectClasses: (
1.3.6.1.1.1.2.15 NAME 'nisDomainObject' SUP top AUXILIARY
DESC 'Associates a NIS domain with a naming context'
MUST nisDomain
)
Comment from firstyear (@Firstyear) at 2020-04-09 01:10:59
60nis.ldif is not included by default in our installs .... but these values are in rfc2307bis. So for this to be compat between bis and 2307, they have to be in compat ...
Comment from tbordaz (@tbordaz) at 2020-04-09 11:22:08
I agree those definitions are in 60nis.ldif and rfc2307bis.ldif. But both are optional in 389-ds (data), and a new install/upgrade will not bring those definitions in by default. The problem will appear if those definitions are installed by default (in rfc2307compat), as they can be incompatible with definitions already deployed in a replicated topology. This will happen with deployments using the current 60nis.ldif and with all FreeIPA deployments.
Comment from firstyear (@Firstyear) at 2020-04-14 07:48:14
But rfc2307compat won't be optional - it's replacing 2307. And 2307bis is the default in openldap which is what I want to be able to import from .... That's why I would really say we need them available.
Comment from abbra at 2020-04-14 08:46:50
@Firstyear since making it non-optional breaks existing deployments, I'd rather focus on making sure it is compatible with existing deployments before rolling out this feature.
That's the ask here, it is fine to change the schema and I applaud the work you are doing to make migration and use easier. However, it is impossible to upgrade existing deployments where old schema is used now and that is a huge problem here. While you may argue that only FreeIPA and non-default installations of RHCS are affected, the reality is that FreeIPA cannot solve this problem alone because it is 389-ds replication mechanism which breaks here.
My suggestion would be to move rfc2307compat.ldif into optional schema for now and have the replication problem fixed first, then move rfc2307compat.ldif to the main schema set.
Comment from firstyear (@Firstyear) at 2020-04-15 01:58:49
@Firstyear since making it non-optional breaks existing deployments, I'd rather focus on making sure it is compatible with existing deployments before rolling out this feature.
But it doesn't, the whole point is that it upgrades smoothly. So far every "break" is because of source builds not cleaning their installed schema trees properly, which is not a problem that will affect any rpm releases. No one has provided evidence to the contrary of this.
That's the ask here, it is fine to change the schema and I applaud the work you are doing to make migration and use easier. However, it is impossible to upgrade existing deployments where old schema is used now and that is a huge problem here. While you may argue that only FreeIPA and non-default installations of RHCS are affected, the reality is that FreeIPA cannot solve this problem alone because it is 389-ds replication mechanism which breaks here.
But it is possible to upgrade these old existing deployments, this sounds very much like an IPA problem, not a 389 one .... and as mentioned if you use custom schema in your project, you have a responsibility to manage and upgrade that, and you need mechanisms to handle it. It sounds like you assume 389-ds can never change and will remain the same forever.
My suggestion would be to move rfc2307compat.ldif into optional schema for now and have the replication problem fixed first, then move rfc2307compat.ldif to the main schema set.
Temporary fixes become permanent. It's better to fix this properly, now, even if it's a bit more effort, than it is to delay where it will always be a "future" problem, and never actually resolved, pushing more and more burden to our users who want to consume the feature - which from SUSE is a major goal for us in the platform.
In fact, your suggestion would only make it worse, because making compat optional means that users of 60nis.ldif would need to bring in rfc2307 compat. And the majority of the reason to use a compat by default over bis, is that the system installed schema is hard to use/ignore/replace, so we need something that works out of the box, for the widest set of deployments of 389-ds possible.
As mentioned, so far it's not proven that it actually breaks any instance, it seems to be environmental ... the compat ldif is in the master branch for over a week now, and everyone has been happily developing since, and these extra fixes are for an edge case with 60nis.ldif, that you initially claimed to use, but then you don't use? So now there are some edge cases fixes for 2307compat that are being held up, because of an unrelated development only problem.
Comment from abbra at 2020-04-15 10:27:00
Did you ever try any upgrade of an existing FreeIPA instance to use 389-ds with rfc2307compat.ldif? Or pure 389-ds instance with 60nis.ldif enabled and then upgrading to 389-ds with rfc2307compat.ldif?
Even single server upgrade is failing for me. If I don't have the code that attempts to replace 15rfc2307bis.ldif with a version compatible with new 389-ds' 10rfc2307compat.ldif, I get the following error on start:
Apr 15 08:07:06 master.ipa.test systemd[1]: Starting 389 Directory Server IPA-TEST....
Apr 15 08:07:07 master.ipa.test ns-slapd[33389]: [15/Apr/2020:08:07:07.040442756 +0000] - ERR - dse_read_one_file - The entry cn=schema in file /etc/dirsrv/slapd-IPA-TEST/schema/15rfc2307bis.ldif (lineno: 1) is invalid, error code 20 (Type or value exis>
Apr 15 08:07:07 master.ipa.test ns-slapd[33389]: [15/Apr/2020:08:07:07.044322860 +0000] - ERR - setup_internal_backends - Please edit the file to correct the reported problems and then restart the server.
Apr 15 08:07:07 master.ipa.test systemd[1]: dirsrv@IPA-TEST.service: Main process exited, code=exited, status=1/FAILURE
Apr 15 08:07:07 master.ipa.test systemd[1]: dirsrv@IPA-TEST.service: Failed with result 'exit-code'.
Apr 15 08:07:07 master.ipa.test systemd[1]: Failed to start 389 Directory Server IPA-TEST..
If I synchronize the 15rfc2307bis.ldif that FreeIPA installs into the instance as part of our upgrade process before restarting 389-ds to perform the actual upgrade, the failure happens in 99user.ldif, which contains that schema too.
Here is an ipa-server-upgrade log excerpt that shows me applying the upgrade replacement of 15rfc2307bis.ldif before starting the 389-ds instance again:
2020-04-15T08:14:24Z DEBUG step duration: dirsrv __save_config 0.06 sec
2020-04-15T08:14:24Z DEBUG [2/10]: disabling listeners
2020-04-15T08:14:24Z DEBUG step duration: dirsrv __disable_listeners 0.05 sec
2020-04-15T08:14:24Z DEBUG [3/10]: enabling DS global lock
2020-04-15T08:14:24Z DEBUG step duration: dirsrv __enable_ds_global_write_lock 0.05 sec
2020-04-15T08:14:24Z DEBUG [4/10]: disabling Schema Compat
2020-04-15T08:14:24Z DEBUG step duration: dirsrv __disable_schema_compat 0.07 sec
2020-04-15T08:14:24Z DEBUG [5/10]: pre-check RFC2307compat schema conflict
2020-04-15T08:14:24Z ERROR Upgrade /etc/dirsrv/slapd-IPA-TEST/schema/15rfc2307bis.ldif to /usr/share/ipa/15rfc2307bis.ldif
2020-04-15T08:14:24Z ERROR Upgrading /etc/dirsrv/slapd-IPA-TEST/schema/15rfc2307bis.ldif
2020-04-15T08:14:24Z DEBUG step duration: dirsrv __correct_rfc2307compat_schema 0.00 sec
2020-04-15T08:14:24Z DEBUG [6/10]: starting directory server
2020-04-15T08:14:24Z DEBUG Starting external process
2020-04-15T08:14:24Z DEBUG args=['/bin/systemctl', 'start', 'dirsrv@IPA-TEST.service']
2020-04-15T08:14:24Z DEBUG Process finished, return code=1
2020-04-15T08:14:24Z DEBUG stdout=
2020-04-15T08:14:24Z DEBUG stderr=Job for dirsrv@IPA-TEST.service failed because the control process exited with error code.
See "systemctl status dirsrv@IPA-TEST.service" and "journalctl -xe" for details.
2020-04-15T08:14:24Z DEBUG Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/ipaserver/install/service.py", line 603, in start_creation
    run_step(full_msg, method)
  File "/usr/lib/python3.8/site-packages/ipaserver/install/service.py", line 589, in run_step
    method()
  File "/usr/lib/python3.8/site-packages/ipaserver/install/upgradeinstance.py", line 105, in __start
    srv.start(self.serverid, ldapi=True)
  File "/usr/lib/python3.8/site-packages/ipaplatform/redhat/services.py", line 136, in start
    super(RedHatDirectoryService, self).start(
  File "/usr/lib/python3.8/site-packages/ipaplatform/base/services.py", line 308, in start
    ipautil.run([paths.SYSTEMCTL, "start",
  File "/usr/lib/python3.8/site-packages/ipapython/ipautil.py", line 597, in run
    raise CalledProcessError(
ipapython.ipautil.CalledProcessError: CalledProcessError(Command ['/bin/systemctl', 'start', 'dirsrv@IPA-TEST.service'] returned non-zero exit status 1: 'Job for dirsrv@IPA-TEST.service failed because the control process exited with error code.\nSee "systemctl status dirsrv@IPA-TEST.service" and "journalctl -xe" for details.\n')
2020-04-15T08:14:24Z DEBUG [error] CalledProcessError: CalledProcessError(Command ['/bin/systemctl', 'start', 'dirsrv@IPA-TEST.service'] returned non-zero exit status 1: 'Job for dirsrv@IPA-TEST.service failed because the control process exited with error code.\nSee "systemctl status dirsrv@IPA-TEST.service" and "journalctl -xe" for details.\n')
2020-04-15T08:14:24Z DEBUG [cleanup]: stopping directory server
2020-04-15T08:14:24Z DEBUG Starting external process
2020-04-15T08:14:24Z DEBUG args=['/bin/systemctl', 'stop', 'dirsrv@IPA-TEST.service']
2020-04-15T08:14:24Z DEBUG Process finished, return code=0
2020-04-15T08:14:24Z DEBUG stdout=
2020-04-15T08:14:24Z DEBUG stderr=
2020-04-15T08:14:24Z DEBUG Stop of dirsrv@IPA-TEST.service complete
2020-04-15T08:14:24Z DEBUG step duration: dirsrv __stop_instance 0.02 sec
2020-04-15T08:14:24Z DEBUG [cleanup]: restoring configuration
The failure comes due to 99user.ldif still having old nisDomain definition:
[15/Apr/2020:08:14:24.710251400 +0000] - ERR - dse_read_one_file - The entry cn=schema in file /etc/dirsrv/slapd-IPA-TEST/schema/99user.ldif (lineno: 1) is invalid, error code 20 (Type or value exists) - attribute type nisDomain: Does not match the OID "1.3.6.1.4.1.1.1.1.12". Another attribute type is already using the name or OID.
[15/Apr/2020:08:14:24.712344971 +0000] - ERR - setup_internal_backends - Please edit the file to correct the reported problems and then restart the server.
So, yes, it does break existing instance.
Comment from tbordaz (@tbordaz) at 2020-04-15 12:04:30
@Firstyear in addition to the OID conflict on nisDomain, running an MMR test with M1+rfc2307compat and M2+rfc2307 shows other failures (visible with replication logging level) that prevent the schema from being pushed.
Comment from mreynolds (@mreynolds389) at 2020-04-15 15:45:00
@Firstyear since making it non-optional breaks existing deployments, I'd rather focus on making sure it is compatible with existing deployments before rolling out this feature.
But it doesn't, the whole point is that it upgrades smoothly. So far every "break" is because of source builds not cleaning their installed schema trees properly, which is not a problem that will affect any rpm releases. No one has provided evidence to the contrary of this.
William, even if I install fresh RPMs the server still fails to start because of the conflicting schema OID issue:
[15/Apr/2020:09:38:34.353886309 -0400] - ERR - dse_read_one_file - The entry cn=schema in file /usr/share/dirsrv/schema/15rfc2307bis.ldif (lineno: 1) is invalid, error code 20 (Type or value exists) - attribute type nisDomain: Does not match the OID "1.3.6.1.4.1.1.1.1.12". Another attribute type is already using the name or OID.
[15/Apr/2020:09:38:34.361220650 -0400] - ERR - setup_internal_backends - Please edit the file to correct the reported problems and then restart the server.
So right now Master branch is broken! If this is not fixed soon we will need to revert this patch because I need to do fresh upstream builds.
Comment from abbra at 2020-04-15 15:51:36
More to that. If a 389-ds instance was created before 10rfc2307compat.ldif was created to replace 10rfc2307.ldif and 10rfc2307bis.ldif, upgrade to 389-ds with 10rfc2307compat.ldif will fail too. This doesn't need FreeIPA to be present because 10rfc2307.ldif was present by default in 389-ds installations.
For example, nisMap was present in 10rfc2307.ldif and, despite the fact that nothing uses it, it appeared in 99user.ldif. Upgrading to 389-ds 1.4.3.5+ fails to start the 389-ds instance because the nisMap definition is present in 99user.ldif:
[15/Apr/2020:13:25:31.899463021 +0000] - ERR - dse_read_one_file - The entry cn=schema in file /etc/dirsrv/slapd-IPA-TEST/schema/99user.ldif (lineno: 1) is invalid, error code 20 (Type or value exists) - object class nisMap: The name does not match the OID "1.3.6.1.1.1.2.13". Another object class is already using the name or OID.
[15/Apr/2020:13:25:31.902566520 +0000] - ERR - setup_internal_backends - Please edit the file to correct the reported problems and then restart the server.
I only was able to proceed when I manually removed nisMap object class definition from the 99user.ldif.
For FreeIPA I implemented removal of attributes and objectclasses referenced in 15rfc2307bis.ldif (provided by FreeIPA) from 99user.ldif during upgrade so that we can work with the new schema proposed by @Firstyear. However, nisMap is not used (and was never used) by FreeIPA.
Comment from tbordaz (@tbordaz) at 2020-04-15 15:59:10
@abbra thanks for the upgrade test. Having nisMap in 99user.ldif is an old known bug in schema replication: it fills the consumer's 99user.ldif even if the definition in 99user.ldif is the same as the one in the consumer's schema files.
This old bug is #1081
Comment from abbra at 2020-04-15 16:47:38
@mreynolds389 the 15rfc2307bis.ldif is from FreeIPA. I have a fix that will upgrade FreeIPA part (15rfc2307bis.ldif and associated definitions in 99user.ldif) before starting actual upgrade. You can take FreeIPA packages from COPR abbra/test-rfc2307compat for the test (Fedora 32).
https://copr.fedorainfracloud.org/coprs/abbra/test-rfc2307compat/
Comment from mreynolds (@mreynolds389) at 2020-04-15 17:43:16
@Firstyear since making it non-optional breaks existing deployments, I'd rather focus on making sure it is compatible with existing deployments before rolling out this feature. But it doesn't, the whole point is that it upgrades smoothly. So far every "break" is because of source builds not cleaning their installed schema trees properly, which is not a problem that will affect any rpm releases. No one has provided evidence to the contrary of this.
William, even if I install fresh RPMs the server still fails to start because of the conflicting schema OID issue: [15/Apr/2020:09:38:34.353886309 -0400] - ERR - dse_read_one_file - The entry cn=schema in file /usr/share/dirsrv/schema/15rfc2307bis.ldif (lineno: 1) is invalid, error code 20 (Type or value exists) - attribute type nisDomain: Does not match the OID "1.3.6.1.4.1.1.1.1.12". Another attribute type is already using the name or OID. [15/Apr/2020:09:38:34.361220650 -0400] - ERR - setup_internal_backends - Please edit the file to correct the reported problems and then restart the server. So right now Master branch is broken! If this is not fixed soon we will need to revert this patch because I need to do fresh upstream builds.
Okay, master branch is not broken. Previously I removed the 389-ds-base packages, but the /usr/share/dirsrv/schema directory was NOT cleaned up after removing the package (different bug?). So I removed all the packages, removed the entire directory /usr/share/dirsrv/schema, and reinstalled 389 packages - then it works out of the box. But as others have noted schema replication breaks, so that's still a major problem.
Comment from firstyear (@Firstyear) at 2020-04-16 00:24:07
Did you ever try any upgrade of an existing FreeIPA instance to use 389-ds with rfc2307compat.ldif? Or pure 389-ds instance with 60nis.ldif enabled and then upgrading to 389-ds with rfc2307compat.ldif?
It's not my job to test and install FreeIPA.
Even single server upgrade is failing for me. If I don't have the code that attempts to replace 15rfc2307bis.ldif with a version compatible with new 389-ds' 10rfc2307compat.ldif, I get the following error on start: ... snip So, yes, it does break existing instance.
Can you list the content of the /usr/share/dirsrv/schema dir and the /etc/dirsrv/slapd-
Okay, master branch is not broken. Previously I removed the 389-ds-base packages, but the /usr/share/dirsrv/schema directory was NOT cleaned up after removing the package (different bug?). So I removed all the packages, removed the entire directory /usr/share/dirsrv/schema, and reinstalled 389 packages - then it works out of the box. But as others have noted schema replication breaks, so that's still a major problem.
When you have two servers, the upgrade happens on A, then that will prevent schema repl to B, but when you upgrade 389-ds on B, that will cause the elements to be removed from 99user.ldif on the start up as they should be duplicate. So if that's not working then something else is wrong with schema repl which this situation is just highlighting ....
Comment from firstyear (@Firstyear) at 2020-04-16 00:26:21
Anyway, if there is going to be a "revert" of this, the change is in Makefile.am
sampledata_DATA = ldap/admin/src/scripts/DSSharedLib \
...
$(srcdir)/ldap/schema/10rfc2307.ldif \
systemschema_DATA = $(srcdir)/ldap/schema/00core.ldif \
...
$(srcdir)/ldap/schema/10rfc2307compat.ldif \
Swap these two lines so that compat is optional, and 2307 is default. But of course, I will be aiming to have this become the default in the future still.
Comment from abbra at 2020-04-16 09:54:10
Did you ever try any upgrade of an existing FreeIPA instance to use 389-ds with rfc2307compat.ldif? Or pure 389-ds instance with 60nis.ldif enabled and then upgrading to 389-ds with rfc2307compat.ldif?
It's not my job to test and install FreeIPA.
There is really no need to be passive-aggressive. A bug is a bug regardless of whether you like it or not.
I tried right now with pure 389-ds setup:
60nis.ldif to the instance schema and started the instance
dnf copr enable abbra/test-rfc2307compat)
Start of the instance failed because
[16/Apr/2020:07:34:24.332840458 +0000] - ERR - dse_read_one_file - The entry cn=schema in file /etc/dirsrv/slapd-rfc2307/schema/60nis.ldif (lineno: 1) is invalid, error code 20 (Type or value exists) - attribute type nisDomain: Does not match the OID "1.3.6.1.4.1.1.1.1.12". Another attribute type is already using the name or OID.
However, 99user.ldif does not have any attribute/objectclass defined in this case.
So even for a plain 389-ds instance with optional schema provided by the 389-ds installation there is no clear upgrade path, it will fail to start. I had to copy over 60nis.ldif to make sure the instance would start. In past this was done automatically (see 60upgradeschemafiles.pl).
When you have two servers, the upgrade happens on A, then that will prevent schema repl to B, but when you upgrade 389-ds on B, that will cause the elements to be removed from 99user.ldif on the start up as they should be duplicate. So if that's not working then something else is wrong with schema repl which this situation is just highlighting ....
Correct, the issue with 99user.ldif affecting replication was known in past but nothing in the schema forced the failure so far during the system upgrade.
Perhaps 389-ds needs to learn how to do upgrade to optional schema files installed in the instance? E.g. if instance schema includes the same name as in data directory, update it from the data directory.
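A sketch of what that suggestion could look like, in Python. This only illustrates the idea; it is not existing lib389 or 389-ds code, and the function name `refresh_optional_schema` and its parameters are invented for the example. It assumes that any file in the instance schema directory with the same name as a file in the data directory is an unmodified copy that is safe to overwrite, which a real implementation would need to verify.

```python
import filecmp
import shutil
from pathlib import Path


def refresh_optional_schema(instance_schema_dir, data_dir, dry_run=False):
    """Refresh optional schema files that were copied into an instance.

    For every *.ldif in the instance schema directory that has a same-named
    file in the packaged data directory, overwrite the instance copy when the
    contents differ. Returns the list of file names that were (or, with
    dry_run=True, would be) updated.
    """
    updated = []
    data_dir = Path(data_dir)
    for inst_file in sorted(Path(instance_schema_dir).glob("*.ldif")):
        packaged = data_dir / inst_file.name
        if not packaged.is_file():
            continue  # e.g. 99user.ldif or custom schema: leave untouched
        if filecmp.cmp(packaged, inst_file, shallow=False):
            continue  # already identical to the packaged version
        if not dry_run:
            shutil.copy2(packaged, inst_file)
        updated.append(inst_file.name)
    return updated
```

With `dry_run=True` the same function could serve as a pre-upgrade report of which optional files are stale.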
Old perl-based DSMigration.pm module had support for upgrade and removal of updated attributes/objectclasses from 99user.ldif. And there was logic in 60upgradeschemafiles.pl to rewrite 99user.ldif, in addition to rewrite of optional schema files, including 60nis.ldif. It seems this logic was completely removed when new Python-based code was made default.
Comment from lkrispen (@elkris) at 2020-04-16 10:40:19
Hi, I am late in the discussion and did not follow all the detailed scenarios, but here are some comments:
1] If there are still scenarios which fail or break upgrade, and Alexander, Mark and Thierry run into this, they will maybe be able to work around it, but if we deliver this it will be a support case generator. So my suggestion is:
1.1] back this change out
1.2] define all installation and upgrade paths that need to be supported; 389 does not have to test freeipa, but they need to get a chance to fix eventual problems on their side
1.3] rethink what should be the default. If the main purpose of this change is to make openldap migration possible/easier, the default should be what is working in all deployments right now and the openldap schema extension should be the optional one
2] some general thoughts on schema management, not sure if they can and should be connected to this change.
2.1] 99user.ldif - its original purpose is to provide custom schema definitions, but with schema learning it will have a full set of the schema once a difference is detected and the schema is updated. We should have a function or functionality or script to clean this up: remove all definitions from 99user.ldif which are not different from the ones in a specific schema file. So, if this file is upgraded to a newer one, it will not conflict with the old ones still preserved in 99user.ldif
2.2] default/optional files. We have many schema files in three different locations and all the files from share and instancedir are read and processed, but many deployments only need a subset. I think in openldap you need to define which schema files should be processed; couldn't we do something similar? Provide the files in data and share and require to list which ones are used, e.g. in a 00user.ldif listing all the ones we want to use.
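For 2.1, the cleanup could be sketched in Python as below. This is only an illustration under simplifying assumptions: definitions are compared as text after collapsing whitespace and case, the helper names `normalize` and `prune_user_definitions` are made up, and the sample definitions are shortened. A real tool would probably need to compare parsed fields (and decide how to treat differences such as X-ORIGIN) rather than raw strings.

```python
import re


def normalize(definition):
    """Collapse whitespace and case so purely cosmetic differences are ignored."""
    return re.sub(r"\s+", " ", definition).strip().lower()


def prune_user_definitions(user_defs, packaged_defs):
    """Split 99user.ldif definitions into (kept, dropped).

    A definition is dropped when an equivalent one already exists in the
    packaged schema files; everything else is kept as genuinely custom.
    """
    packaged = {normalize(d) for d in packaged_defs}
    kept, dropped = [], []
    for definition in user_defs:
        if normalize(definition) in packaged:
            dropped.append(definition)
        else:
            kept.append(definition)
    return kept, dropped


# Illustrative input: one learned duplicate and one real custom attribute.
user_defs = [
    "( 1.3.6.1.1.1.2.9 NAME 'nisMap'  SUP top STRUCTURAL )",
    "( 1.3.6.1.1.1.1.300.1 NAME 'dummy' DESC 'my dummy test' )",
]
packaged_defs = ["( 1.3.6.1.1.1.2.9 NAME 'nisMap' SUP top STRUCTURAL )"]
kept, dropped = prune_user_definitions(user_defs, packaged_defs)
```

Here `kept` would hold only the 'dummy' attribute, so a later change to the packaged nisMap definition could no longer collide with a stale copy in 99user.ldif.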
Comment from mreynolds (@mreynolds389) at 2020-04-16 15:08:45
Anyway, if there is going to be a "revert" of this, the change is in Makefile.am:
sampledata_DATA = ldap/admin/src/scripts/DSSharedLib \
    ...
    $(srcdir)/ldap/schema/10rfc2307.ldif \
systemschema_DATA = $(srcdir)/ldap/schema/00core.ldif \
    ...
    $(srcdir)/ldap/schema/10rfc2307compat.ldif \
Swap these two lines so that compat is optional, and 2307 is default. But of course, I will be aiming to have this become the default in the future still.
Thanks William, I will be making this temporary change to the Makefile. I think we need to add the overall schema discussion to our upcoming virtual team gathering...
Comment from mreynolds (@mreynolds389) at 2020-04-16 16:17:01
Commit 9ede55d2 relates to this ticket
Comment from firstyear (@Firstyear) at 2020-04-17 03:38:02
Agreed. I feel like I just stepped onto a landmine ....
Comment from firstyear (@Firstyear) at 2020-05-14 06:39:58
Comment from tbordaz (@tbordaz) at 2020-07-17 10:56:57
With this PR the attached testcase is still failing (it differs from the one in #3986#comment-642326 by moving 60samba3.ldif to data to allow startup).
The schema definition 'objectCategory' is not propagated from M1 to M2 because the schema cannot be pushed due to conflicts (you may set nsslapd-errorlog-level=8192 to see the details of the conflicts).
Comment from mreynolds (@mreynolds389) at 2020-07-21 15:34:09
Issue linked to Bugzilla: Bug 1859219
Comment from firstyear (@Firstyear) at 2020-08-05 04:07:48
Commit 79d5f2cf fixes this issue
Comment from mreynolds (@mreynolds389) at 2020-08-05 14:26:17
Commit 79d5f2cf fixes this issue
Isn't there another commit that should be here as well? :-)
Comment from firstyear (@Firstyear) at 2020-08-06 01:30:36
I only put the "fixes" tag on the actual commit that adds the support, not on the second commit that does the enable/disable. I also wasn't sure if pagure could cope with two commits fixing one issue.
Anyway, the other commit is:
Commit a204115 fixes this issue
Comment from mreynolds (@mreynolds389) at 2020-08-10 18:18:08
@Firstyear We have a problem with this patch and using mixed versions of DS - replication breaks because there are conflicting OIDs for the same attribute (nisMap). We need to feature gate this, change schema replication, or revert this patch...
Comment from firstyear (@Firstyear) at 2020-08-11 01:30:43
Good thing there is a nice easy to access revert commit!
But I'm confused here. The definitions in rfc2307.ldif are the same as in 60nis.ldif? There has to be a different problem here. These were not changed at all. I think there is something else going on ...
Can I get the /etc/slapd directories of both affected servers?
Also remember that the amount of time you will run on split DS versions is small, so this shouldn't be a major problem in a true production environment either.
Comment from firstyear (@Firstyear) at 2020-08-11 02:12:55
Okay. @mreynolds389 and @tbordaz, thinking about this, I wonder if we could disable oid checking and rely on the names and aliases as the primary key of the attribute.
Today LDAP's schema is designed around the idea that the composite key (to borrow an SQL term) of oid + name + aliases is unique. But the oid has little value or use; it's the names and aliases that truly matter for an attribute's uniqueness. Could we consider disabling oid conflict checking and rely on the names/aliases as the primary key of an attribute or class?
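As a rough illustration of the difference between the two keys, here is a sketch, assuming a definition can be reduced to an (oid, names) pair where the first name is the primary name and the rest are aliases. The function `find_conflicts`, its `check_oid` switch, and the sample OIDs and names are hypothetical and are not taken from the server code.

```python
def find_conflicts(existing, incoming, check_oid=True):
    """Return the reasons why `incoming` cannot be merged into `existing`.

    existing: list of (oid, [primary_name, alias, ...]) tuples.
    incoming: a single (oid, [primary_name, alias, ...]) tuple.
    check_oid: True keeps the composite key, False keys on names only.
    """
    oid, names = incoming
    wanted = {n.lower() for n in names}
    reasons = []
    owners = 0  # how many existing definitions share a name with incoming
    for e_oid, e_names in existing:
        shares_name = bool(wanted & {n.lower() for n in e_names})
        if shares_name:
            owners += 1
            if check_oid and e_oid != oid:
                reasons.append("name already used with OID " + e_oid)
        elif check_oid and e_oid == oid:
            reasons.append("OID already used by " + e_names[0])
    if owners > 1:
        # Even with names as the key, one definition cannot claim
        # names that belong to two different existing definitions.
        reasons.append("names map to more than one existing definition")
    return reasons


# Made-up data: same name on both sides, different OID.
existing = [("9.9.9.1", ["exampleMap"])]
incoming = ("9.9.9.2", ["exampleMap", "exampleMapAlias"])

strict = find_conflicts(existing, incoming, check_oid=True)
relaxed = find_conflicts(existing, incoming, check_oid=False)
print(strict, relaxed)
```

With the composite key the incoming definition is refused because the name is already bound to another OID; keyed on names alone it would be treated as an update of the same attribute.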
I feel like this process is revealing a lot of cracks in our schema handling right now, so I'm wondering how we can make it more robust.
Comment from tbordaz (@tbordaz) at 2020-08-11 13:20:22
Are 'nis*' definitions required in rfc2307compat?
If you want to relax the OID/name checking, you may try the following tentative fix that worked for me. If we want to relax the check, it should be toggled by a new config param :(
diff --git a/ldap/servers/slapd/attrsyntax.c b/ldap/servers/slapd/attrsyntax.c
index 7de006c00..d492476a0 100644
--- a/ldap/servers/slapd/attrsyntax.c
+++ b/ldap/servers/slapd/attrsyntax.c
@@ -979,7 +979,7 @@ attr_syntax_add(struct asyntaxinfo *asip, PRUint32 schema_flags)
      */
     if (NULL != (oldas_from_name = attr_syntax_get_by_name_locking_optional(
                      asip->asi_name, !nolock, schema_flags))) {
-        if (0 == (asip->asi_flags & SLAPI_ATTR_FLAG_OVERRIDE) || (oldas_from_oid != oldas_from_name)) {
+        if (0 == (asip->asi_flags & SLAPI_ATTR_FLAG_OVERRIDE)) {
             /* failure; no override flag OR OID and name don't match */
             rc = LDAP_TYPE_OR_VALUE_EXISTS;
             goto cleanup_and_return;
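To see exactly which case the tentative fix relaxes, the condition in the hunk can be transliterated into a small truth table. This is only a sketch of the boolean logic as far as the hunk shows it: `has_override` stands for the SLAPI_ATTR_FLAG_OVERRIDE bit being set and `same_entry` for oldas_from_oid == oldas_from_name; the Python function names are made up for illustration.

```python
def rejected_before(has_override, same_entry):
    """Original check: fail without the override flag OR if the OID and
    name lookups returned different existing entries."""
    return (not has_override) or (not same_entry)


def rejected_after(has_override, same_entry):
    """Patched check: fail only when the override flag is missing."""
    return not has_override


# Collect the input combinations where the two versions disagree.
changed = [
    (has_override, same_entry)
    for has_override in (False, True)
    for same_entry in (False, True)
    if rejected_before(has_override, same_entry)
    != rejected_after(has_override, same_entry)
]
print(changed)
```

The two versions differ in a single case: the override flag is set but the name already belongs to an entry with a different OID. That is the situation the patch starts to accept, which is why gating it behind a config parameter is suggested.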
Cloned from Pagure issue: https://pagure.io/389-ds-base/issue/50933
Issue Description
rfc2307 is the original schema for posix and other related attributes. rfc2307bis was a draft proposed by a member of the openldap team that fixed a number of deficiencies in rfc2307. However, rfc2307bis is not completely forward compatible: replacing the rfc2307 definitions with it may introduce data errors or other subtle issues.
In the interest of allowing easier openldap to 389 migrations ( https://pagure.io/389-ds-base/issue/50544 ) I propose rfc2307compat, a forward compatible schema combining rfc2307 and rfc2307bis. This would allow items from both to be considered "valid" without changing the semantics of either.