389ds / 389-ds-base

The enterprise-class Open Source LDAP server for Linux
https://www.port389.org/

Allow installation of instances with completely different schema #3707

Open 389-ds-bot opened 4 years ago

389-ds-bot commented 4 years ago

Cloned from Pagure issue: https://pagure.io/389-ds-base/issue/50652


For Global Catalog support in FreeIPA we need to create an instance that uses the LDAP schema from Active Directory. The AD schema clashes with several attributes and classes defined in the normal 389-ds schema. The clash is on multiple levels:

Ideally, we would like to have an instance where the schema is fully provided as a part of FreeIPA Global Catalog work. Such schema would include minimal core required for 389-ds to work itself (@mreynolds389 promised to find out what constitutes this minimal set).

From my investigation of the 389-ds-base code, it looks like if we added a config option that removes the hardcoded use of SYSTEMSCHEMADIR in init_schema_dse_ext(), we could reuse the existing per-instance schema directory.

This would need to be complemented by another flag that would disable copying schema files in lib389.instance.setup.SetupDS._install_ds:

        _ds_shutil_copytree(os.path.join(slapd['sysconf_dir'], 'dirsrv/schema'), slapd['schema_dir'])

With these two we would be able to have a completely independent schema for this instance.
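The installer side of this proposal can be sketched as follows. This is a minimal illustration only: `install_schema` and its `copy_system_schema` flag are hypothetical names, not existing lib389 options, and `shutil.copytree` stands in for the `_ds_shutil_copytree` helper quoted above.

```python
import os
import shutil
import tempfile


def install_schema(sysconf_dir, schema_dir, copy_system_schema=True):
    """Populate the per-instance schema directory.

    copy_system_schema is a hypothetical flag, not an existing lib389 option.
    """
    if copy_system_schema:
        # Default behaviour: copy the packaged schema files into the instance.
        shutil.copytree(os.path.join(sysconf_dir, 'dirsrv/schema'), schema_dir)
    else:
        # Independent schema: create an empty directory for the caller to fill.
        os.makedirs(schema_dir, exist_ok=True)
    return sorted(os.listdir(schema_dir))
```

With the flag disabled, the instance starts with an empty schema directory, so whatever provides the alternative schema (here, the FreeIPA Global Catalog work) would have to supply every file, including the minimal core.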

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2019-10-16 19:44:53

Metadata Update from @mreynolds389:

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2019-10-16 19:46:31

Thanks for the initial code analysis @abbra !

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2019-10-16 19:46:31

Metadata Update from @mreynolds389:

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2019-10-16 21:51:20

Metadata Update from @mreynolds389:

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2019-10-16 22:09:27

There is quite a bit of core schema for cn=config that we need though so we can't just ignore what's in the system schema dir. It also means importantly that if you ignore the system schema dir you will miss updates to these core schemas in your instances when package updates occur. This matters for containers and instances where we don't have rpm post install trigger schema updates etc.

I think you really need to think about a scriptless/stateless upgrade scheme here and how you will account for this so that system core schemas continue to be updated in 389, without creating extra complexity or tooling that is needed.

It's likely your "flag" to ignore system schema dir is incorrect, and actually should be a flag that says to only include core schemas from the system dir instead.

I also think you need to think about how your "gc" will actually manage its own schema upgrades in a clean and sustainable manner, as that will probably shape how you implement this kind of feature here.

Can you please submit a more complete design document to the 389 wiki in this case detailing this?

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2019-10-16 22:17:01

As a first example of a better solution that is much more sustainable, this change should actually split schema to:

Then we have a config that is a schema-to-use flag, defaulting to 389-sup in libglobs.c

389 then at startup always reads the "required" directory. Based on the schema-to-use flag we then process either the 389-sup directory, gc-ad, or neither.

This gives us:

This is how I would rather see this implemented.
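The directory selection described in this comment can be sketched roughly as below. The directory names come from the comment; the function name, the flag values as Python strings, and the `"none"` value for the "neither" case are assumptions for illustration, and the real logic would live in the C server code rather than in Python.

```python
import os

# Hypothetical layout and flag values, named after the comment above.
REQUIRED_DIR = "required"
SUPPLEMENTARY_DIRS = {"389-sup": "389-sup", "gc-ad": "gc-ad", "none": None}


def schema_dirs_to_load(system_schema_root, schema_to_use="389-sup"):
    """Return the schema directories to process at startup, in load order."""
    if schema_to_use not in SUPPLEMENTARY_DIRS:
        raise ValueError(f"unknown schema-to-use value: {schema_to_use!r}")
    # The required directory is always read first.
    dirs = [os.path.join(system_schema_root, REQUIRED_DIR)]
    supplementary = SUPPLEMENTARY_DIRS[schema_to_use]
    if supplementary is not None:
        dirs.append(os.path.join(system_schema_root, supplementary))
    return dirs
```

Because the required directory is read unconditionally, package updates to the core schema would still reach every instance regardless of which supplementary set it selects.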

389-ds-bot commented 4 years ago

Comment from abbra at 2019-10-16 23:07:12

@Firstyear I was going to propose the same after reading your previous comment. Yes, it would be enough to have ability to switch off the supplementary part which is where most of conflicts are.

There are conflicts in the required part too, but I'm going to work around them in the schema translator. We can't go too far without that, as things like 'cn' are marked as single-valued in the AD schema, but for global catalog support in FreeIPA we can certainly remove this requirement. Or we can handle the required attributes of the 'top' class differently via a supplemental class, as the global catalog is read-only and there will be only one well-defined way to translate from IPA content to GC content.

As for the schema, currently we produce schema at build time based on official AD schema files from Microsoft by using a converter script that doesn't depend on FreeIPA itself. Once the script is stable, we can provide it to 389-ds and you can have the same schema files included as well.

389-ds-bot commented 4 years ago

Comment from abbra at 2019-10-16 23:10:22

Note that in either case we are talking about the directory list modification in init_schema_dse_ext(), the amount of modifications is not that big there for both approaches.

I think it is a bit more involved in finding out what are the required schema elements for 389-ds itself.

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2019-10-16 23:14:41

I would expect the "required" parts shouldn't be tooooo bad because it's mostly in cn=config, and so long as top/cn work there, then we are probably okay.

We also don't have the same schema as AD remember - there is no concept of marking cn as single value unique, you'd do that with attr uniqueness at the backend level, so this should mean that those changes won't have a big impact on cn=config anyway.

Worth keeping in mind you will not be able to use replication then to get IPA -> AD/GC content here because schema replication would kick in and cause issues/conflicts. You'll need some other mechanism. Also worth considering you won't be able to replicate the GC; you would rely on IPA repl, and then each GC would have to be updated from its related IPA db instead to ensure some level of consistency (i.e. you don't want the GC to have a conflict that's not in IPA and causes divergence).

389-ds-bot commented 4 years ago

Comment from abbra at 2019-10-16 23:24:06

Replication-wise, the plan was always to not create another topology and instead use local means to feed the GC instance off the primary instance on the same host. Since GC is read-only, this makes it possible for us to control all aspects of the transformation and schema. We really only need to consider external clients accessing GC for reads and their behavior with regard to expected LDAP controls etc., not what they do for writes, since by definition nobody writes to GC other than 'Active Directory' itself.

'top' difference in AD is that it has nTSecurityIdentifier as a mandatory attribute (and few more unique ones). We can handle this via a supplemental class that is always added by the transformation code that feeds the data into GC.

Right now I'm dealing with the fact that as it is the whole AD schema is not loadable into 389-ds, so until this ticket is fixed, I need to find a subset of AD schema to translate successfully and start creating transformation routines to actually test the sync part.

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2019-10-16 23:31:38

That's fine, that won't affect us then using top if you add a nttop class for all your extra nt attrs.

389-ds-bot commented 4 years ago

Comment from lkrispen (@elkris) at 2019-10-17 09:08:03

We need to ensure that the solution not only works for read only replicas.

The idea to split the schema into required and supplemental sounds good, but right now in a replicated topology after any change all of it would get mixed up in 99user.ldif after "schema learning", so we should also work on this. There have been different suggestions to improve schema handling, starting with 496 and 49069, 49418, 49420, and some other schema-related tickets.

389-ds-bot commented 4 years ago

Comment from tbordaz (@tbordaz) at 2019-10-17 09:57:10

I also like the idea of splitting the schema into core/required and supplemental. Now I wonder if some of the issues could be addressed with 99user.ldif.

99user.ldif can overwrite existing standard (/share/dirsrv/schema) definitions. I think it should address the problem of a new definition having a different syntax or matching rules. A definition with the same OID but a different name could be included in the list of NAME aliases.

I think that leaves the problem of different names with the same OID, and of attributes being single-valued while there is existing data (config or DB) with multiple values.

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2019-10-17 22:27:59

Here is the core schema the server needs, and the conflicts....

Core Schema Files

Attribute Conflicts

OID Conflicts

For the attribute conflicts we can move them to a new file and put them in supplemental category as "streetAddress" and "name" are not used by the core server, but the OID conflicts are more of an issue. The OIDs we use are used by other LDAP vendors and are documented on multiple "non-redhat/389" sites. We could assign new OIDs for these attributes, but it could potentially break clients that for some reason look at the schema OIDs. "ref" is the one that scares me as I know that is more commonly used than the others.

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2019-10-17 23:47:54

We need to ensure that the solution not only works for read only replicas. The idea to split the schema into required and supplemental sounds good, but right now in a replicated topology after any change all of it would get mixed up in 99user.ldif after "schema learning", so we should also work on this. There have been different suggestions to improve schema handling, starting with 496 and 49069, 49418, 49420, and some other schema-related tickets.

It may not just be read only-s, but I can certainly see risks in if admins were to have two replicas and configure different schema options on them. One would have to learn from the other, but it would at least keep learning on upgrades. Worst case you configure conflicting schemas on them.

Perhaps when we go to add this configuration option, we should put in something like "nsslapd-unsafe-use-alternate-schema".

IMO the "defaults" chosen here must work for us in 389-ds primarily and our deployments. It's only on configuration that the schema could be minimised or replaced to the AD style.

389-ds-bot commented 4 years ago

Comment from tbordaz (@tbordaz) at 2019-10-18 16:49:06

@mreynolds389 ,

wouldn't it be an option to add (overwrite) streetAddress and 'name' in 99user.ldif? For the OID conflicts, would it be possible to add them to 99user.ldif with a non-numeric OID rather than a numeric one? AFAIK the GC instance will not replicate, so it is only a local definition.

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2019-10-21 02:03:41

I think we've needed a way to over-load schema from 99user.ldif for a while, so that would help here I think.

I.e. instead of rejecting the duplicate OID, we take the "latest" version from the .ldif, meaning 99user.ldif always overrides all else.

Saying that this could have un-intended consequences .... but it would solve some of @abbra's needs, and it would solve the rfc2307/rfc2307bis issues we've had.
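A rough sketch of that "latest definition wins" rule, assuming definitions are keyed by their numeric OID and files are applied in sorted filename order. The function name and input shape are made up for illustration and do not reflect how the server parses schema.

```python
import re

# Matches the leading numeric OID of a schema definition such as
# "( 1.2.3 NAME 'example' ... )".
_OID_RE = re.compile(r"^\(\s*([0-9]+(?:\.[0-9]+)*)\s")


def merge_attribute_types(files):
    """Merge attribute definitions from (filename, [definition, ...]) pairs.

    Files are processed in sorted filename order, so 99user.ldif comes last
    and its definition replaces any earlier one with the same OID.
    """
    merged = {}
    for _name, definitions in sorted(files, key=lambda item: item[0]):
        for definition in definitions:
            match = _OID_RE.match(definition.strip())
            if match is None:
                raise ValueError(f"no numeric OID in: {definition!r}")
            merged[match.group(1)] = definition
    return merged
```

Keeping one entry per OID sidesteps duplicate keys in the lookup table, but anything that still relies on the replaced definition would break silently, which is the kind of unintended consequence mentioned above.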

389-ds-bot commented 4 years ago

Comment from tbordaz (@tbordaz) at 2019-10-21 11:25:05

The attached patch should resolve the problem of new attribute (definition and name) using an already used OID, allowing 99user.ldif to overwrite the existing definition 0001-Ticket-50652-Allow-installation-of-instances-with-co.patch

389-ds-bot commented 4 years ago

Comment from tbordaz (@tbordaz) at 2019-10-21 12:03:15

Error in the previous patch 0001-Ticket-50652-Allow-installation-of-instances-with-co.patch

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2019-10-22 00:26:30

@tbordaz Would this allow any of the ldifs to overwrite a former content? For example something in 10example.ldif to override 00core.ldif? This is the behaviour I want to allow people to be able to over-ride with rfc2307bis for example.

389-ds-bot commented 4 years ago

Comment from tbordaz (@tbordaz) at 2019-10-22 10:48:43

The ordered list of files (both from /share and ) is loaded into a single cn=schema entry, which is then parsed. During parsing, the overwrite flag for attributetypes (SLAPI_ATTR_FLAG_OVERRIDE) should force the new attributetypes value, and this is what this "tentative" patch was doing. There are multiple corner cases, and in this specific case (reusing an OID for a different name) the flag was not enforced. Other corner cases could exist. I have not seen such a flag for objectclasses, so conflicting objectclasses are possibly not forced.

In short, if rfc2307bis overwrites attributetypes definitions it should work. For objectclasses I have doubts.

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2019-10-23 01:45:27

I think objectclasses will possibly be a problem for @abbra too, so we should consider how to approach that in addition to this patch. Is it possible to have this patch as a PR?

389-ds-bot commented 4 years ago

Comment from abbra at 2019-10-23 08:19:02

I now have an installation that loads a modified AD schema into a 389-ds instance. 389-ds is patched with @tbordaz's patch (and a few other temporary patches to aid with logging around schema loading). It is available in my copr abbra/gc-wip.

I had to tune my schema converter quite a bit so that we no longer conflict at the object class level between the 389-ds and AD schemas. This means, for example, that a bunch of AD classes got renamed and assigned new OIDs from the FreeIPA space. So far, I had to re-assign seven classes and filter 32 additional ones.

The idea is to ensure all original object classes are present in the objects that will be created in GC (GC is read-only for all consumers, so we have full control over what goes in), in addition to the real object classes that add required attributes from the original AD object classes. For example, 'top' in AD has way more attributes than in 389-ds, so the converter automatically renames it to 'ad-top', and the latter will be added to objects along with 'top'. As long as the data translator script handles the addition of 'ad-top' to the objects, we should be OK.
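The 'ad-top' step of the data translator could look roughly like the sketch below. The function name, the dict-of-lists entry shape, and the `companions` mapping are assumptions for illustration; only the 'top' and 'ad-top' pairing comes from the description above.

```python
def add_companion_classes(entry, companions=None):
    """Return a copy of entry whose objectClass list includes companion classes.

    entry maps attribute names to lists of values; companions maps an
    existing class name to the renamed AD class that must accompany it.
    """
    if companions is None:
        companions = {"top": "ad-top"}  # the only pairing named in the thread
    classes = list(entry.get("objectClass", []))
    present = {c.lower() for c in classes}
    for base, companion in companions.items():
        # Object class names are compared case-insensitively.
        if base.lower() in present and companion.lower() not in present:
            classes.append(companion)
            present.add(companion.lower())
    translated = dict(entry)
    translated["objectClass"] = classes
    return translated
```

The function is idempotent, so running the translator twice over the same entry does not add 'ad-top' a second time.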

@tbordaz, please, submit the patch as a pull request, I think it solves at least part of the problem we have and is worth adding it.

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2019-10-25 01:30:58

Look, freeipa isn't any of my business but the amount of OID and class changing does not seem like something that is good as a solution here. I think we need to allow objectClass over-rides, not just attribute ones to resolve this ....

And I already asked @tbordaz to make this a PR ....

389-ds-bot commented 4 years ago

Comment from tbordaz (@tbordaz) at 2019-11-07 15:17:37

Preparing a PR for #3707#comment-606068, the patch can create trouble (for example OID hashtable containing duplicates) and moreover @abbra removed (filtered) the problematic definitions. At the end of the day, AD schema can be loaded on DS on master branch. So there is no need for this patch anymore.

Is there any other pending patch or issues with this ticket ?

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2019-11-08 00:25:34

What's the PR in progress? Still 50652?

389-ds-bot commented 4 years ago

Comment from tbordaz (@tbordaz) at 2019-11-08 09:09:31

@Firstyear, the patch allowed overriding conflicting attribute definitions, but it was more of a hack to accelerate GC testing. One concern with the patch is that the OID hashtable would contain duplicates, and there was no guarantee that the overriding definition would come first during lookup. It was also difficult to anticipate other side effects of relaxing the schema override.

In parallel @abbra removed the problematic definitions and GC does no longer need this patch.

So I will not create a PR for the patch (that is abandoned). The question I was asking is if this ticket is still valid and expects a fix from 389-ds.

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2019-11-11 01:18:08

Okay, I think that's up to @abbra to answer then .... but don't we need to split up and clean our schema into core, ad and 389?

389-ds-bot commented 4 years ago

Comment from abbra at 2019-11-11 07:49:35

I think it would be good to split the schema into separate parts, indeed.

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2020-03-04 16:19:01

Metadata Update from @mreynolds389:

389-ds-bot commented 4 years ago

Comment from tbordaz (@tbordaz) at 2020-03-05 12:59:09

@Firstyear should we consider this ticket as a duplicate of #3986 ?

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2020-03-06 03:29:29

No I don't think so - I think that this is about running a totally different core (AD vs 389 ds schema) rather than the rfc2307 issue (having to make concessions inside of the 389 schema only). So I think this remains valid and different.

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2020-04-01 17:20:44

Metadata Update from @mreynolds389: