Closed sssd-bot closed 12 months ago
As a workaround, setting up AD filters for users and groups to limit the number of results drastically improves the situation. SSSD appears to fetch results about computer accounts, which is unneeded as far as I can tell.
I came looking here to see if there were solutions to this problem of slow lookups from AD.
What is a suggested user or group filter for Active Directory to speed up queries?
Hi,
typically the delay is caused by groups with many users, which can be avoided by adding ignore_group_members = True to the [domain/...] section of sssd.conf. If your environment has multiple domains you might need to inherit the option to the other domains as well by adding subdomain_inherit = ignore_group_members. Please see man sssd.conf for details.
HTH
bye, Sumit
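As a rough illustration of where these two options end up, the Python sketch below renders a sssd.conf fragment with the standard configparser module. The domain name example.com and the helper name render_domain_section are placeholders and are not taken from this issue; only the two option names come from the comment above.

```python
import configparser
import io


def render_domain_section(domain: str) -> str:
    """Render a [domain/<name>] section carrying the two suggested options."""
    config = configparser.ConfigParser()
    section = f"domain/{domain}"  # hypothetical domain name, adjust to your setup
    config[section] = {
        # Skip resolving the members of every group during lookups.
        "ignore_group_members": "True",
        # Pass the option on to trusted subdomains as well.
        "subdomain_inherit": "ignore_group_members",
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_domain_section("example.com"))
```

In practice the two lines would be merged into the existing domain section rather than written to a fresh file, and SSSD would need a restart to pick them up.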
Here is a basic example: we limit the user search base to just the OU where user accounts are. That prevents SSSD from looking over computer accounts, which are also users. Frankly, SSSD should just ignore computer accounts by default.
Then we set the group search base with a filter to only include certain groups (e.g. sg-mygroup). That avoids a problem where SSSD enumerates the groups for a user, then every member of each group, and repeats until pretty much everything is enumerated. That is an incredibly expensive operation in an environment with many users and groups.
ldap_user_search_base = OU=Users*,DC=example,DC=com
ldap_group_search_base = DC=example,DC=com?subtree?(|(name=Domain Users)(name=sg-mygroup))
This configuration lowered my lookups to seconds. The Domain Users group is needed, otherwise SSSD doesn't see any users. You don't need to enable other options that break certain functionality, such as compat tree, and getent group ....
Why does sssd enumerate group members at all, if not asked for it? Given that I issue the id command, I am only interested in the groups of a specific user, so it would be sufficient to just ask AD for the groups of that user (and that works), but skip enumerating all members of those groups (that may be done later if someone asks for it, e.g. getent group somegroup).
Since every user is a member of Domain Users, asking for anyone's id will result in enumerating the whole directory.
Why does sssd enumerate group members at all, if not asked for it?
It's a flaw in POSIX. :(
You want to print the names of the groups that you're a member of. You call getgrgid(3) for each group ID; this fills out a struct group with the group name, the group ID, and the list of the group's members (gr_mem).
Unfortunately, there's no way to opt out of receiving the group members, cursing us to ten thousand years of frustrated twiddling our thumbs every time we launch bash (thanks Ubuntu).
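The same effect can be seen from Python, whose standard grp module wraps the C library call. The sketch below only wants group names, yet every entry it gets back also carries the member list in gr_mem. The helper names group_summary and group_names_for_current_user are made up for this illustration.

```python
import grp
import os
from typing import Dict, List, Optional


def group_summary(gid: int) -> Optional[Dict[str, object]]:
    """Look up one group by GID; return None if the GID has no entry."""
    try:
        entry = grp.getgrgid(gid)
    except KeyError:
        return None
    # The entry always includes gr_mem, whether or not the caller needs it.
    return {
        "name": entry.gr_name,
        "gid": entry.gr_gid,
        "member_count": len(entry.gr_mem),
    }


def group_names_for_current_user() -> List[str]:
    """Collect only the names of the current process's groups."""
    names = []
    for gid in os.getgroups():
        summary = group_summary(gid)
        if summary is not None:
            names.append(str(summary["name"]))
    return names


if __name__ == "__main__":
    print(group_names_for_current_user())
```

Only the name field is used here, but the lookup interface gives the caller no way to say so, which is the point made above.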
Dear Contributor/User,
Recognizing the importance of addressing enhancements, bugs, and issues for the SSSD project's quality and reliability, we also need to consider our long-term goals and resource constraints.
After thoughtful consideration, regrettably, we are unable to address this request at this time. To avoid any misconception, we're closing it; however, we encourage continued collaboration and contributions from anyone interested.
We apologize for any inconvenience and appreciate your understanding of our resource limitations. While you're welcome to open a new issue (or reopen this one), immediate attention may not be guaranteed due to competing priorities.
Thank you once again for sharing your feedback. We look forward to ongoing collaboration to deliver the best possible solutions, supporting in any way we can.
Best regards, André Boscatto
Cloned from Pagure issue: https://pagure.io/SSSD/sssd/issue/4062
Issue
When I attempt to run id someuser (AD user) on a CentOS host enrolled to FreeIPA, it can take anywhere from 3 minutes to 17 minutes to get a result. In the meantime the host seems to flood the master servers with LDAP queries. Removing the sssd cache files (rm -f /var/lib/sss/db/*) and restarting sssd resolves the issue temporarily. Running id again then takes a couple of seconds. The behaviour returns after a while, maybe 2 hours. I have not tried to time it more precisely. In the meantime the client has not been touched.
Running id for the same user on the FreeIPA masters works with little to no delay.
Details about the setup:
The FreeIPA domain runs on 2 servers with a trust to an AD domain. The AD domain is a decent size and has over 1000 users, over 1000 computers, and many groups. The servers are not under any load and have plenty of resources. There are no network connectivity issues.
Steps to Reproduce
id aduser
Actual behavior
Running id aduser takes many minutes to return a result
Expected behavior
Running id aduser returns a result almost instantly
Version/Release/Distribution
client:
servers:
Additional info:
/var/log/sssd/sssd_intra.company.com.log
/etc/sssd/sssd.conf
Comments
Comment from sbose at 2019-08-16 09:55:52
Hi,
according to the log snippets you've added, SSSD on the client waits for 6s on a reply from the IPA server, then it times out and treats the request as failed. While it would be expected that SSSD tries another server if one is defined in sssd.conf, it is not expected that it tries to connect to the same server that many times. So it looks like there is an issue which prevents SSSD from switching into offline mode in this case.
But in general there shouldn't be such a delay in the response of the server. Since you said that calling id for the given user on the IPA server returns pretty fast, the server should be able to answer the request from the client fast as well, even if all entries in the cache on the server must be refreshed as well.
To better understand what is going on on the server it would be good if you can add corresponding server logs. For a start I wonder if you can send the directory server access log /var/log/dirsrv/slapd-YOUR-DOMAIN/access from the same time interval as the SSSD logs from above or from another time interval where this issue occurs on a client?
bye, Sumit
Comment from rgp at 2019-08-16 12:59:53
Hi @sbose, I looked through the logs and they look normal. The client seems to query every AD/external account, both user and computer accounts, and all groups. Well over 10,000 queries. Here is a snippet; there are others, but all seem to query the trust view. I cannot include a 20-30k line log.
There is nothing special around the "disconnect" time. Just an unbind. Looks completely normal. No errors. I think the client times out internally or something. I'm not familiar with the internals of SSSD.
On the client, the id command takes minutes even on subsequent runs. Isn't SSSD supposed to cache? Here is me running it 3 times, one after the other:
Then if I remove the cache files it's suddenly instant:
If I leave it alone for about an hour and run id again, it again takes over 3 minutes.
Comment from sbose at 2019-08-16 13:46:44
Hi,
this request which times out is not an ordinary search but an extended operation. This can be identified in the access logs of the directory server by EXT oid="2.16.840.1.113730.3.8.10.4.1" instead of SRCH. Can you add some of those together with the RESULT which has the same conn and op values?
SSSD caches the data with a lifetime; after a cached entry has expired, SSSD tries to read it again from the server. If the server is not reachable or the request times out, SSSD should switch into offline mode and return the data from the cache. As I said, it looks like SSSD does not switch into offline mode here, which has to be fixed.
But what is puzzling is why SSSD cannot refresh the data. Do you really have to remove the cached data or is calling sssctl restart sssd sufficient to make the lookup fast again?
bye, Sumit
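The caching behaviour described in this comment (entries with a lifetime, refresh after expiry, fall back to cached data when the server cannot be reached) can be sketched roughly as follows. This is a toy model for illustration only, not SSSD's implementation; the class ExpiringCache, the exception BackendUnavailable, and the fetch callback are all hypothetical names.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class BackendUnavailable(Exception):
    """Raised by the fetch callback when the server is unreachable or times out."""


class ExpiringCache:
    """Toy cache: serve fresh entries, refresh expired ones, go offline on failure."""

    def __init__(self, lifetime: float, fetch: Callable[[str], str],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.lifetime = lifetime
        self.fetch = fetch
        self.clock = clock
        self.entries: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry)
        self.offline = False

    def lookup(self, key: str) -> Optional[str]:
        now = self.clock()
        cached = self.entries.get(key)
        if cached is not None and now < cached[1]:
            return cached[0]  # entry is still within its lifetime
        try:
            value = self.fetch(key)  # expired or missing: ask the server
        except BackendUnavailable:
            # Expected behaviour per the comment: switch to offline mode and
            # answer from the cache instead of retrying the same server.
            self.offline = True
            return cached[0] if cached is not None else None
        self.offline = False
        self.entries[key] = (value, now + self.lifetime)
        return value
```

In this model a single failed refresh is enough to fall back to the stale entry, which is the behaviour the comment says is expected but apparently not happening on the affected client.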
Comment from rgp at 2019-08-16 17:36:51
Putting the queries together is tricky as the client disconnects; I see over 10 connections. There is a connection from the FreeIPA server to itself that does thousands of queries. All the queries look ok. err=0 on all of them, nothing unusual.
I'm not sure if it is relevant, but sssd on the masters has:
Everything else is default.
On the client, just restarting sssd doesn't seem to make a big difference. Removing the cache seems to do the best job. I did the following experimentation:
Comment from sbose at 2019-08-20 12:06:13
Can you try to collect some of the EXT oid="2.16.840.1.113730.3.8.10.4.1" requests with the matching RESULT?
Can you add a tar ball with the full SSSD logs after restart without removing the cache?
bye, Sumit
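One way to collect the requested pairs from a large access log is a small script that matches each EXT line carrying this OID with the RESULT line that has the same conn and op values. The sketch below assumes the usual access log layout in which each line contains conn=N op=M followed by the operation keyword; the function name pair_ext_with_result and the sample lines are made up for illustration and are not taken from the logs in this issue.

```python
import re
from typing import Dict, Iterable, List, Tuple

EXTDOM_OID = "2.16.840.1.113730.3.8.10.4.1"
# Assumed line layout: "... conn=<n> op=<m> EXT ..." or "... conn=<n> op=<m> RESULT ..."
LINE_RE = re.compile(r"conn=(\d+) op=(\d+) (EXT|RESULT)\b")


def pair_ext_with_result(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (EXT line, RESULT line) pairs that share the same conn and op."""
    pending: Dict[Tuple[int, int], str] = {}
    pairs: List[Tuple[str, str]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.search(line)
        if match is None:
            continue
        key = (int(match.group(1)), int(match.group(2)))
        kind = match.group(3)
        if kind == "EXT" and f'oid="{EXTDOM_OID}"' in line:
            pending[key] = line  # remember the request until its result shows up
        elif kind == "RESULT" and key in pending:
            pairs.append((pending.pop(key), line))
    return pairs


if __name__ == "__main__":
    # Invented sample lines, only to show the matching logic.
    sample = [
        '[ts] conn=42 op=3 EXT oid="2.16.840.1.113730.3.8.10.4.1"',
        '[ts] conn=42 op=4 SRCH base="dc=example,dc=com" scope=2',
        '[ts] conn=42 op=3 RESULT err=0 tag=120 nentries=0',
    ]
    for request, result in pair_ext_with_result(sample):
        print(request)
        print(result)
```

To run it against a real log, the file at /var/log/dirsrv/slapd-YOUR-DOMAIN/access could be opened and passed to the function line by line.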
Comment from rgp at 2019-09-02 16:31:49
Hi @sbose, here is what I ran:
Here are all EXT queries from the dirserv for that duration from the client: https://paste.fedoraproject.org/paste/jw6Ft385xE~~OJ4cCFlcDw/raw
Full client log for the IPA domain: https://paste.fedoraproject.org/paste/ysZkzRkWAWuoRnuSn8sb0w/raw
You mentioned tar, do you want all sssd client logs? If yes, what debug levels do I set? Where can I upload the tar file?
Comment from rgp at 2019-09-12 12:05:16
Any suggestions?
Comment from thalman at 2020-03-13 15:52:59
Metadata Update from @thalman: