owncloud / ocis

:atom_symbol: ownCloud Infinite Scale Stack
https://doc.owncloud.com/ocis/next/
Apache License 2.0

[ocis] Improve user listing speed in `Admin Settings` > `Users` #8938

Open tbsbdr opened 5 months ago

tbsbdr commented 5 months ago

Description

Relates to https://github.com/owncloud/enterprise/issues/6597 and https://github.com/owncloud/web/issues/10821

User stories

- As an admin I want user management that responds quickly and reliably so that I have control over all user accounts.
- As an admin I want to select all pupils to increase their quota.
- As an admin I want to select all pupils from the senior classes to put them into the group "Croatia Field Trip".
- As an admin I want to feel that the user management is reliable so that I feel in control of who can access the instance.

Steps to reproduce

  1. Log in as admin on an instance with ~500 users, e.g. on INT 0387 as user INT-o.dewald (login details: https://github.com/owncloud/enterprise/issues/6597)
  2. Go to Admin Settings > Users
  3. Filter, e.g. for the role Schüler/-in, or search for the letter e
  4. Very slow response: it takes approx. 15 seconds until a result is shown

Expected behavior

Results should show up quickly, e.g. in under 3 seconds

Actual behavior

Very slow response: it takes approx. 15 seconds until a result is shown

Solution scope for optimization

rhafer commented 4 months ago

The main reason filtering by role assignment is slow is that the settings service currently provides no way to list all assignments for a specific role. The only query that is possible is to list the assignments for a specific user ID.

So in order to get all users with a specific role assigned we currently need to:

So the first thing to speed things up would be to have a performant "give me all userids, which have role XYZ assigned" query for the settings service.
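To make the difference concrete, here is a minimal Go sketch of the two lookup shapes. It is not the settings service API: `Store`, `ListAssignmentsForUser`, `UsersWithRoleSlow` and `UsersWithRoleFast` are hypothetical names, and the in-memory maps only stand in for whatever backend holds the assignments. The slow variant issues one per-user query for every user in the system, while the fast variant answers the "all user IDs with role XYZ" question in a single lookup.

```go
package main

import (
	"fmt"
	"sort"
)

// Assignment links a user to a role. All names in this file are hypothetical.
type Assignment struct {
	UserID string
	RoleID string
}

// Store is an in-memory stand-in for the settings service backend.
type Store struct {
	byUser map[string][]Assignment // models the per-user query that exists today
	byRole map[string][]string     // models the requested role -> user IDs query
	Calls  int                     // number of per-user lookups, for illustration
}

func NewStore() *Store {
	return &Store{byUser: map[string][]Assignment{}, byRole: map[string][]string{}}
}

// Assign records the assignment under both lookup keys.
func (s *Store) Assign(userID, roleID string) {
	s.byUser[userID] = append(s.byUser[userID], Assignment{UserID: userID, RoleID: roleID})
	s.byRole[roleID] = append(s.byRole[roleID], userID)
}

// ListAssignmentsForUser models the only query available today.
func (s *Store) ListAssignmentsForUser(userID string) []Assignment {
	s.Calls++
	return s.byUser[userID]
}

// UsersWithRoleSlow needs one lookup per user in the system.
func UsersWithRoleSlow(s *Store, allUsers []string, roleID string) []string {
	var out []string
	for _, u := range allUsers {
		for _, a := range s.ListAssignmentsForUser(u) {
			if a.RoleID == roleID {
				out = append(out, u)
				break
			}
		}
	}
	sort.Strings(out)
	return out
}

// UsersWithRoleFast answers the same question with a single lookup.
func UsersWithRoleFast(s *Store, roleID string) []string {
	out := append([]string(nil), s.byRole[roleID]...)
	sort.Strings(out)
	return out
}

func main() {
	s := NewStore()
	users := []string{"alice", "bob", "carol"}
	s.Assign("alice", "pupil")
	s.Assign("bob", "teacher")
	s.Assign("carol", "pupil")

	fmt.Println(UsersWithRoleSlow(s, users, "pupil"), "per-user lookups:", s.Calls)
	fmt.Println(UsersWithRoleFast(s, "pupil"))
}
```

In a real deployment each per-user lookup is a request against the storage backend rather than a map access, so the cost of the slow variant grows with the total number of users, not with the number of matches.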

rhafer commented 3 months ago

There is a PoC PR (#9363), which basically adds a cache (it's a bit of a hybrid of a cache and an index) to the settings service for quick enumeration of assignments by role. It improves the lookup quite a bit. Response times for the filter-by-role request are now <1s (down from ~15s).

However, it also comes with considerable downsides. E.g. creating assignments will become slow as the number of users increases. For details see the PR. I have some doubts that the approach of maintaining the index via a JSON (or MsgPack) blob on a CS3 storage is feasible for larger systems.

I am moving this into "blocked" for now as we need to figure out how to move forward, and probably create follow-up tickets.

tbsbdr commented 3 months ago

dragotin commented 3 months ago

A database is obviously not a no-go, but there are a few important things to consider, which I will try to describe here.

The high level pillars of the success of Infinite Scale are

A solution that solves the described problems must not destroy these pillars.

It is understood that this limits the amount of possible solutions. So it is important to understand the areas where we can make compromises, such as:

wkloucek commented 3 months ago

@rhafer just a spontaneous idea: for larger installations we require an external LDAP anyways, couldn't we persist role assignments in there?

rhafer commented 2 months ago

> just a spontaneous idea: for larger installations we require an external LDAP anyways, couldn't we persist role assignments in there?

Certainly. But ...

Depending on the use case we currently only require read-only access to that LDAP server. An LDAP implementation for the assignments would obviously need write access. It would also need schema changes, which is something that the folks operating the LDAP servers are usually very hesitant to do. So the external LDAP server would not really be "external" any more but would become an integral part of the setup. That being said, it is of course possible to create an LDAP backend for the assignments. And with the proper configuration it would likely perform just fine for our requirements.

Also, for larger deployments we'll need an SQL DB anyways (keycloak). And most people on the team would probably find it a lot easier to create and maintain an SQL backend for the assignments than an LDAP one (for whatever reason :laughing: )

Also please note that we have similar issues with most other services that are using the CS3 metadata storage to store their data, e.g. the sharing services. We have all kinds of clever improvements and workarounds for the sharing services to get them to an acceptable performance. But the underlying problem isn't going away with that. We're trying to synchronize access to shared data (be it shares or role assignments) by accessing a (relatively slow) shared filesystem via an API (the CS3 storage provider) that is inappropriate for that use case (IMO).

wkloucek commented 2 months ago

> An LDAP implementation for the assignments would obviously need write access

which is already required for these optional features:

For a project I'm working on, we have a read-only LDAP tree for regular users and a read-write tree for the two features I mentioned above.

Actually, this very same project might already have the role information in the LDAP, but since we are using OIDC role assignment, we're only able to list roles after a user has logged in. An LDAP-based role mechanism would therefore keep the user-related information in one place.

micbar commented 2 months ago

Summary

We spent 4 PD (person days) on research. There are two main directions we can take:

Home Grown

Well established DB Backend

Embedded

Cluster DB

micbar commented 2 months ago

Closing here.

tbsbdr commented 1 month ago

Let's decide how to move on with this topic.

We still have the issue that user management on instances with more than 1,000 users is not usable - we won't win the admin's heart❤️ with the current behaviour.

I know that this topic is controversial, so I would like to ask you @dragotin to make a decision, considering input from @micbar @rhafer @wkloucek et al.

Can you take care of a decision, @dragotin?

rhafer commented 1 month ago

One thing I forgot to add here: I also experimented with storing the assignments using the nats service. This would almost certainly improve the response times.

But this basically suffers from the same consistency issues as https://github.com/owncloud/ocis/pull/9363 (which tries to maintain an index for the Role-to-UserIDs lookup).
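As an illustration of how such an index can drift, the following Go sketch shows one way an inconsistency can arise. It assumes the assignment record and the role index are two separate writes with no transaction around them, which may or may not match the actual failure modes discussed in the PR. `Store`, `FailIndexWrite` and `MissingFromIndex` are hypothetical names, and for brevity each user has at most one role.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

var errIndexWrite = errors.New("index write failed")

// Store is a hypothetical backend with a source of truth and a derived index.
type Store struct {
	assignments    map[string]string   // userID -> roleID (source of truth)
	index          map[string][]string // roleID -> userIDs (derived)
	FailIndexWrite bool                // simulates a crash or error before the second write
}

func NewStore() *Store {
	return &Store{assignments: map[string]string{}, index: map[string][]string{}}
}

// Assign performs two independent writes. If the second one fails,
// the assignment exists but the role index does not know about it.
func (s *Store) Assign(userID, roleID string) error {
	s.assignments[userID] = roleID // write 1: the assignment itself
	if s.FailIndexWrite {
		return errIndexWrite // write 2 never happens
	}
	s.index[roleID] = append(s.index[roleID], userID) // write 2: the index
	return nil
}

// MissingFromIndex lists users whose assignment is not reflected in the index.
func (s *Store) MissingFromIndex() []string {
	var missing []string
	for userID, roleID := range s.assignments {
		found := false
		for _, u := range s.index[roleID] {
			if u == userID {
				found = true
				break
			}
		}
		if !found {
			missing = append(missing, userID)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	s := NewStore()
	_ = s.Assign("alice", "pupil")

	s.FailIndexWrite = true
	if err := s.Assign("bob", "pupil"); err != nil {
		fmt.Println("assign bob:", err)
	}

	fmt.Println("index for pupil:", s.index["pupil"])
	fmt.Println("missing from index:", s.MissingFromIndex())
}
```

A filter-by-role request served from the index would silently omit the affected user until the index is rebuilt or repaired, which is the kind of problem a backend with atomic multi-key updates avoids.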