boonebgorges / buddypress-docs

GNU General Public License v3.0
106 stars 44 forks source link

Improve scalability of permissions check #618

Closed ChrisWiegman closed 1 year ago

ChrisWiegman commented 6 years ago

Improve scalability of permissions check by only retrieving relative IDs as opposed to the ID of all posts.

Currently the permissions check runs a query to retrieve all IDs to check for appropriate permissions. This allows for the ability to only return the current ID if it is in the list. On large sites this prevents resource errors that can otherwise occur on installs with 10,000+ documents.

Note I'm really not sure of where the best branch for this is and I'm fully open to moving or any other changes.

dcavins commented 6 years ago

Hi Chris-

Thanks for your PR. It's a great idea to minimize the check. I'm thinking that the fix as suggested would break the cached item$protected_doc_ids, in some cases. (If you were to do a single-doc check first, then follow it up with a non-scoped check, $protected_doc_ids would already exist as the result from that narrow check, so the second check would be wrong, I believe.) You can check this by running this code, for instance:

$bp_docs_access_query = BP_Docs_Access_Query::init( $user_id );
$protected_doc_ids    = $bp_docs_access_query->get_doc_ids($a_doc_id);
$protected_doc_ids    = $bp_docs_access_query->get_doc_ids();

I'm wondering if maybe the way to optimize this call is by changing the logic in bp_docs_allow_activity_item_visibility(). In any case, the plugin's primary author is away this week, so will weigh in on his return.

Thanks again for using and working to improve BP Docs.

boonebgorges commented 6 years ago

@ChrisWiegman Thank you for the PR!

We already have the functionality to check whether a user has access to read a specific Doc: current_user_can( 'bp_docs_read', $doc_id ). See 07d66d4. This doesn't address the more general issue of scalability for this function, but it does provide a more direct way of getting at the specific issue in this PR. Can you have a look to see whether it seems workable to you?

@dcavins A sanity check from you would be nice too :-D

Let me take a closer look here to see if we can make these checks smarter in general.

boonebgorges commented 6 years ago

624 eliminates the forbidden_fruit query for pagename queries, which is important because the "main" query for each BP page is a pagename query. So that change should eliminate the forbidden-fruit query on many pageloads.

ChrisWiegman commented 6 years ago

Thanks @boonebgorges I'll take a look at that later today or next week.

ChrisWiegman commented 6 years ago

@boonebgorges Had to go back into the issue here. We're not trying to hook into anything with Docs. We're simply uploading new docs. When that happens the plugin itself runs a query returning all docs on the site to check for appropriate permissions. We have so many docs that this breaks our servers. This PR is hoping to reduce that default load. Is there are better way to do that? I don't see where current_user_can() will apply to this one as it's purely core functionality of the plugin.

boonebgorges commented 6 years ago

@ChrisWiegman Hm - I guess my comment about current_user_can() was based on your PR. Specifically, at https://github.com/boonebgorges/buddypress-docs/pull/618/files#diff-a3a6add1690a16cb3830892d6e44306cL559 you suggest that we don't need to do a permission check that calls up every single Doc on the server. This seems correct to me; I was suggesting in https://github.com/boonebgorges/buddypress-docs/commit/07d66d40e7dc53cce58c397c9b178147e9af0f89 that we use current_user_can() as a replacement for the $bp_docs_access_query check that you modified in your PR (since our current_user_can() check does not require the BP_Docs_Access_Query). So, if your PR addresses the issue, mine should address it as well, though in a different manner. Does that seem correct to you?

ChrisWiegman commented 6 years ago

That does sound correct and I'll update the PR. (apparently I'm looking at too many things this morning).

boonebgorges commented 6 years ago

(apparently I'm looking at too many things this morning).

ha ha, no worries. Please do look at the logic with a skeptical eye - my logic seems right, but I want to be certain that it actually solves the problem as it exists in your environment :-D

dcavins commented 6 years ago

Hi all-

Yes, I agree that switching to the current_user_can() check on each doc makes sense.

This is sort of a theme for me lately, where I've added big-object caching some time ago in an effort to avoid lookups, which works great when you have more than a pageload's worth of docs, but not a huge number. Once you have a zillion, then that cache object is too unwieldy. This new check will potentially require a lookup per doc, but the underlying function (bp_docs_get_doc_settings()) uses get_post_meta(), so I'm betting that, most of the time, WP's built-in post meta caching will field the request.

I think this will be a better answer for most sites. Thanks @boonebgorges and @ChrisWiegman!

ChrisWiegman commented 5 years ago

As I've left the org that owns the fork for this and won't be working on the product I'm going to close this PR. Hopefully someone will be able to pick up where I had left off.

dcavins commented 5 years ago

I think we should merge 07d66d4, then close this issue. :)

ChrisWiegman commented 5 years ago

You won't find an objection from me. As I no longer have the ability to test the code however I will defer to the judgement of y'all as to whether it is ready.

boonebgorges commented 5 years ago

@dcavins If you're happy, then merge away :-D