In a number of places where the H5P WP plugin fetches a list of content, it needs information about the WP users who created the content. In these cases, it performs a JOIN against WP's users table {$wpdb->users}.
Joining against WP's users table is problematic, however. Some very large WordPress installations, especially with the Multisite feature enabled, store their data across multiple databases. In a sample setup, there will be 257 (16^2 + 1) databases, where "global" tables such as wp_blogs, wp_users, and wp_usermeta will be in the +1 "global" database, while sites will be assigned one of the other 256 databases based on a base-16 md5 hash of the site's blog_id. See https://github.com/rrennick/shardb, https://wordpress.org/plugins/hyperdb/. As such, the H5P tables on (say) the site with blog_id=44 are not in the same database as wp_users, meaning that the JOIN will always fail. This means that H5P content lists never load correctly.
I've looked through the specific JOINs in the plugin and it appears that the sole purpose is to fetch user data. You're never including a WHERE subclause that references wp_users, which means that the JOIN is not strictly necessary. Instead, it's possible to do a second query, after the primary content query, to fetch user data. This avoids the multi-database JOIN problem. It also has secondary performance benefits, since (a) it's faster not to JOIN, and (b) standalone user queries, such as the get_users() call in this PR, are cached by WP.
My PR intentionally tries to minimize changes, which is why I left much of the structure of H5PContentQuery (looping through $valid_fields etc) in place. I just had to make minor exceptions in these loops so that user_name is skipped when building the SQL query.
I'm happy to work through any concerns with the proposed approach, if you decide that this is worth addressing in your plugin. (For the record, without this PR, the plugin is largely unusable on the site of my client, which is a large public university.)
In a number of places where the H5P WP plugin fetches a list of content, it needs information about the WP users who created the content. In these cases, it performs a
JOIN
against WP's users table{$wpdb->users}
.Joining against WP's users table is problematic, however. Some very large WordPress installations, especially with the Multisite feature enabled, store their data across multiple databases. In a sample setup, there will be 257 (16^2 + 1) databases, where "global" tables such as
wp_blogs
,wp_users
, andwp_usermeta
will be in the +1 "global" database, while sites will be assigned one of the other 256 databases based on a base-16 md5 hash of the site'sblog_id
. See https://github.com/rrennick/shardb, https://wordpress.org/plugins/hyperdb/. As such, the H5P tables on (say) the site withblog_id=44
are not in the same database aswp_users
, meaning that theJOIN
will always fail. This means that H5P content lists never load correctly.I've looked through the specific JOINs in the plugin and it appears that the sole purpose is to fetch user data. You're never including a
WHERE
subclause that referenceswp_users
, which means that theJOIN
is not strictly necessary. Instead, it's possible to do a second query, after the primary content query, to fetch user data. This avoids the multi-databaseJOIN
problem. It also has secondary performance benefits, since (a) it's faster not toJOIN
, and (b) standalone user queries, such as theget_users()
call in this PR, are cached by WP.My PR intentionally tries to minimize changes, which is why I left much of the structure of
H5PContentQuery
(looping through$valid_fields
etc) in place. I just had to make minor exceptions in these loops so thatuser_name
is skipped when building the SQL query.I'm happy to work through any concerns with the proposed approach, if you decide that this is worth addressing in your plugin. (For the record, without this PR, the plugin is largely unusable on the site of my client, which is a large public university.)