boonebgorges / buddypress-group-email-subscription

Fine-grained email subscription for activity in BuddyPress groups
36 stars 32 forks source link

Async digest routine gets stuck if user digest send fails #145

Closed boonebgorges closed 6 years ago

boonebgorges commented 6 years ago

I've got the async stuff running on a production site and have hit a problem.

The async routine works like this:

a. During each batch, send digests for as many users as possible before running out of memory/time b. This is done by querying, one at a time, for get_users_with_pending_digest( $type, $timestamp ). $timestamp ensures that we are only looking for items that were posted before the digest routine began. c. When a user's digest is processed, the processed items are deleted from the queued_items table. This way, the next time get_users_with_pending_digest() is run, it won't find the user, and will instead find the next one with pending items.

If queued-item deletion fails for some reason in step c, then the next time the system goes back to fetch a pending user (step b) it will find the same user again, and again, and again, and again, and again, and you get the picture.

Queued items are deleted from the queued_items table when they're included in a successful digest. There are several reasons why this might not happen; see bpges_generate_digest():

  1. The activity query doesn't find an item corrseponding to the activity ID. This, in turn, can happen for a few reasons: i. Bugs. See #144, which has been fixed, but there could be others like it. ii. Filters on BP queries (inside bp_activity_get() etc) iii. The activity item has been deleted since the item was added to the digest queue.
  2. The activity item was deemed "not valid for digest". See bp_ges_activity_is_valid_for_digest(). By default, this happens only when the activity item is too old, but it can also happen by filter.

The specifics of these situations are different, but BPGES must be more failure-tolerant in each of them. There are a couple strategies.

// get_users_with_pending_digest()
// can't do subquery or join because of situations where global table is on different host
$meta_key = "bpges_digest_processed_{$type}_{$timestamp}";
$received_user_ids_raw = $wpdb->get_col( $wpdb->prepare( "SELECT user_id FROM {$wpdb->usermeta} WHERE meta_key = %s" ) );
$received_user_ids = implode( ',', array_map( 'intval', $received_user_ids_raw ) );
$user_ids = $wpdb->get_col( $wpdb->prepare( "SELECT DISTINCT user_id FROM {$table_name} WHERE type = %s AND date_recorded < %s AND user_id NOT IN ({$received_user_ids}) LIMIT %d", $type, $timestamp, $count ) );

// handle_digest_queue()
$user_id = BPGES_Queued_Item_Query::get_user_with_pending_digest( $type, $timestamp );
$meta_key = "bpges_digest_processed_{$type}_{$timestamp}";
bp_update_user_meta( $user_id, $meta_key, 1 );

// Then at the end of the run I could clear out all the usermeta.

I think that this last solution is the most viable one, though I'd welcome feedback, especially from @modelm and others who deal with complex situations like this. For example, is it best to use bp_update_user_meta() here, given that you might want to query this information in a multinetwork environment?

boonebgorges commented 6 years ago

f75d42a is a rough implementation. I'll run it on my test site for a few days to see if I have any luck.

boonebgorges commented 6 years ago

It looks like it's working well, so I've merged to the branch.