awesomemotive / edd-library

EDD Code Snippet Library
http://library.easydigitaldownloads.com/

Clean up API request logs daily #132

Open rubengc opened 5 years ago

rubengc commented 5 years ago

Hi EDD team

I was facing issues with excessive database size caused by the number of posts and post meta rows generated by the EDD REST API.

To address that, I needed to build a function that cleans up all API request logs, and I want to share it with the EDD development team.

The code should remain safe to run even if you (finally) migrate logs to another table, since it queries the database directly and matches posts by post_type.

Here is the code:

function themedd_child_edd_logs_cleanup_init() {

    // Setup the daily cron event to process events daily
    if ( ! wp_next_scheduled( 'themedd_child_edd_logs_cleanup_process' ) )
        wp_schedule_event( time(), 'daily', 'themedd_child_edd_logs_cleanup_process' );

}
add_action( 'init', 'themedd_child_edd_logs_cleanup_init' );

function themedd_child_edd_logs_cleanup_process() {

    global $wpdb;

    $api_request_term = get_term_by( 'slug', 'api_request', 'edd_log_type' );

    if ( ! $api_request_term ) {
        return;
    }

    // term_relationships stores term_taxonomy_id, which can differ from term_id
    $term_taxonomy_id = (int) $api_request_term->term_taxonomy_id;

    // Query to remove EDD logs of API requests
    $sql = $wpdb->prepare(
        "DELETE p
        FROM {$wpdb->posts} p
        INNER JOIN {$wpdb->term_relationships} tr ON p.ID = tr.object_id
        WHERE p.post_type = 'edd_log'
            AND tr.term_taxonomy_id = %d",
        $term_taxonomy_id
    );

    $wpdb->query( $sql );

    // Query to remove orphaned post metas
    $sql = "DELETE pm
    FROM {$wpdb->postmeta} pm
    LEFT JOIN {$wpdb->posts} p ON p.ID = pm.post_id
    WHERE p.ID IS NULL";

    $wpdb->query( $sql );

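    // Query to remove orphaned term relationships
    // Note: object_id can also point at non-post objects (e.g. links), so this
    // assumes only posts use term relationships on this site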
    $sql = "DELETE tr
    FROM {$wpdb->term_relationships} tr
    LEFT JOIN {$wpdb->posts} p ON p.ID = tr.object_id
    WHERE p.ID IS NULL";

    $wpdb->query( $sql );

}
add_action( 'themedd_child_edd_logs_cleanup_process', 'themedd_child_edd_logs_cleanup_process' );

Hope it helps

cklosowski commented 5 years ago

This approach could work in cases with smaller sets of logs for sure.

My major concern with this is that it doesn't have a limit. Even though this has its own cron, it is still going to attempt to delete ALL matching rows from the DB, and on the first run it's likely to cause a massive spike in MySQL CPU and memory usage as it runs the query and then has to reindex all the tables touched. It'd be best if it had a limit attached to it, so that your logs prune over time. Maybe adding an ORDER BY post_date ASC and a LIMIT of 200 (or something that makes sense for shared hosting environments) would help keep a cron run from hitting the PHP timeout, which would essentially leave us in a state where the 3 queries didn't all complete correctly.
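
A rough sketch of what a limited run could look like, not a tested implementation. MySQL does not accept ORDER BY or LIMIT on a multi-table DELETE, so this version selects the oldest IDs first and then deletes by ID. The function name themedd_child_edd_logs_cleanup_batch and its parameters are hypothetical; $wpdb is passed in as an argument and only its prepare(), get_col() and query() methods are used.

```php
<?php
/**
 * Hypothetical helper: delete up to $limit of the oldest API request logs.
 * Returns the number of log posts removed in this run.
 */
function themedd_child_edd_logs_cleanup_batch( $wpdb, $term_taxonomy_id, $limit = 200 ) {

    // Pick the oldest matching log IDs first, since a multi-table DELETE
    // cannot carry ORDER BY / LIMIT in MySQL
    $ids = $wpdb->get_col( $wpdb->prepare(
        "SELECT p.ID
        FROM {$wpdb->posts} p
        INNER JOIN {$wpdb->term_relationships} tr ON p.ID = tr.object_id
        WHERE p.post_type = 'edd_log'
            AND tr.term_taxonomy_id = %d
        ORDER BY p.post_date ASC
        LIMIT %d",
        $term_taxonomy_id,
        $limit
    ) );

    if ( empty( $ids ) ) {
        return 0;
    }

    // IDs come from the database, but cast them anyway before building the IN list
    $id_list = implode( ',', array_map( 'intval', $ids ) );

    // Remove dependent rows first, so an interrupted run leaves posts without
    // meta rather than orphaned meta and relationships without posts
    $wpdb->query( "DELETE FROM {$wpdb->postmeta} WHERE post_id IN ({$id_list})" );
    $wpdb->query( "DELETE FROM {$wpdb->term_relationships} WHERE object_id IN ({$id_list})" );
    $wpdb->query( "DELETE FROM {$wpdb->posts} WHERE ID IN ({$id_list})" );

    return count( $ids );
}
```

Inside the cron callback this would be called with the global $wpdb and the term_taxonomy_id of the api_request term. Because each run only touches the selected IDs, the two table-wide orphan cleanup queries would no longer be needed for these logs.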

rubengc commented 5 years ago

Hi @cklosowski

It's true that I designed this daily cleanup after performing some manual cleanup.

So it only has to deal with the daily API requests (around 5,000).

A possible improvement could be, as you mention, to limit the query and run it hourly. I have around 5,000 API request logs daily, so a limit of 200 wouldn't help much.
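
To put numbers on that: 200 rows per hourly run is 4,800 rows a day, which falls just short of 5,000 new logs. A small sketch of the arithmetic and of an hourly schedule follows. The helper name themedd_child_edd_logs_min_batch_size is hypothetical, and the scheduling part is guarded so it only runs inside WordPress.

```php
<?php
/**
 * Hypothetical helper: smallest per-run limit that keeps up with the
 * number of logs created per day.
 */
function themedd_child_edd_logs_min_batch_size( $logs_per_day, $runs_per_day ) {
    return (int) ceil( $logs_per_day / $runs_per_day );
}

// Around 5,000 logs a day and 24 hourly runs
$hourly_limit = themedd_child_edd_logs_min_batch_size( 5000, 24 );

// Schedule the existing cleanup hook hourly instead of daily (WordPress only)
if ( function_exists( 'wp_next_scheduled' ) ) {
    if ( ! wp_next_scheduled( 'themedd_child_edd_logs_cleanup_process' ) ) {
        wp_schedule_event( time(), 'hourly', 'themedd_child_edd_logs_cleanup_process' );
    }
}
```

With these figures the minimum is 209 rows per run, so a somewhat higher limit would leave room for busier days. If the daily event is already scheduled, it would need to be removed first (for example with wp_clear_scheduled_hook()), otherwise the wp_next_scheduled() check keeps the old daily recurrence.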