cea-hpc / robinhood

Robinhood Policy Engine : a versatile tool to monitor filesystem contents and schedule actions on filesystem entries.
http://robinhood.sf.net
Other
182 stars 62 forks source link

rbh-report -a does not include database query in duration. #107

Closed geraldhofer closed 4 years ago

geraldhofer commented 5 years ago

rbh-report does include a very helpful output that shows the duration of the last run policy:

Policy 'lhsm_archive':
    Current run started on 2019/07/09 15:20:52: trigger: scheduled (daemon), target: all
    Last complete run: trigger: scheduled (daemon), target: all
        - Started on 2019/07/09 14:39:30
        - Finished on 2019/07/09 14:40:51 (duration: 01min 21s)
        - Summary: 1734 successful actions, volume: 7.60 GB; 13 entries skipped; 1238 errors

But I learned that that is only half the story. I have turned on the logging of slow queries in mariadb and this is the database query that gets run as part of that policy:

# Time: 190709 14:39:30
# User@Host: robinhood[robinhood] @ localhost []
# Thread_id: 93161  Schema: robinhood_lustre  QC_hit: No
# Query_time: 1587.420402  Lock_time: 0.000074  Rows_sent: 2985  Rows_examined: 510645709
SET timestamp=1562647170;
SELECT ENTRIES.id AS id FROM ENTRIES WHERE ENTRIES.type='file' AND (ENTRIES.lhsm_noarch<>1 OR ENTRIES.lhsm_noarch IS NULL) AND (ENTRIES.lhsm_status='new' OR ENTRIES.lhsm_status='modified') AND (ENTRIES.invalid=0 OR ENTRIES.invalid IS NULL) AND NOT (ENTRIES.fileclass LIKE BINARY '%+scratch+%') AND (((ENTRIES.last_mod<=1562641983 OR ENTRIES.last_mod IS NULL) AND ENTRIES.fileclass LIKE BINARY '%+small_files+%') OR ((ENTRIES.last_mod<=1562641983 OR ENTRIES.last_mod IS NULL) AND (ENTRIES.fileclass LIKE BINARY '%+std_files+%' OR ENTRIES.fileclass LIKE BINARY '%+big_files+%')) OR ((ENTRIES.last_mod<=1562641983 OR ENTRIES.last_mod IS NULL))) AND ENTRIES.last_mod>=901253296 ORDER BY ENTRIES.last_mod ASC LIMIT 100000;

So that policy does spent the majority of the time in the database query. It would be very helpful if that database query could be part of the duration that is reported by rbh-report -a.

dtcray commented 5 years ago

agree with you, first DB query is not considered as part of policy run duration. this should do it, need to be tested.

--- policy_run.c.orig   2019-10-22 10:41:30.787000000 +0200
+++ policy_run.c        2019-10-22 10:44:29.637000000 +0200
@@ -1963,6 +1963,8 @@
     nb_returned = 0;
     total_returned = 0;

+    p_pol_info->progress.policy_start = time(NULL);
+
     rc = iter_open(lmgr,
                    p_pol_info->descr->manage_deleted ? IT_RMD : IT_LIST,
                    &it, &filter, &sort_type, &opt);
@@ -1974,8 +1976,7 @@
         return -1;
     }

-    p_pol_info->progress.policy_start = p_pol_info->progress.last_report
-        = time(NULL);
+    p_pol_info->progress.last_report = time(NULL);

     /* reinit schedulers */
     for (i = 0; i < p_pol_info->config->sched_count; i++) {
tl-cea commented 4 years ago

Fix included in version 3.1.6: https://sourceforge.net/projects/robinhood/files/robinhood/3.1.6/