Closed anhnongdan closed 7 years ago
Verify actions metrics on #32
MariaDB [pw1]> select count(*) from piwik_log_action where type!=1 and type!=4;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
In the Actions plugin, a pageview maps to INDEX_PAGE_NB_HITS:
PiwikMetrics::INDEX_PAGE_NB_HITS => array(
    'aggregation' => 'sum',
    'query'       => "count(*)"
),
MultiSite gets its pageviews from Actions: const NB_PAGEVIEWS_METRIC = 'Actions_nb_pageviews'; but MultiSite's actions metric comes from a different source: const NB_ACTIONS_METRIC = 'nb_actions';
In LogAggregator:
protected function getActionsMetricFields()
{
    return array(
        Metrics::INDEX_NB_VISITS        => "count(distinct " . self::LOG_ACTIONS_TABLE . ".idvisit)",
        Metrics::INDEX_NB_UNIQ_VISITORS => "count(distinct " . self::LOG_ACTIONS_TABLE . ".idvisitor)",
        Metrics::INDEX_NB_ACTIONS       => "count(*)",
    );
}
After fixing day visit calculation: https://github.com/anhnongdan/BimaxCore#6
DEV Total: 199 visits, 4,400 hits, 10,657 actions, 1.64 G tranffered overall
These 4,400 hits match the Actions plugin's number.
STANDARD Total: 199 visits, 8,100 pageviews, 8,100 actions, 0 revenue
Apparently, the action calculation has two branches: the sum of every visit's visit_total_actions, or count(*) from the action table:
DEBUG VisitsSummary[2017-08-11 10:49:28] SELECT
DEBUG VisitsSummary[2017-08-11 10:49:28] count(distinct log_visit.idvisitor) AS `1`,
DEBUG VisitsSummary[2017-08-11 10:49:28] count(*) AS `2`,
DEBUG VisitsSummary[2017-08-11 10:49:28] sum(log_visit.visit_total_actions) AS `3`,
DEBUG VisitsSummary[2017-08-11 10:49:28] max(log_visit.visit_total_actions) AS `4`,
DEBUG VisitsSummary[2017-08-11 10:49:28] sum(log_visit.visit_total_time) AS `5`,
DEBUG VisitsSummary[2017-08-11 10:49:28] sum(case log_visit.visit_total_actions when 1 then 1 when 0 then 1 else 0 end) AS `6`,
DEBUG VisitsSummary[2017-08-11 10:49:28] sum(case log_visit.visit_goal_converted when 1 then 1 else 0 end) AS `7`,
DEBUG VisitsSummary[2017-08-11 10:49:28] count(distinct log_visit.user_id) AS `39`
DEBUG VisitsSummary[2017-08-11 10:49:28] FROM
DEBUG VisitsSummary[2017-08-11 10:49:28] piwik_log_visit AS log_visit
DEBUG VisitsSummary[2017-08-11 10:49:28] WHERE
DEBUG VisitsSummary[2017-08-11 10:49:28] log_visit.visit_last_action_time >= ?
DEBUG VisitsSummary[2017-08-11 10:49:28] AND log_visit.visit_last_action_time <= ?
DEBUG VisitsSummary[2017-08-11 10:49:28] AND log_visit.idsite IN (?)
DEBUG VisitsSummary[2017-08-11 10:49:28] LogAggr.queryVisitsByDimension: bind: ["2017-08-07 16:00:00","2017-08-07 16:09:59",3]
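The two branches can be compared side by side. The following is a minimal sketch, not the Piwik implementation: it uses an in-memory SQLite database with simplified tables that only borrow the names log_visit and log_link_visit_action, and toy data. It assumes that one way the two numbers can drift apart is action rows being removed while the per-visit totals stay in place; whether that is the actual cause here is not confirmed at this point.

```python
import sqlite3


def action_counts(conn):
    """Return (sum of visit_total_actions, count(*) of action rows)."""
    by_visit = conn.execute(
        "SELECT COALESCE(SUM(visit_total_actions), 0) FROM log_visit"
    ).fetchone()[0]
    by_rows = conn.execute(
        "SELECT COUNT(*) FROM log_link_visit_action"
    ).fetchone()[0]
    return by_visit, by_rows


conn = sqlite3.connect(":memory:")
# Simplified, hypothetical schema: only the columns needed for the comparison.
conn.executescript("""
    CREATE TABLE log_visit (
        idvisit INTEGER PRIMARY KEY,
        visit_total_actions INTEGER
    );
    CREATE TABLE log_link_visit_action (
        idlink_va INTEGER PRIMARY KEY,
        idvisit INTEGER
    );
""")

# Toy data: visit 1 has 3 actions, visit 2 has 2 actions.
conn.executemany("INSERT INTO log_visit VALUES (?, ?)", [(1, 3), (2, 2)])
conn.executemany(
    "INSERT INTO log_link_visit_action (idvisit) VALUES (?)",
    [(1,), (1,), (1,), (2,), (2,)],
)

before = action_counts(conn)

# Assumed scenario: the action rows of visit 1 are removed,
# while log_visit keeps its stored total.
conn.execute("DELETE FROM log_link_visit_action WHERE idvisit = 1")
after = action_counts(conn)

print("before:", before)
print("after: ", after)
```

With both tables intact the two branches agree; once action rows are removed, the per-visit sum stays the same while count(*) drops.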
With new calculating process, we can see:
DEBUG VisitsSummary[2017-08-14 04:27:38] ArchiveSelector::getArchiveIdAndVisits dateStartIso: 2017-08-08 23:50:00 EndIso: 2017-08-08 23:59:59 ts_archiveUTC: 2017-08-08 15:59:59
DEBUG VisitsSummary[2017-08-14 04:27:38] ArchiveProcessor::getAggregatedNumericMetrics, recalculate visit for day
DEBUG VisitsSummary[2017-08-14 04:27:38] LogAggr::getGeneralQueryBindParams: start:2017-08-07 16:00:00 end:2017-08-08 15:59:59
DEBUG VisitsSummary[2017-08-14 04:27:38] LogAggr.queryVisitsByDimension: without ranking query: /* trigger = CronArchive */
DEBUG VisitsSummary[2017-08-14 04:27:38]
DEBUG VisitsSummary[2017-08-14 04:27:38] SELECT
DEBUG VisitsSummary[2017-08-14 04:27:38] count(*) AS `2`,
DEBUG VisitsSummary[2017-08-14 04:27:38] sum(log_visit.visit_total_time) AS `5`,
DEBUG VisitsSummary[2017-08-14 04:27:38] sum(case log_visit.visit_goal_converted when 1 then 1 else 0 end) AS `7`
DEBUG VisitsSummary[2017-08-14 04:27:38] FROM
DEBUG VisitsSummary[2017-08-14 04:27:38] piwik_log_visit AS log_visit
DEBUG VisitsSummary[2017-08-14 04:27:38] WHERE
DEBUG VisitsSummary[2017-08-14 04:27:38] log_visit.visit_last_action_time >= ?
DEBUG VisitsSummary[2017-08-14 04:27:38] AND log_visit.visit_last_action_time <= ?
DEBUG VisitsSummary[2017-08-14 04:27:38] AND log_visit.idsite IN (?)
DEBUG VisitsSummary[2017-08-14 04:27:38] LogAggr.queryVisitsByDimension: bind: ["2017-08-07 16:00:00","2017-08-08 15:59:59",3]
DEBUG VisitsSummary[2017-08-14 04:27:38] ArchiveProcessor::recalculateVisitAndDurationForDay, recalculate result: 199, 171920
DEBUG VisitsSummary[2017-08-14 04:27:38] LogAggr::getGeneralQueryBindParams: start:2017-08-07 16:00:00 end:2017-08-08 15:59:59
DEBUG VisitsSummary[2017-08-14 04:27:38] LogAggr.queryVisitsByDimension: without ranking query: /* trigger = CronArchive */
DEBUG VisitsSummary[2017-08-14 04:27:38]
DEBUG VisitsSummary[2017-08-14 04:27:38] SELECT
DEBUG VisitsSummary[2017-08-14 04:27:38] count(distinct log_visit.idvisitor) AS `1`,
DEBUG VisitsSummary[2017-08-14 04:27:38] count(distinct log_visit.user_id) AS `39`
DEBUG VisitsSummary[2017-08-14 04:27:38] FROM
DEBUG VisitsSummary[2017-08-14 04:27:38] piwik_log_visit AS log_visit
DEBUG VisitsSummary[2017-08-14 04:27:38] WHERE
DEBUG VisitsSummary[2017-08-14 04:27:38] log_visit.visit_last_action_time >= ?
DEBUG VisitsSummary[2017-08-14 04:27:38] AND log_visit.visit_last_action_time <= ?
DEBUG VisitsSummary[2017-08-14 04:27:38] AND log_visit.idsite IN (?)
DEBUG VisitsSummary[2017-08-14 04:27:38] LogAggr.queryVisitsByDimension: bind: ["2017-08-07 16:00:00","2017-08-08 15:59:59",3]
DEBUG VisitsSummary[2017-08-14 04:27:38] ArchiveProcessor::getAggregatedNumericMetrics, The returned metric: nb_uniq_visitors=199,nb_visits=199,nb_actions=10657,nb_users=0,max_actions=121,sum_visit_length=171920,bounce_count=4,nb_visits_converted=0,nb_visit_converted=0
DEBUG VisitsSummary[2017-08-14 04:27:39] PluginsArchiver::callAggregateAllPlugins: Initializing archiving process for all plugins [visits = 199, visits converted = 0]
DEBUG VisitsSummary[2017-08-14 04:27:39] PluginsArchiver::callAggregateAllPlugins: Archiving period reports for plugin 'Actions'.
DEBUG Actions[2017-08-14 04:27:39] [Thangnt 1107] Ar.Proc::aggregateDataTableRecords recordName: Actions_actions, aggre. operation: {"33":"max","32":"min"}
DEBUG Actions[2017-08-14 04:27:39] CoreArchive get data for: [3], Actions_actions, blob
DEBUG Actions[2017-08-14 04:27:39] CoreArchive get archiveids for: Actions
DEBUG Actions[2017-08-14 04:27:39] CoreArchive get doneFlag: done
DEBUG Actions[2017-08-14 04:27:39] [Thangnt 1107] Ar.Proc::aggregateDataTableRecords recordName: Actions_actions_url, aggre. operation: {"33":"max","32":"min"}
DEBUG Actions[2017-08-14 04:27:39] CoreArchive get data for: [3], Actions_actions_url, blob
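The bind parameters in the log above (start 2017-08-07 16:00:00, end 2017-08-08 15:59:59) are consistent with the day 2017-08-08 in a site timezone of UTC+8, converted to UTC. The sketch below reproduces that window; the +8 offset is an assumption inferred from the log lines, and day_bounds_utc is a hypothetical helper, not a Piwik function.

```python
from datetime import datetime, timedelta, timezone


def day_bounds_utc(day, utc_offset_hours):
    """Return the UTC (start, end) strings covering one local calendar day."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    start_local = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=tz)
    # Last second of the same local day.
    end_local = start_local + timedelta(days=1) - timedelta(seconds=1)
    fmt = "%Y-%m-%d %H:%M:%S"
    return (
        start_local.astimezone(timezone.utc).strftime(fmt),
        end_local.astimezone(timezone.utc).strftime(fmt),
    )


# Assumed site timezone: UTC+8 (inferred from the log, not stated in it).
bounds = day_bounds_utc("2017-08-08", 8)
print(bounds)
```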
The problem noted three comments ago might be caused by log rotation. Verifying with today's log.
Verified with 2 periods of log and everything seems ok: the sum of actions is correct, and hits and actions are now unified. Visits 'look' ok. => Will verify on the live server.
Pageviews are calculated by the Actions plugin and actions by VisitSummary:
With the current code, calculating on local cBimax without rotating the log_link_visit_action table gives accurate pageview and action counts (verified against the number of log lines imported).
Total: 550 visits, 12,000 hits, 12,000 actions, 4.87 G tranffered overall
So, imported: 15:10-15:50: 11k visit 16: 1k visit
rotated all 15h import: db has 1k -> continue import 1.4k in 17h -> db has 2.4k
=> The resulting hits and actions have a gap: Total: 580 visits, 13,400 hits, 13,875 actions
Hourly: (16h used to be 1k before rotated)
Temp archive calculated correctly (17:00 and 17:20 don't have data):
| 2017-08-16 17:10:00 | 2017-08-16 17:19:59 | 2017-08-16 09:34:31 |
| 2017-08-16 17:10:00 | 2017-08-16 17:19:59 | 2017-08-16 09:34:31 |
| 2017-08-16 17:10:00 | 2017-08-16 17:19:59 | 2017-08-16 09:34:31 |
| 2017-08-16 17:10:00 | 2017-08-16 17:19:59 | 2017-08-16 09:34:31 |
| 2017-08-16 17:10:00 | 2017-08-16 17:19:59 | 2017-08-16 09:34:31 |
| 2017-08-16 17:10:00 | 2017-08-16 17:19:59 | 2017-08-16 09:34:31 |
| 2017-08-16 17:10:00 | 2017-08-16 17:19:59 | 2017-08-16 09:34:32 |
| 2017-08-16 17:10:00 | 2017-08-16 17:19:59 | 2017-08-16 09:34:32 |
| 2017-08-16 17:10:00 | 2017-08-16 17:19:59 | 2017-08-16 09:34:32 |
| 2017-08-16 17:10:00 | 2017-08-16 17:19:59 | 2017-08-16 09:34:32 |
| 2017-08-16 17:10:00 | 2017-08-16 17:19:59 | 2017-08-16 09:34:33 |
| 2017-08-16 17:10:00 | 2017-08-16 17:19:59 | 2017-08-16 09:34:33 |
| 2017-08-16 17:10:00 | 2017-08-16 17:19:59 | 2017-08-16 09:34:33 |
Next: reduce log_link_visit_action to 0
Then import 2.8k (hmm, the result seems correct)
Anyway, actions and pageviews:
Total: 773 visits, 16,200 hits, 16,675 actions, 6.61 G tranffered overall
previous: 580 visits, 13,400 hits, 13,875 actions
The gap doesn't change (475 in both cases). Then import 0.7k without rotating -> confirmed in DB.
And the gap opens up again (now 2,849): Total: 834 visits, 16,900 hits, 19,749 actions, 6.87 G tranffered overall
Hypothesis: actions are calculated by VisitSummary as part of CoreMetrics. I can fix this by simply adding nb_actions to VisitSummary's recalculation and running the archive again; the result MUST be reflected immediately.
It doesn't change :-1: Suspect: the 'hourly' archive check finds no new visit since the last one (16:00 - 16:10) and omits the recalculation.
Clean all tables and re-import the log.
1.7k @17:52 Total: 226 visits, 1,700 hits, 1,700 actions, 675.74 M tranffered overall
0.8k @18:18
0.8k @18:25
Still have a problem: Total: 1,132 visits, 3,300 hits, 4,160 actions, 1.32 G tranffered overall
nb_actions still wrong:
| nb_actions | 2017-08-16 | 1700 |
| nb_actions | 2017-08-16 | 4160 |
Missed 1 line of code:
$row->setColumn('nb_actions', $visits[Metrics::INDEX_NB_ACTIONS]);
Ran the archive again without changing anything else. It takes effect immediately.
Total: 1,132 visits, 3,300 hits, 3,300 actions, 1.32 G tranffered overall
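The effect of the missing line can be illustrated with a small sketch. This is not the BimaxCore code: apply_recalculated and the dictionaries are hypothetical, and the index values 2 and 3 are assumed from the column aliases in the SQL logs above (count(*) AS `2`, sum(visit_total_actions) AS `3`). The numbers are the ones reported in this thread.

```python
# Assumed metric indexes, taken from the aliases seen in the debug SQL.
INDEX_NB_VISITS = 2
INDEX_NB_ACTIONS = 3


def apply_recalculated(row, visits, include_actions=True):
    """Copy recalculated metrics into a report row (hypothetical helper).

    With include_actions=False this mimics the state before the fix:
    nb_visits is overwritten but nb_actions keeps its stale value.
    """
    updated = dict(row)
    updated["nb_visits"] = visits[INDEX_NB_VISITS]
    if include_actions:
        # Python analogue of the line that was missing.
        updated["nb_actions"] = visits[INDEX_NB_ACTIONS]
    return updated


stale_row = {"nb_visits": 1132, "nb_actions": 4160}
recalculated = {INDEX_NB_VISITS: 1132, INDEX_NB_ACTIONS: 3300}

buggy = apply_recalculated(stale_row, recalculated, include_actions=False)
fixed = apply_recalculated(stale_row, recalculated)

print("before fix:", buggy["nb_actions"])
print("after fix: ", fixed["nb_actions"])
```

Because the recalculated value is simply written over the stale one, re-running the archive is enough for the corrected number to show up, which matches the behaviour observed above.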
Confirm on Live Server
Now live server gives correct result.
Verify again on #32 Fix on anhnongdan/BimaxCore#6
=> A pageview is an action whose action type is a URL
-> The CDN shows no downloads, no outlinks, etc.