Open tsteur opened 9 years ago
Uuh, good find. That's quite important!
Seems like we are experiencing this too -> http://forum.piwik.org/t/update-to-2-15-changed-average-visit-duration-and-time-on-page/16744
I think the issue I mentioned here has been an issue for a long time and not only since the last update. From which Piwik version did you update? I presume the problem you are describing might actually be a different one.
Marking this issue as duplicate of #9199 - which was renamed to include in its scope this bug
Edit: re-opened this issue as it may be easier to fix this one rather than #9199
@tsteur: from 2.14.3 to 2.15.0. And this is just one of 800 sites tracked in one Piwik instance (it has an effect on the other 799 too).
Main question is: which counting is correct?
@SR-mkuhn this particular issue has been buggy for quite a while and not only since last update I think.
@SR-mkuhn maybe create a new issue for your problem and describe it there
I've also encountered this issue recently, and I've found that there is a problem with the metric calculation.
Time spent on site is defined as sum_time_spent and calculated as SUM(): https://github.com/piwik/plugin-CustomDimensions/blob/master/Archiver.php#L164
The SUM() function in SQL databases omits records containing NULL values. Later, the average time spent on page is calculated by dividing this sum by the number of hits nb_hits ( https://github.com/piwik/piwik/blob/master/plugins/Actions/Columns/Metrics/AverageTimeOnPage.php#L39 ), which is calculated as COUNT(*) ( https://github.com/piwik/piwik/blob/master/plugins/Actions/Archiver.php#L359 ).
The problem is that COUNT(*) counts all rows, even those containing a NULL value, so the average value isn't an average at all. A solution would be to use SUM(COALESCE(sum_time_spent, 0)), which treats NULL values as 0, or to introduce nb_hits_with_time_spent as COUNT(sum_time_spent) and then divide by it, as @tsteur said.
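To make the NULL handling concrete, here is a minimal sketch using Python's built-in sqlite3 module. The hits table, its time_spent column, and the sample values are made up for illustration and are not Matomo's actual schema; MySQL/MariaDB treat NULL in SUM() and COUNT() the same way for this case.

```python
import sqlite3


def aggregate_time_spent(values):
    """Aggregate a list of time values, where None stands for SQL NULL.

    Returns (SUM(col), SUM(COALESCE(col, 0)), COUNT(*), COUNT(col)).
    The table and column names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (time_spent INTEGER)")
    conn.executemany(
        "INSERT INTO hits (time_spent) VALUES (?)", [(v,) for v in values]
    )
    row = conn.execute(
        "SELECT SUM(time_spent), SUM(COALESCE(time_spent, 0)), "
        "COUNT(*), COUNT(time_spent) FROM hits"
    ).fetchone()
    conn.close()
    return row


# Four hits, two of which never recorded a time spent.
total, total_coalesced, all_hits, hits_with_time = aggregate_time_spent(
    [30, None, 60, None]
)
print(total, total_coalesced)   # 90 90
print(total / all_hits)         # 22.5 (divided by every hit)
print(total / hits_with_time)   # 45.0 (divided by hits that have a time)
```

In this sketch, wrapping the column in COALESCE leaves the sum unchanged, so only the choice of divisor (COUNT(*) versus COUNT(time_spent)) changes the resulting average.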
I’ve just encountered the same issue. Checking the Visitor Profile, I can see that for interactions on a page with lots of tracking events, the Page View gets a minimal time, whereas the events are given the times between each interaction. As such, the user could be interacting on a page for, say, a minute or more, triggering numerous events, but the page dwell time would still be close to zero. Surely this is a major bug. It means that any page which has subsequent events occurring will have an incorrect dwell time.
It is causing the time on page to be wrong for any page tracking events.
Possibly there is a strong relationship to: we have no "time on url": Piwik handels events as leaving page (at least in visitor log) #11546
Has it been already solved? We had Piwik 2.something, now we are in the process of upgrading to Matomo 3.7, and I am wondering if it will be correct. I have checked the previous data in the database. You can easily see it when you filter for one specific idvisit in piwik_log_link_action_table: every event closes the time on the pageview. Especially if you are using events like formSeen, bannerImpression etc., you will understand that it is not correct. And the time spent with some events is also quite funny. Thank you also for pointing me to some other issues etc.
As the issue is still open I don't think anything has been solved here yet AFAIK. @mattab that might be indeed quite important to fix the time on page.
I also encountered this issue getting wrong time on page. Is this planned by anyone? I know it's in the backlog, but it's the older of the - only - two issues labeled as Major + Bug
I'm being hit by this problem, too - my "AVG. TIME ON PAGE" numbers are coming out as near-zero due to events on the page.
How do we get this bug prioritised for fixing, please?
In case it's helpful to anyone, in my local Matomo deployment I unashamedly hacked my ./plugins/Actions/Archiver.php file and commented out the line which restricts by getWhereClauseActionIsNotEvent:
/**
 * Time per action
 */
protected function archiveDayActionsTime($rankingQueryLimit)
{
    $rankingQuery = false;
    if ($rankingQueryLimit > 0) {
        $rankingQuery = new RankingQuery($rankingQueryLimit);
        $rankingQuery->addLabelColumn('idaction');
        $rankingQuery->addColumn(PiwikMetrics::INDEX_PAGE_SUM_TIME_SPENT, 'sum');
        $rankingQuery->partitionResultIntoMultipleGroups('type', array_keys($this->actionsTablesByType));

        $extraSelects = "log_action.type, log_action.name, count(*) as `" . PiwikMetrics::INDEX_PAGE_NB_HITS . "`,";
        $from = array(
            "log_link_visit_action",
            array(
                "table" => "log_action",
                "joinOn" => "log_link_visit_action.%s = log_action.idaction"
            )
        );
        $orderBy = "`" . PiwikMetrics::INDEX_PAGE_NB_HITS . "` DESC, log_action.name ASC";
    } else {
        $extraSelects = false;
        $from = "log_link_visit_action";
        $orderBy = false;
    }

    $select = "log_link_visit_action.%s as idaction, $extraSelects
            sum(log_link_visit_action.time_spent_ref_action) as `" . PiwikMetrics::INDEX_PAGE_SUM_TIME_SPENT . "`";

    $where = $this->getLogAggregator()->getWhereStatement('log_link_visit_action', 'server_time');
    $where .= " AND log_link_visit_action.time_spent_ref_action > 0
            AND log_link_visit_action.%s > 0"
        // . $this->getWhereClauseActionIsNotEvent() // include time spent in events as well
        ;
Informally, this worked for my use case - I haven't given any thought about whether this is a robust solution.
How can this issue still be open after 7 years? I just wondered what could cause the big gap between GA3 (Universal) data and this one, and found out that this has been discussed a few times. Is there any other workaround in the code to ignore the users that spent 0 time in the "avg time on page" row?
This is still an issue as currently multiple users are experiencing a problem with this and see a significant difference between GA and Matomo time spent on page.
This has been carefully investigated by one of our heavy users. They tracked down six real visits and found a difference of 6 to 1.
According to Behavior >> Pages: 9 seconds average time.
But if you do a manual calculation from Visitors >> Visits Log: 54 seconds average time.
A different user has reported that Behavior >> Pages report is inconsistent before/after applying a segment. They sent screenshots at https://github.com/matomo-org/matomo/issues/4719#issuecomment-2135980876
I reproduced this error.
Time on page is inaccurate in Behavior >> Pages.
Time on page is accurate in Visitors >> Visits Log.
We have reviewed this bug in our new triaging process, and it has turned out to be a higher priority than the other bugs we have triaged so far. We will be aiming to plan a fix for this in Q3, and we will update you on the progress here when we have an update to share.
I hope you're doing well. I’m one of the customers impacted by the issue outlined in ticket #9198 regarding inaccuracies in the "time spent on page" calculations. I noticed that the issue is no longer associated with any upcoming release.
Could you please provide an update on whether this is still being actively worked on and when we might expect a resolution? Accurate time-on-page metrics are critical for our reporting, and I’d like to plan accordingly based on your timeline.
Thank you in advance for your time and attention to this. I look forward to hearing from you.
@nick-myers-dt This is certainly something we want to fix, and it is a priority for us to look at. Currently we are figuring out the best possible way to implement this. I believe an error in our automation somewhere removed the version from the issue. We do tend to adjust our plans as priorities shift, but this is certainly something we are looking to get into an upcoming minor release.
Avg. time on page is calculated from time_spent_ref_action divided by the number of visits nb_visits. Not all visits have time_spent_ref_action though. Instead sum_time_spent should be divided by something like nb_hits_with_time_spent.
time_spent_ref_action is calculated wrong. It calculates visit_last_action_time - currentTimestamp but visit_last_action_time is updated on any tracking call, meaning also on any hit.
To make it a bit more clear, let's say there are the following tracking calls: a pageview, then an event, then a second pageview.
The time spent for the first pageview is calculated by the time difference between the event and the pageview, not between the two pageviews. This means for many common scenarios where one triggers a pageview and then an event, search, content impression, ... the time spent information is not accurate.
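The pageview / event / pageview scenario can be sketched in a few lines of Python. This is an illustration only, not the tracker's code: the Hit class, both helper functions, and the timestamps are hypothetical. It contrasts measuring to the very next tracking call with measuring to the next pageview.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    kind: str       # "pageview" or "event"
    timestamp: int  # seconds since the start of the visit (made-up values)


def time_to_next_hit(hits):
    """Time per pageview, measured to the next tracking call of any kind."""
    durations = []
    for i, hit in enumerate(hits[:-1]):
        if hit.kind == "pageview":
            durations.append(hits[i + 1].timestamp - hit.timestamp)
    return durations


def time_to_next_pageview(hits):
    """Time per pageview, measured to the next pageview only."""
    pageviews = [h for h in hits if h.kind == "pageview"]
    return [b.timestamp - a.timestamp for a, b in zip(pageviews, pageviews[1:])]


# An event fires 2 seconds after the first pageview; the second pageview
# follows after 60 seconds.
hits = [Hit("pageview", 0), Hit("event", 2), Hit("pageview", 60)]
print(time_to_next_hit(hits))       # [2]
print(time_to_next_pageview(hits))  # [60]
```

With these made-up numbers, the first pageview is credited with 2 seconds when any hit ends it, against 60 seconds when only the next pageview does. Neither function assigns a time to the last pageview, since nothing follows it.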