matomo-org / matomo

when using content impression, Site Search result pages count is too high #7670

Open helge79 opened 9 years ago

helge79 commented 9 years ago

Piwik seems to track queries multiple times when using content impressions. I was able to reproduce this issue using the following HTML with snippets from the documentation:

<html>
<head><title>Hello World</title></head>
<body>
<a href="/purchase" data-track-content data-content-name="My Product Name" data-content-piece="Buy now">
    translate('Buy it now')
</a>
<!-- Piwik -->
<script type="text/javascript">
  var _paq = _paq || [];
  _paq.push(['trackPageView']);
  _paq.push(['enableLinkTracking']);
  _paq.push(['trackAllContentImpressions']);
  (function() {
    var u="//[piwikurl]/";
    _paq.push(['setTrackerUrl', u+'piwik.php']);
    _paq.push(['setSiteId', 15]);
    var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
    g.type='text/javascript'; g.async=true; g.defer=true; g.src=u+'piwik.js'; s.parentNode.insertBefore(g,s);
  })();
</script>
<noscript><p><img src="//[piwikurl]/piwik.php?idsite=15" style="border:0;" alt="" /></p></noscript>
<!-- End Piwik Code -->
</body>
</html>

I made one request to this page with the search query "test" (index.html?q=test), which led to the following visitor log in Piwik (sorry for the German wording):

[Screenshot: visitor_log]

The search report shows 2 visited result pages:

[Screenshot: search_report]

The database table piwik_log_link_visit_action also has 2 entries:

select count(*) from piwik_log_link_visit_action where idvisit = 8237974;
+----------+
| count(*) |
+----------+
|        2 |
+----------+

If I add more content blocks to the HTML, the number of result pages increases accordingly.

Piwik version is 2.12.1, PHP version is (unfortunately still) 5.3.10-1ubuntu3.14.

Let me know if you need anything else.

mattab commented 9 years ago

Thanks for the report @helge79 - we will investigate and try to reproduce the issue.

barbushin commented 9 years ago

I just reproduced the issue. I'm going to find out how to fix it.

barbushin commented 9 years ago

@helge79 According to http://developer.piwik.org/guides/content-tracking, a content impression for an element with the data-track-content attribute is tracked when the element is displayed. So it does not matter whether somebody clicks that link or not; it will be tracked anyway.

So the HTML triggers 2 actions:

  1. Page view - _paq.push(['trackPageView']);
  2. Content impression - the element with data-track-content data-content-name="My Product Name"
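
To make the two actions concrete, here is a rough sketch of the tracking requests the example page would send. It is a simplified model, not the actual piwik.js code: the function name buildTrackingRequests and the host example.org are made up, batching is ignored, and it assumes both requests carry the same url parameter (including ?q=test). The parameter names idsite, rec, url, action_name, c_n, c_p and c_t are from the tracking HTTP API.

```javascript
// Simplified model of the tracking requests sent by the example page.
// buildTrackingRequests is a hypothetical helper, not part of piwik.js.
function buildTrackingRequests(pageUrl, siteId, title, contentBlocks) {
  // Assumption: every request repeats the current page URL.
  const base = { idsite: String(siteId), rec: '1', url: pageUrl };

  // 1. Page view, triggered by _paq.push(['trackPageView'])
  const requests = [new URLSearchParams({ ...base, action_name: title })];

  // 2. One content impression per element marked with data-track-content
  for (const block of contentBlocks) {
    requests.push(new URLSearchParams({
      ...base,
      c_n: block.name,   // data-content-name
      c_p: block.piece,  // data-content-piece
      c_t: block.target, // link target
    }));
  }
  return requests.map((params) => params.toString());
}

const requests = buildTrackingRequests(
  'http://example.org/index.html?q=test', // hypothetical host
  15,
  'Hello World',
  [{ name: 'My Product Name', piece: 'Buy now', target: '/purchase' }]
);
console.log(requests.length); // 2
```

Each additional content block adds one more request with the same url, which would fit the observation that the count grows with the number of content blocks.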

Looks like it works as expected.

@mattab Am I right?

mattab commented 9 years ago

I think Content Tracking works as expected indeed :+1:

But there is still a bug, as reported above: "The search report shows 2 visited result pages" when there was only one search result page (the Content tracking request should not count as a search result page).

Maybe the problem is around Site Search archiving, i.e. https://github.com/piwik/piwik/blob/2.14.1/plugins/Actions/Archiver.php#L114-118

There, the CASE statement could maybe also be restricted to actions whose type is Page URL, something like this:


        $selectPageIsFollowingSiteSearch = ",
                SUM( CASE WHEN log_action_name_ref.type = " . Action::TYPE_SITE_SEARCH . " AND log_action_url.type = " . Action::TYPE_PAGE_URL . "
                      THEN 1 ELSE 0 END)
                    AS `" . PiwikMetrics::INDEX_PAGE_IS_FOLLOWING_SITE_SEARCH_NB_HITS . "`";
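
As a rough illustration of what the extra AND condition would change, the SUM(CASE ...) can be mimicked in plain JavaScript. Everything in this sketch is an assumption: the rows are invented (not taken from the reporter's database), the type constants are placeholder strings rather than the real Action::TYPE_* values, and the guess that the content impression row has a non-Page-URL type is unverified.

```javascript
// Placeholder type values; only their distinctness matters here.
const TYPE_PAGE_URL = 'page_url';
const TYPE_SITE_SEARCH = 'site_search';
const TYPE_CONTENT = 'content';

// Hypothetical joined rows: urlType stands in for log_action_url.type,
// nameRefType stands in for log_action_name_ref.type.
const rows = [
  { urlType: TYPE_PAGE_URL, nameRefType: TYPE_SITE_SEARCH }, // page view
  { urlType: TYPE_CONTENT, nameRefType: TYPE_SITE_SEARCH },  // content impression
];

// SUM(CASE WHEN <condition> THEN 1 ELSE 0 END) expressed as a reduce
const sumCase = (data, condition) =>
  data.reduce((sum, row) => sum + (condition(row) ? 1 : 0), 0);

// Current condition: only the referring name type is checked.
const current = sumCase(rows, (r) => r.nameRefType === TYPE_SITE_SEARCH);

// Suggested condition: the action itself must also be a Page URL.
const proposed = sumCase(
  rows,
  (r) => r.nameRefType === TYPE_SITE_SEARCH && r.urlType === TYPE_PAGE_URL
);

console.log(current, proposed); // 2 1
```

Under these assumptions the stricter condition counts only the page view, so the content impression no longer inflates the metric.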