matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0
19.68k stars 2.62k forks

"Inverse" page tracking - to find bad pages #10519

Open hpvd opened 8 years ago

hpvd commented 8 years ago

To make your website better, it would be interesting to know not only which pages are visited, but also the other side: which pages are never or only very rarely visited.

This may have several reasons:

To find these pages it would be very helpful to have the possibility to find & show them automatically via "Inverse" page tracking.

one way may be

The next step of this would be "Inverse" Custom Event tracking - to find bad elements/functions #5186

hpvd commented 8 years ago

a bad "page" could be:

gaumondp commented 8 years ago

As long as it's possible to remove/disable the feature. If you "only" have 500 pages it's not too bad, I presume, but I have 11 000 different web pages and maybe 4000 PDFs, so yes, we have many "unviewed" or rarely visited pages over time...

hpvd commented 8 years ago

Of course it should only be optional - maybe even as a plugin. In addition it may make sense to have a possibility to restrict (whitelist or blacklist) a path within the URL...

hpvd commented 8 years ago

As a first approach, crawling could easily be done by using a standard (Google) sitemap. These are available on most websites and contain the URLs of all pages in a standardized XML format:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.net/outsite/black-cats.html</loc>
    <lastmod>2015-09-22</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    ....
  </url>
</urlset>

For this, one only needs a setting for its URL (most of the time it's http://www.example.net/sitemap.xml).
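Extracting the full page list from such a sitemap is a small job. A minimal sketch (the parse_sitemap helper is hypothetical, not part of Matomo; it assumes the sitemaps.org 0.9 schema shown above):

```python
# Sketch: extract all page URLs (<loc> entries) from a standard sitemap.xml.
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def parse_sitemap(xml_text):
    """Return the list of <loc> URLs found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip()
            for loc in root.iter(SITEMAP_NS + "loc")
            if loc.text]

example = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.net/outsite/black-cats.html</loc>
    <lastmod>2015-09-22</lastmod>
  </url>
</urlset>"""

print(parse_sitemap(example))
# → ['http://www.example.net/outsite/black-cats.html']
```

Sitemap index files (a sitemap of sitemaps) would need one extra level of the same parsing, which large sites like the 11 000-page case above often use.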

hpvd commented 8 years ago

Since the comparison should only be done with very low frequency (e.g. once a month) or on request (e.g. via a button click that schedules it for the next night), there shouldn't be a big performance problem at all...

hpvd commented 8 years ago

In addition, a setting for the comparison period is needed: e.g. (not) visited within the last month, the last year, or across all data... => hmm, maybe this could be set directly by using the segment editor?
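The comparison itself is just a set difference between the sitemap URLs and the URLs Matomo recorded for the chosen period. A sketch under these assumptions: Actions.getPageUrls is a real method of Matomo's HTTP Reporting API (with flat=1 it returns one row per page path), but the helper names, the exact shape of the "label" field, and all host/token values here are illustrative placeholders:

```python
# Sketch: sitemap URLs minus the page paths Matomo actually recorded.
import json
from urllib.parse import urlencode, urlparse
from urllib.request import urlopen

def visited_paths(matomo_url, site_id, token, period="month", date="today"):
    """Fetch the set of page paths Matomo recorded in the given period
    via the Reporting API (assumes flat rows carry the path in 'label')."""
    query = urlencode({
        "module": "API", "method": "Actions.getPageUrls",
        "idSite": site_id, "period": period, "date": date,
        "flat": 1, "format": "json", "token_auth": token,
    })
    with urlopen(matomo_url + "/index.php?" + query) as resp:
        rows = json.load(resp)
    return {row["label"] for row in rows}

def never_visited(sitemap_urls, visited):
    """Sitemap entries whose path never appears in the tracked data."""
    return sorted(u for u in sitemap_urls
                  if urlparse(u).path not in visited)
```

The period/date parameters map directly onto the comparison-period setting discussed above, so "not visited within the last year" is just period=year, date=today.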

hpvd commented 8 years ago

This could e.g. be part of a new plugin which provides a new menu point, "Find the bad", containing the functions:

"Inverse" page tracking - to find bad pages #10519
"Inverse" Custom Event tracking - to find bad elements/functions #5186
"Inverse" Content tracking - to find bad pieces of content #10520