matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0

Save HTML Metadata in Database via JS Tracking Client #7233

Open ssipiora opened 9 years ago

ssipiora commented 9 years ago

I'd like to be able to save a page's HTML metadata (description and keywords) in separate columns in the database so I can use this data to report on page characteristics.

This metadata essentially tells you what the page is about, and most sites include it as part of SEO best practices. For example, imagine a food site being able to report that its pages with "Healthy" in the keywords get more visits and more time spent on them. This information would help sites sell their advertising better and improve their conversion rates.

I'd definitely want to be able to access this data via the API so I could use statistical models to predict which content is most valuable based on its metadata.
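As a rough illustration of the request, here is a minimal sketch of how the description and keywords could be read on the page and handed to the tracker. It assumes the Custom Dimensions feature and its `setCustomDimension` tracker call rather than new database columns, and the dimension IDs and the helper names (`findMetaContent`, `normalizeKeywords`, `buildMetadataCommands`) are hypothetical, not part of the tracker.

```typescript
// Minimal shape of a <meta> element; a browser HTMLMetaElement satisfies it.
interface MetaTagLike {
  name: string;
  content: string;
}

// A tracker command as it would be pushed onto the _paq queue.
type PaqCommand = [string, ...(string | number)[]];

// Hypothetical custom dimension IDs (assumption): they would have to be
// created in the admin UI first, and the real IDs will differ.
const DESCRIPTION_DIMENSION_ID = 1;
const KEYWORDS_DIMENSION_ID = 2;

// Return the trimmed content of the first non-empty <meta name="..."> match.
function findMetaContent(tags: MetaTagLike[], name: string): string | undefined {
  for (const tag of tags) {
    if (tag.name.toLowerCase() === name) {
      const content = tag.content.trim();
      if (content !== "") {
        return content;
      }
    }
  }
  return undefined;
}

// Split a comma-separated keywords string into lowercase, de-duplicated terms.
function normalizeKeywords(raw: string): string[] {
  return raw
    .split(",")
    .map((term) => term.trim().toLowerCase())
    .filter((term, index, all) => term !== "" && all.indexOf(term) === index);
}

// Build the commands to queue before the 'trackPageView' call.
function buildMetadataCommands(tags: MetaTagLike[]): PaqCommand[] {
  const commands: PaqCommand[] = [];
  const description = findMetaContent(tags, "description");
  if (description !== undefined) {
    commands.push(["setCustomDimension", DESCRIPTION_DIMENSION_ID, description]);
  }
  const keywords = findMetaContent(tags, "keywords");
  if (keywords !== undefined) {
    commands.push(["setCustomDimension", KEYWORDS_DIMENSION_ID, normalizeKeywords(keywords).join(",")]);
  }
  return commands;
}

// Browser usage (untested sketch), placed before _paq.push(['trackPageView']):
//   const tags = Array.from(document.querySelectorAll<HTMLMetaElement>("meta[name]"));
//   buildMetadataCommands(tags).forEach((command) => _paq.push(command));
```

Storing keywords as one normalized, comma-separated value keeps the sketch simple; reporting on individual terms would still need the values split again on the server or in the API consumer.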

mattab commented 9 years ago

Hi @ssipiora, thanks for the suggestion. Do you have any suggestions for how this data could be displayed in Piwik in a way that would be useful to you?

ssipiora commented 9 years ago

Yes, I have a few. The key is that the metadata helps site owners understand how groups of pages are working on their sites.

I see value at the top of the funnel, helping people understand which groups of pages bring visitors to the site. I also see value deeper in the funnel when the metadata is tied to Goals, so you can look at the groups of pages that led to a conversion. Yes, knowing the exact path of pages is important. But someone with a content-heavy site could use this information to cross-link pages more effectively and expose additional content to the user in the right place.

Example reports/graphics: Show a tag cloud of the metadata from the site's entry pages. Below the tag cloud, show a list of the terms used to generate it, in descending order of frequency. If a user clicks on a term, they should be presented with a list of pages where that term appears. This helps them not only see the pages where the metadata exists, but also identify pages where they've forgotten to add the right metadata terms.

It would also be good to let the user filter by inbound source. For example, show the difference between Google, Bing and Yahoo visitors, paid vs. organic, Facebook, Twitter, etc. This helps a site owner segment their audience, understand the differences between segments and target them more effectively.
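The data behind such a report could be computed roughly as follows. This is only a sketch under assumptions: the `EntryVisit` shape, the `referrerType` values and the `termFrequencies` function are hypothetical, and "frequency" is taken to mean the number of entry visits whose page carries the term.

```typescript
// Hypothetical record of one visit's entry page (assumption, not a real schema).
interface EntryVisit {
  url: string;
  keywords: string[];
  referrerType: string; // e.g. "search", "social", "campaign"
}

interface TermStat {
  term: string;
  count: number;   // number of entry visits whose page carries the term
  pages: string[]; // distinct page URLs where the term appears
}

// Count keyword terms across entry visits, optionally restricted to one
// inbound source, and return them in descending order of frequency.
function termFrequencies(visits: EntryVisit[], referrerType?: string): TermStat[] {
  const stats: TermStat[] = [];
  for (const visit of visits) {
    if (referrerType !== undefined && visit.referrerType !== referrerType) {
      continue;
    }
    for (const raw of visit.keywords) {
      const term = raw.trim().toLowerCase();
      if (term === "") {
        continue;
      }
      let stat = stats.find((candidate) => candidate.term === term);
      if (stat === undefined) {
        stat = { term, count: 0, pages: [] };
        stats.push(stat);
      }
      stat.count += 1;
      if (stat.pages.indexOf(visit.url) === -1) {
        stat.pages.push(visit.url);
      }
    }
  }
  // Most frequent first; ties are broken alphabetically for a stable list.
  return stats.sort((a, b) => b.count - a.count || (a.term < b.term ? -1 : 1));
}
```

The `count` values could drive the font sizes in the tag cloud, while `pages` backs the drill-down list shown when a term is clicked.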

The same report could also apply to goals: show a tag cloud of the metadata from the pages viewed by visitors who eventually converted on a goal you've set up.

It would also be good to be able to view geographic differences (regions or countries) in how your content is being consumed.

I've attached several text analysis visualizations. Most are variants of tag clouds, but you might get some ideas from them. (Attached images: meta8, meta1, meta2, meta3, meta4, meta6, meta7)