ckan / ideas

[DEPRECATED] Use the main CKAN repo Discussions instead:
https://github.com/ckan/ckan/discussions

Logging user searches and outcomes to obtain information on user preferences and actions #216

Open gjlawran opened 6 years ago

gjlawran commented 6 years ago

As a site manager I would like to be able to regularly (e.g. monthly) review search queries so that I can determine if I should be encouraging data providers to improve tagging of datasets for discovery or add additional datasets to the catalogue based on interest.

As part of the log of user queries I would also like to know: which sort order and filters were used; how many results the query returned (i.e. the count); and, if the user selected a result from the search, its position from the top of the list (rank number) and its package id.
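To make the requested fields concrete, here is a minimal sketch of what one logged search record might look like. The class name, field names, and the JSON-lines output are all assumptions for illustration; none of this is an existing CKAN or extension API.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SearchLogEntry:
    """One logged search (hypothetical structure, not a CKAN API)."""
    query: str                      # the search terms the user typed
    sort: str                       # sort order used, e.g. "score desc"
    filters: dict                   # facet filters applied to the search
    result_count: int               # number of results returned
    clicked_rank: Optional[int] = None        # 1-based position of the selected result
    clicked_package_id: Optional[str] = None  # package id of the selected result
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_json_line(entry: SearchLogEntry) -> str:
    """Serialize an entry as one JSON line, suitable for an append-only log."""
    return json.dumps(asdict(entry), sort_keys=True)
```

A search with no selected result simply leaves clicked_rank and clicked_package_id empty, which makes zero-click and zero-result queries easy to pick out in a monthly review.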

dkelsey commented 6 years ago

I wrote something to capture ideas in the Backlog:

Data Search Enhancements and Find-Ability Enhancements

gjlawran commented 6 years ago

This extension, ckanext-searchhistory by @amercader, seemed to be attempting to address many of these requirements - however it doesn't look compatible with the latest CKAN release.

torfsen commented 6 years ago

@dkelsey has already mentioned ckanext-discovery, which already has some infrastructure to store user searches for its search suggestions feature. It currently does not offer advanced search statistics, but PRs are welcome.

gjlawran commented 6 years ago

@torfsen @dkelsey thanks for the reference to ckanext-discovery

Improving search utility / results by providing alternate information (e.g. similar records, tag clouds, search suggestions, and adjusting the Solr configuration) is another objective, beyond tracking the terms and methods that users are searching with and determining whether they appear to be successful, based on the returned count and whether they select a record from the search results.

I will have to have a look at the mechanism for search query storage in ckanext-discovery - however a session-based approach, tying each search to a user session, may ultimately be necessary to capture what users started searching with and where they ended their session. Of course Google Analytics would provide that user path - if people weren't so fond of ad-blockers - but again, as configured, the actual search terms used would not be captured.
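As a rough illustration of the session-based idea, the sketch below groups logged searches by session and reports what each session started and ended with. The event keys (session_id, timestamp, query, clicked_package_id) are hypothetical, and this is not how ckanext-discovery stores searches.

```python
from collections import defaultdict


def summarize_sessions(events):
    """Group search events by session and summarize each session.

    `events` is an iterable of dicts with the hypothetical keys
    session_id, timestamp (any sortable value), query and, optionally,
    clicked_package_id.
    """
    by_session = defaultdict(list)
    for event in events:
        by_session[event["session_id"]].append(event)

    summaries = {}
    for session_id, session_events in by_session.items():
        # Order each session's searches chronologically.
        ordered = sorted(session_events, key=lambda e: e["timestamp"])
        summaries[session_id] = {
            "first_query": ordered[0]["query"],
            "last_query": ordered[-1]["query"],
            "searches": len(ordered),
            # True if the final search in the session led to a selected record.
            "ended_with_click": ordered[-1].get("clicked_package_id") is not None,
        }
    return summaries
```

Sessions where first_query and last_query differ and ended_with_click is false would be the ones worth a closer look, since the user reworded the search and still did not select anything.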