Open mkrakiewicz opened 7 years ago
Thanks for the report @mkrakiewicz
It would be super helpful if you could submit a Pull request with a new System test that shows the bug. This way we can see it reproduced and it's much easier for us to fix it afterwards. Let me know... thanks
Hi, I have a problem with "Search Keywords with No Results" report. Piwik version: 2.16.2.
Problem:
Steps to reproduce:
For some specific day (1 day is enough) have enough data so that this method's: https://github.com/piwik/piwik/blob/2.16.2/plugins/Actions/API.php#L280 resulting DataTable is full and is limited to 499 rows (according to the setting datatable_archiving_maximum_rows_actions = 500).
Have records so that this condition below is fulfilled so that rows with `28`= 1 (`28` is 'INDEX_SITE_SEARCH_HAS_NO_RESULT') are spread evenly in the result set:
CASE WHEN (MAX(log_link_visit_action.custom_var_v5) = 0 AND log_link_visit_action.custom_var_k5 = '_pk_scount') THEN 1 ELSE 0 END AS `28`
. Here is the full archiving query I believe is used for aggregation: https://gist.github.com/mkrakiewicz/e3219368fa32a1a7491d49904a9418cfThen, resulting rows are filtered AFTER the result set is "clipped" to 500 records: https://github.com/piwik/piwik/blob/2.16.2/plugins/Actions/API.php#L286
If I set datatable_archiving_maximum_rows_actions = -1, then the report is correct (after choosing a segment there are less keywords than in "All results" setting). Also several missing keywords appear. However then in above API method the DataTable row count is way more than default 500 (thousands in my case).
Conclusion: Because the archiving query has several subqueries/groupings, there can be more records matching `28`= 1 than DataTable allows. Also, because each keyword is listed, this column cannot be summed in other way I believe. That means, even if in DataTable there are 500 rows, in the report there can be a few keywords or 0 records after filtering is done, while in reality there are several keywords which were cut off from the result.
Proposed fix:
Remove the filtering in API method.
Apply filtering on query level during archiving and have separate report for that. Currently
Actions_sitesearch
is used both bygetSiteSearchNoResultKeywords
andgetSiteSearchKeywords
.