matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0

API call with outlinkUrl containing an ampersand returns 0 results #19231

Open OliverIost opened 2 years ago

OliverIost commented 2 years ago

Expected Behavior

curl "https://www.mydomain.de/matomo/?module=API&idSite=1&format=json&token_auth=XXX&method=Actions.getOutlink&outlinkUrl=https%3A%2F%2Fwww.testdomain.de%2Ftest%2F%3Ftest%3D321%26test2%3D13&period=day&date=2022-04-26"

(and I know that this was called on the day asked) should return something like

[{"label":"\/test\/?test=321&test2=13","nb_visits":3,"nb_uniq_visitors":3,"nb_hits":3,"sum_time_spent":0,"url":"https:\/\/www.testdomain.de\/test\/?test=321&test2=13"}]%
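The request in this section can also be assembled in code so that the outlink URL is percent-encoded exactly once. The following is only a sketch: `buildOutlinkRequest` is a hypothetical helper (not part of Matomo), and the host, token and date are the placeholder values from the curl call above.

```php
<?php
// Hypothetical helper, not part of Matomo: builds the Reporting API URL
// used in this issue and percent-encodes every parameter exactly once.
function buildOutlinkRequest(string $baseUrl, string $token, string $outlinkUrl, string $date): string
{
    $params = [
        'module'     => 'API',
        'idSite'     => 1,
        'format'     => 'json',
        'token_auth' => $token,
        'method'     => 'Actions.getOutlink',
        'outlinkUrl' => $outlinkUrl, // raw URL, including its own "&"
        'period'     => 'day',
        'date'       => $date,
    ];

    // PHP_QUERY_RFC3986 percent-encodes reserved characters, so the "&"
    // inside the outlink URL becomes %26 and cannot split the query string.
    return $baseUrl . '?' . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

echo buildOutlinkRequest(
    'https://www.mydomain.de/matomo/',
    'XXX',
    'https://www.testdomain.de/test/?test=321&test2=13',
    '2022-04-26'
), PHP_EOL;
```

The resulting URL matches the one passed to curl, so the empty result is not caused by how the request is encoded on the client side.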

Current Behavior

Return of []%

Possible Solution

Quick fix: I found that the link itself is saved as https://www.testdomain.de/test/?test=321&amp;test2=13 in the action database (with the ampersand stored as an HTML entity), but in getFilterPageDatatableSearch in matomo/plugins/Actions/API.php the search term arrives as https://www.testdomain.de/test/?test=321&test2=13 (with a raw ampersand), so nothing can be found.

Because I do not know where and what exactly is encoded or decoded, I only added these lines directly before the last return in protected function getFilterPageDatatableSearch($callBackParameters, $search, $actionType) in matomo/plugins/Actions/API.php:

if (isset($searchTree[1]) && strpos($searchTree[1], '&') !== false && strpos($searchTree[1], '&amp;') === false) { $searchTree[1] = str_replace('&', '&amp;', $searchTree[1]); }

Now it works for me.
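For reference, the same guard can be tried in isolation, outside Matomo. This is a minimal sketch under the assumption that stored URLs use the `&amp;` entity; `encodeAmpersandsOnce` is a hypothetical name and does not exist in Matomo.

```php
<?php
// Hypothetical standalone version of the workaround: encode raw ampersands
// as the HTML entity, but only if the string is not already encoded.
function encodeAmpersandsOnce(string $search): string
{
    $hasAmpersand = strpos($search, '&') !== false;
    $hasEntity    = strpos($search, '&amp;') !== false;

    if ($hasAmpersand && !$hasEntity) {
        // Raw ampersands only: encode them so the term matches the stored value.
        return str_replace('&', '&amp;', $search);
    }

    // No ampersand at all, or already encoded: leave the string untouched.
    return $search;
}
```

Because of the second check, calling the function twice does not double-encode the value. A string that mixes raw and encoded ampersands is left unchanged, which is a known limitation of this guard.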

bx80 commented 2 years ago

Hi @OliverIost, thanks for reporting this and providing a quick fix. I can recreate the issue and confirm that after the quick fix is applied getOutlink API calls with ampersands in the URL are correctly returned. :+1:

I'll mark this as a bug so our product team can prioritize it.

(GitHub Markdown has altered the ampersands in the provided code; here is a fixed version for reference.)

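if (isset($searchTree[1]) &&
    strpos($searchTree[1], '&') !== false &&
    strpos($searchTree[1], '&amp;') === false) {
    // The search term contains only raw ampersands: encode them to match the stored value.
    $searchTree[1] = str_replace('&', '&amp;', $searchTree[1]);
}

heurteph-ei commented 2 years ago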

Shouldn't the data (in the database) be stored unencoded? Most of the time, data in DB should be as raw as possible (dates in UTC, urls not encoded, HTML not encoded, etc.)

bx80 commented 2 years ago

Shouldn't the data (in the database) be stored unencoded?

:+1: The root cause of the problem may well be that the URL parameter ampersands are being stored encoded when they should be raw. This is something that should be considered when working on this issue.
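The mismatch discussed here can be reproduced with a few lines of plain PHP. This sketch assumes the stored form is equivalent to what `htmlspecialchars` produces; the thread does not confirm where or how Matomo actually encodes the value.

```php
<?php
// Raw URL as it arrives in the API search parameter.
$raw = 'https://www.testdomain.de/test/?test=321&test2=13';

// Assumption for illustration: the stored value has its "&" HTML-encoded.
$stored = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// Comparing the raw search term with the encoded stored value fails.
var_dump($stored === $raw); // bool(false)

// Decoding the stored value (or storing it raw in the first place) matches.
var_dump(htmlspecialchars_decode($stored, ENT_QUOTES) === $raw); // bool(true)
```

Storing the raw URL would remove the need to encode the search term before every comparison.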