matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0

Internal site search decoding bug #15060

Open KALLIAK opened 5 years ago

KALLIAK commented 5 years ago

We have a Russian-language site in the windows-1251 code page, and everything works fine except internal site search keywords.

E.g. when I search for 'окрб', the keyword is URL-encoded as 'q=%EE%EA%F0%E1', and then in the database it ends up as a weird mis-decoded string 'îêðá'.
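To illustrate where these escapes come from, here is a minimal sketch comparing the windows-1251 percent-encoding of the keyword with its UTF-8 encoding. The helper `encodeWin1251` is hypothetical and written only for this illustration; it covers just unreserved ASCII and the Cyrillic letters А-я, and it is not part of Matomo or of any browser API.

```javascript
// Hypothetical helper: percent-encode a string as windows-1251 bytes.
// Assumption: only unreserved ASCII and the Cyrillic letters U+0410..U+044F
// are handled; windows-1251 stores those letters as single bytes 0xC0..0xFF.
function encodeWin1251(text) {
    let out = '';
    for (const ch of text) {
        const code = ch.codePointAt(0);
        if (/[A-Za-z0-9_.~-]/.test(ch)) {
            out += ch; // unreserved ASCII is passed through unchanged
        } else if (code >= 0x0410 && code <= 0x044F) {
            const byte = code - 0x0410 + 0xC0; // one byte per Cyrillic letter
            out += '%' + byte.toString(16).toUpperCase();
        } else {
            throw new RangeError('Character not covered by this sketch: ' + ch);
        }
    }
    return out;
}

const keyword = 'окрб';
const win1251Query = encodeWin1251(keyword);   // '%EE%EA%F0%E1'
const utf8Query = encodeURIComponent(keyword); // '%D0%BE%D0%BA%D1%80%D0%B1'

console.log('windows-1251:', win1251Query);
console.log('UTF-8:       ', utf8Query);
```

The single-byte escapes in the first result match the query string seen on the windows-1251 page, while a UTF-8 page would produce the two-byte sequences in the second result.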

Here is a small two-page HTML site (windows-1251) where the problem can be reproduced: test.zip

Matomo 3.11.0. Thanks!

tsteur commented 5 years ago

@mattab moved this to 4.1.0 as it's not a regression.

sgiehl commented 3 years ago

I've just checked that one. For me it seems to work as expected. Maybe it has been fixed in the meantime by the utf8mb4 support.

tsteur commented 3 years ago

Hi @KALLIAK it looks like this might be fixed. If you can still reproduce this using Matomo 4 after migrating the DB to UTF8MB4 then please comment here and we will be happy to reopen the issue and investigate again.

KALLIAK commented 3 years ago

Hi @tsteur, I just tested my test.zip with a freshly installed Matomo 4.0.5 and the problem remains. It has nothing to do with the database, because the GET request to the tracker already carries the corrupted keyword, i.e. the problem is in matomo.js, namely in the call "locationHrefAlias = safeDecodeWrapper(locationArray[1])". safeDecodeWrapper("%EE%EA%F0%E1") returns "îêðá", but it "must" return "окрб". When I remove "safeDecodeWrapper" and leave "locationHrefAlias = locationArray[1]", the problem goes away, but as far as I understand that is not a real solution. Still, I have been working around the problem this way for more than a year, and with each patch I have to mess with yuicompressor-2.4.8.jar, etc.)
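The output described above is consistent with a decoder that tries `decodeURIComponent` first and falls back to `unescape` when that throws. A minimal sketch under that assumption follows; `safeDecodeSketch` is a hypothetical stand-in written for this illustration, not the actual matomo.js source.

```javascript
// Hypothetical stand-in for the wrapper discussed above (an assumption,
// not copied from matomo.js): strict UTF-8 decode with a lenient fallback.
function safeDecodeSketch(value) {
    try {
        // Succeeds only when the percent-escapes form valid UTF-8.
        return decodeURIComponent(value);
    } catch (e) {
        // Fallback: unescape() maps every %XX to the code point U+00XX,
        // which amounts to reading the bytes as Latin-1.
        return unescape(value);
    }
}

const encodedKeyword = '%EE%EA%F0%E1'; // 'окрб' as windows-1251 bytes

// The strict decoder rejects the input: 0xEE announces a three-byte UTF-8
// sequence, but 0xEA is not a valid continuation byte.
let strictDecodeFailed = false;
try {
    decodeURIComponent(encodedKeyword);
} catch (e) {
    strictDecodeFailed = e instanceof URIError;
}

const fallbackResult = safeDecodeSketch(encodedKeyword);

console.log(strictDecodeFailed); // true
console.log(fallbackResult);     // îêðá
```

Under this assumption, neither branch knows about the page's windows-1251 encoding, so the Cyrillic keyword cannot come out intact on this path.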

sgiehl commented 3 years ago

@KALLIAK Which browser are you using? I tried your example, and as long as the encoding of the file is windows-1252 and it is served as such, it worked, at least in Chrome. When you view the page in your browser, is the page title displayed correctly?

KALLIAK commented 3 years ago

We have windows-1251 encoding. We tried all major browsers. When I search for "тест", I get this URL: "matomotest/find.html?q=%F2%E5%F1%F2". It then turns into this request: matomo_1251_bug
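For reference, a minimal sketch of how the same query value reads under the two interpretations. The helper `decodeWin1251` is hypothetical and handles only ASCII and the bytes 0xC0-0xFF (the Cyrillic letters А-я); it is an illustration, not a proposed patch for matomo.js.

```javascript
// Hypothetical helper: decode percent-escapes as windows-1251 bytes.
// Assumption: only ASCII and bytes 0xC0..0xFF (Cyrillic А..я) are mapped;
// other bytes are left as their original escapes.
function decodeWin1251(encoded) {
    return encoded.replace(/%([0-9A-Fa-f]{2})/g, (match, hex) => {
        const byte = parseInt(hex, 16);
        if (byte < 0x80) {
            return String.fromCharCode(byte); // plain ASCII
        }
        if (byte >= 0xC0) {
            // windows-1251 bytes 0xC0..0xFF map to U+0410..U+044F
            return String.fromCharCode(0x0410 + (byte - 0xC0));
        }
        return match; // 0x80..0xBF are not covered by this sketch
    });
}

const rawQuery = '%F2%E5%F1%F2';
const asWin1251 = decodeWin1251(rawQuery); // 'тест'
const asLatin1 = unescape(rawQuery);       // 'òåñò'

console.log(asWin1251);
console.log(asLatin1);
```

In a browser, `new TextDecoder('windows-1251')` covers the full code page; the hand-written mapping here only keeps the sketch free of environment dependencies.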

sgiehl commented 3 years ago

Checked that again and now I was able to reproduce.

sgiehl commented 3 years ago

This would actually be solved by #16628.