Open Kenny690 opened 7 years ago
There are a number of character-encoding issues that need addressing, including regexp handling and logging. I will look at this for the next version (v4.2/3).
Also, search term blocking does not work when the URL contains a percent-encoded UTF-8 string (%xx).
But bannedregexpurllist works well with UTF-8, for example yandex.ru\/search\/.*text=хер
I have a little suggestion. I don't know if it's any good though, because I'm a little dum-dum. :D What about just using ch_isiphost.comp(",[a-z|A-Z|а-я|А-Я].");
instead of ch_isiphost.comp(",[a-z|A-Z].");
in https://github.com/e2guardian/e2guardian/blob/v5.1/src/NaughtyFilter.cpp ?
I found one solution that works for Russian char sets: see issue #591
Hi. Even DansGuardian has this problem: I can't see search requests in access.log if they were in Russian. I see something like this:
http://search.skydns.ru/search/?r=1&query=%D0%AD%D0%BB%D0%B5%D0%BA%D1%82%D1%80%D0%BE%D0%BD%D0%BD%D0%B0%D1%8F+%D1%82%D0%B5%D1%82%D1%80%D0%B0%D0%B4%D1%8C+%D0%BF%D0%BE+%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%BC%D1%83+%D1%8F%D0%B7%D1%8B%D0%BA%D1%83+%E2%84%962+1+%D0%BA%D0%BB%D0%B0%D1%81%D1%81+21+%D0%B2%D0%B5%D0%BA
Instead of this:
http://search.skydns.ru/search/?r=1&query=Электронная+тетрадь+по+русскому+языку+№2+1+класс+21+век
Is there a way to decode URLs for the log file?
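For what it's worth, here is a minimal sketch of what decoding for the log could look like. It is not e2guardian or DansGuardian code: `percentDecode` and `hexValue` are hypothetical helpers made up for illustration, and where such a call would be hooked into the logging path is left open. The sketch assumes the log consumer can handle raw UTF-8 bytes, and it deliberately leaves escapes for control characters (such as `%0A`) untouched so a request cannot inject extra lines into access.log.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of e2guardian): value of one hex digit, or -1.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: turn %xx escapes into raw bytes for logging.
// - Malformed escapes (e.g. "%zz" or a trailing "%") are copied unchanged.
// - Escapes that decode to a control character (< 0x20 or 0x7F) are kept
//   escaped, so the decoded URL always stays on a single log line.
// - '+' is left as-is; it only means "space" inside form-encoded queries.
std::string percentDecode(const std::string &in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            int hi = hexValue(in[i + 1]);
            int lo = hexValue(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                int value = (hi << 4) | lo;
                if (value >= 0x20 && value != 0x7F) {
                    out.push_back(static_cast<char>(value));
                    i += 2;  // skip the two hex digits just consumed
                    continue;
                }
            }
        }
        out.push_back(in[i]);
    }
    return out;
}
```

With this, `percentDecode("text=%D1%85%D0%B5%D1%80")` yields the UTF-8 bytes for `text=хер`. Note that `%20` would decode to a real space, which may matter if the log format is space-delimited; keeping that escape as well would be a reasonable variation.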