Open-Web-Analytics / Open-Web-Analytics

Official repository for Open Web Analytics which is an open source alternative to commercial tools such as Google Analytics. Stay in control of the data you collect about the use of your website or app. Please consider sponsoring this project.
http://www.openwebanalytics.com
GNU General Public License v2.0

Updating the bot database #709

Closed SANEK-LITWIN closed 3 years ago

SANEK-LITWIN commented 3 years ago

Hello! Sorry for my English))

Could you please tell me how to update the bot database so that bots are excluded from the statistics? [Image: Screenshot_1]

The statistics contain a lot of bot visits. Well-known analytics services do not count these visits. I set up logging on my sites, and the logs show every bot visit. Is it possible to add, for example, the user agents of the bots that I need to exclude?

Bots can, of course, be excluded by IP address, but that is not realistic: new IP addresses would have to be added every day, and it would never end.

SANEK-LITWIN commented 3 years ago

By the way, here is an array of bot user-agent substrings that I collected myself. Perhaps it could be built into the functionality.

$robots = array('crawl', 'curl', 'host', 'localhost', 'java', 'libcurl', 'libwww', 'lwp', 'perl', 'php', 'wget', 'search', 'slurp', 'robot', 'yandexbot', 'yandexaccessibilitybot', 'yandexmobilebot', 'yandexdirectdyn', 'yandexscreenshotbot', 'yandeximages', 'yandexvideo', 'yandexvideoparser', 'yandexmedia', 'yandexblogs', 'yandexfavicons', 'yandexwebmaster', 'yandexpagechecker', 'yandeximageresizer', 'yandexadnet', 'yandexdirect', 'yadirectfetcher', 'yandexcalendar', 'yandexsitelinks', 'yandexmetrika', 'yandexnews', 'yandexnewslinks', 'yandexcatalog', 'yandexantivirus', 'yandexmarket', 'yandexvertis', 'yandexfordomain', 'yandexspravbot', 'yandexsearchshop', 'yandexmedianabot', 'yandexontodb', 'yandexontodbapi', 'yandexturbo', 'yandexverticals', 'yadirectbot', 'yandex/1', 'googlebot', 'googlebot-image', 'mediapartners-google', 'adsbot-google', 'apis-google', 'adsbot-google-mobile', 'googlebot-news', 'googlebot-video', 'adsbot-google-mobile-apps', 'google-site-verification', 'google mobile', 'google adsense', 'google mobile adsense', 'google adsbot', 'chrome-lighthouse', 'mail.ru_bot', 'bingbot', 'accoona', 'ia_archiver', 'ask jeeves', 'omniexplorer_bot', 'w3c_validator', 'webalta', 'yahoofeedseeker', 'yahoo!', 'ezooms', 'tourlentabot', 'mj12bot', 'ahrefsbot', 'searchbot', 'sitestatus', 'nigma.ru', 'baiduspider', 'statsbot', 'sistrix', 'acoonbot', 'findlinks', 'proximic', 'openindexspider', 'statdom.ru', 'exabot', 'spider', 'seznambot', 'obot', 'c-t bot', 'updownerbot', 'snoopy', 'heritrix', 'yeti', 'domainvader', 'dcpbot', 'paperlibot', 'stackrambler', 'msnbot', 'msnbot-media', 'msnbot-news', 'apache-httpclient', 'linksmasterrobot', 'linkstats', 'cheesebot', 'emailcollector', 'litnetbot', 'ltx71', 'wordpress.com mshots', 'blexbot', 'petalbot', 'nimbostratus-bot', 'applebot', 'colly');
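
For illustration, a list like this could be checked against a visitor's user-agent string with a case-insensitive substring search. This is only a minimal sketch: the function name isLikelyBot() is hypothetical, the list is shortened, and it is not how Open Web Analytics itself performs the check.

```php
<?php
// Hypothetical helper, not part of Open Web Analytics.
// Returns true if any key from $robots occurs inside the user-agent string.
function isLikelyBot(string $userAgent, array $robots): bool
{
    foreach ($robots as $needle) {
        // stripos() is case-insensitive and returns false when nothing matches.
        if ($needle !== '' && stripos($userAgent, $needle) !== false) {
            return true;
        }
    }
    return false;
}

// Shortened list, for illustration only.
$robots = array('googlebot', 'bingbot', 'yandexbot', 'curl', 'wget');

var_dump(isLikelyBot('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', $robots)); // bool(true)
var_dump(isLikelyBot('Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/115.0', $robots)); // bool(false)
```

With plain substring matching, very short or generic keys (for example 'host', 'php' or 'obot') will match any user agent that happens to contain those letters, so such keys may be worth reviewing before use.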

SANEK-LITWIN commented 3 years ago

I found out how to update the array of bots. Open the file browscap.php (path: /modules/base/classes/), find the function robotRegexCheck(), and add your keys to the $robots array.
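
The body of robotRegexCheck() is not shown in this thread, so the following is only a sketch of how extra keys might be merged into such an array and turned into one regular expression. The names $customRobots, $allRobots and matchesRobotPattern() are hypothetical, and the real function in browscap.php may work differently.

```php
<?php
// Hypothetical sketch; the real robotRegexCheck() in browscap.php may differ.
$robots = array('googlebot', 'bingbot', 'yandexbot');

// Custom keys collected from your own logs (example values from this thread).
$customRobots = array('petalbot', 'mail.ru_bot', 'Googlebot');

// Merge both lists, lowercase every key, and drop duplicates.
$allRobots = array_values(array_unique(array_map('strtolower', array_merge($robots, $customRobots))));

// Escape regex metacharacters (e.g. the dot in "mail.ru_bot") before joining.
$escaped = array_map(function ($key) {
    return preg_quote($key, '/');
}, $allRobots);
$pattern = '/' . implode('|', $escaped) . '/i';

// Returns true if the user agent matches any key in the combined pattern.
function matchesRobotPattern(string $userAgent, string $pattern): bool
{
    return preg_match($pattern, $userAgent) === 1;
}

var_dump(matchesRobotPattern('Mozilla/5.0 (compatible; PetalBot)', $pattern)); // bool(true)
```

Escaping with preg_quote() matters for keys that contain characters such as '.' or '/', which would otherwise be interpreted as regex syntax.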