Closed: doxycomp closed 6 days ago
Just in case somebody else is struggling with this: for the moment, we added some lines to the top of our .htaccess to block the Facebook/Meta crawlers:
RewriteCond %{HTTP_USER_AGENT} ^facebookexternalhit. [OR]
RewriteCond %{HTTP_USER_AGENT} ^meta-externalagent.
RewriteRule .* - [F,L]
Kind regards :)
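For anyone adapting the .htaccess lines above, the two conditions could probably also be folded into a single one. This is only a sketch under assumptions, not the original poster's configuration: it assumes mod_rewrite is enabled and that .htaccess overrides are allowed for the directory. Unlike the original rules, it adds the NC flag (case-insensitive matching) and drops the trailing dot, so it does not require a character after the crawler name.

```apache
# Sketch only: assumes mod_rewrite is enabled and .htaccess overrides are allowed.
RewriteEngine On

# Match either crawler name at the start of the User-Agent header.
# NC makes the match case-insensitive (an addition, not in the original rules).
RewriteCond %{HTTP_USER_AGENT} ^(facebookexternalhit|meta-externalagent) [NC]

# Answer matching requests with 403 Forbidden; the F flag already implies L.
RewriteRule ^ - [F]
```

Because the alternation sits in one pattern, the [OR] flag is no longer needed. Any further crawler name would be added inside the parentheses, separated by another `|`.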
To whomever added this bot as 'good'.. are you serious?
./globalblacklist.conf:BrowserMatchNoCase "(?:\b)developers.facebook.com(?:\b)" good_bot
./globalblacklist.conf:BrowserMatchNoCase "(?:\b)facebookexternalhit(?:\b)" good_bot
./globalblacklist.conf:BrowserMatchNoCase "(?:\b)facebookplatform(?:\b)" good_bot
Many people use Facebook's ad platforms, and if these crawlers are blocked you cannot track your ad stats or clicks. You are welcome to block them yourself, as the blocker is designed to allow: simply add them to the custom block lists included in the project, which will take you all of 30 seconds and will override this setting.
Removal Request?
Please List the User-Agent string or Referrer to be added/removed
/globalblacklist.conf:BrowserMatchNoCase "(?:\b)facebookexternalhit(?:\b)" good_bot
Facebook is not a good bot: it ignores robots.txt and spams the server with requests.
For Additions: Please include a log sample (3-5 lines is adequate)
It defaults to good_bot in globalblacklist.conf and seemingly cannot be set as bad_bot in blacklist-user-agents.conf.
Thank you!