neocities / neocities

Neocities.org - the web site. Yep, the backend is open source!
https://neocities.org

Excessive accesses from Singapore #523

Closed prino closed 4 months ago

prino commented 4 months ago

My website deals with hitchhiking, and Singapore is not known as a country with a big (if any?) hitchhiking culture. The pages contain a link to "flagcounter.com" to count visits, and after a short lull, visits from Singapore are again exceeding all normal traffic, even though I have (hopefully) blocked every bot/crawler that operates from Singapore in my robots.txt, included below.

User-agent: ia_archiver
Disallow: /

User-agent: AhrefsBot
Disallow: / 

User-agent: AhrefsSiteAudit
Disallow: / 

User-agent: CCBot
Disallow: /

User-agent: GPTBot
Disallow: / 

User-agent: FacebookBot
Disallow: / 

User-agent: OmgiliBot
Disallow: / 

User-agent: BingBot
Disallow: / 

User-agent: Baiduspider
Disallow: /

User-agent: AspiegelBot
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: Bytedance
Disallow: /

Can you please do something about this? And if there are bots I've missed, and I'm sure there are, feel free to add them in the replies!
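For anyone who wants to confirm that rules like the ones above parse as intended, here is a minimal sketch using Python's standard `urllib.robotparser`. The shortened `ROBOTS_TXT` sample, the `blocked_agents` helper, and the `/index.html` test path are assumptions made for illustration, not part of the original report. Keep in mind that robots.txt is advisory: it only affects crawlers that choose to honor it, so a passing check says nothing about what a misbehaving bot will actually do.

```python
from urllib.robotparser import RobotFileParser

# Shortened sample in the same style as the robots.txt above (illustrative only).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Bytespider
Disallow: /
"""


def blocked_agents(robots_txt, agents, url="/index.html"):
    """Return the agents from `agents` that the rules forbid from fetching `url`."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    # Googlebot has no matching rule in the sample, so it is not reported as blocked.
    print(blocked_agents(ROBOTS_TXT, ["GPTBot", "Bytespider", "Googlebot"]))
```

Pasting the full robots.txt into `ROBOTS_TXT` and listing every user agent of interest shows which names are covered and which still slip through.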

kyledrake commented 4 months ago

Hello,

This is not something we can really address, because we don't block traffic to sites and I don't have any way to stop countries or locations from accessing your site. It's quite common for sites to be indexed, spidered, or even to become popular with users in unexpected locations, so I wouldn't worry too much about it. I'm going to close this because there's nothing I can really do about it, but thanks for the report.

prino commented 4 months ago

Kyle,

You could at least allow users to look at the logs detailing access to their own sites, so that they might be able to use robots.txt to block these spiders. Probably 90 to 95% of my traffic is organic, and suddenly some effing Singaporean bot starts screwing things up!

Or even provide a ready-to-run robots.txt (sample-robots.txt) that lists all bots, and people can comment out those that they are happy to accept? After all, you've also suddenly added an ".attrib_store" directory...

Robert
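The "ready-to-run sample-robots.txt" idea could be sketched roughly as follows: every known bot is listed and blocked by default, and the bots a site owner is happy to accept are emitted as commented-out entries. The function name `build_robots_txt`, its parameters, and the two-bot example are hypothetical; Neocities does not provide anything like this as far as this thread shows.

```python
def build_robots_txt(bots, accepted=()):
    """Build robots.txt text that blocks every bot in `bots`.

    Bots named in `accepted` (case-insensitive) are written as commented-out
    entries, so they remain visible in the file but are not blocked.
    """
    accepted_lower = {name.lower() for name in accepted}
    blocks = []
    for bot in bots:
        lines = [f"User-agent: {bot}", "Disallow: /"]
        if bot.lower() in accepted_lower:
            # Comment out both lines of the entry for an accepted bot.
            lines = ["# " + line for line in lines]
        blocks.append("\n".join(lines))
    # Blank line between entries, single trailing newline at the end.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(["GPTBot", "BingBot"], accepted=["BingBot"]))
```

The bot list itself would still have to be maintained by hand, and the result only helps against crawlers that respect robots.txt.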

--
Robert AH Prins
robert(a)prino(d)org
The hitchhiking grandfather: https://prino.neocities.org/index.html
Some REXX code for use on z/OS: https://prino.neocities.org/zOS/zOS-Tools.html