JayBizzle / Crawler-Detect

🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent
https://crawlerdetect.io
MIT License

33 new possibilities #351

Closed rentalhost closed 1 year ago

rentalhost commented 4 years ago

I have found some new possibilities, but I am not sure which of them should be added, or even whether each one is really a crawler. So I am creating this issue with checkboxes for you to decide. I will make a PR after the decision.

Dangerous possibility (e.g. cracker):

The quoted user agents are exactly as they were received by my logger.

High probability:

The Microsoft Office user agent appears when you copy content from a web page into an Office document; in some cases Office then downloads content from the original page. The same applies to Mashup.

Medium probability:

I couldn't find out what these mean, but they don't seem to be real browsers.

Low probability:

I don't know what CFNetwork means, but it is related to Apple. Outlook and OC are very similar to the Microsoft Office case.
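To make the tiers above concrete, here is a minimal sketch of how a logger could tag incoming user agents by probability tier before anything is proposed for the library. The `CANDIDATE_TOKENS` constant and the `candidateTier()` function are hypothetical names, and the tokens are only the ones named in this comment, not the full list of 33 user agents. It is not part of CrawlerDetect.

```php
<?php
declare(strict_types=1);

// Hypothetical tiers built only from tokens named in this issue.
// The full list of 33 user agents is not reproduced here.
const CANDIDATE_TOKENS = [
    'high' => ['Microsoft Office', 'Mashup'],
    'low'  => ['CFNetwork', 'Outlook'],
];

/**
 * Returns the tier of the first candidate token found in the user agent,
 * or null when no candidate token is present.
 */
function candidateTier(string $userAgent): ?string
{
    foreach (CANDIDATE_TOKENS as $tier => $tokens) {
        foreach ($tokens as $token) {
            // Case-insensitive substring match; tiers are checked in declaration order.
            if (stripos($userAgent, $token) !== false) {
                return $tier;
            }
        }
    }

    return null;
}
```

Because tiers are checked in declaration order, a user agent that contains both `Microsoft Office` and `Outlook` is reported as `high`.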

JayBizzle commented 4 years ago

Thanks for this. Go ahead and create a PR for the agents I have ticked. I will look into the others at a later date 👍

mtshare commented 4 years ago

Please also add the "LieBaoFast" Chinese scraper.

Take a look at this: https://www.johnlarge.co.uk/blocking-aggressive-chinese-crawlers-scrapers-bots/
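For anyone preparing that change, the sketch below shows the general idea of adding a token to a pattern list and matching against it. It assumes the patterns are regex fragments joined into a single case-insensitive alternation; `compilePatterns()` and `matchesAnyPattern()` are hypothetical helpers, the existing entries are placeholders, and the sample user agent is made up for illustration. Check the library's own fixture file for the real list and format.

```php
<?php
declare(strict_types=1);

/**
 * Joins regex fragments into one case-insensitive alternation.
 * Assumes each fragment is already valid regex and contains no "/" delimiter.
 */
function compilePatterns(array $patterns): string
{
    return '/(' . implode('|', $patterns) . ')/i';
}

/** Returns true when the user agent matches at least one pattern. */
function matchesAnyPattern(array $patterns, string $userAgent): bool
{
    return preg_match(compilePatterns($patterns), $userAgent) === 1;
}

// Placeholder entries plus the addition requested in this comment.
$patterns = ['Googlebot', 'bingbot'];
$patterns[] = 'LieBaoFast';

// Made-up user agent for illustration only.
$userAgent = 'Mozilla/5.0 (Linux; Android 7.0) Mobile Safari/537.36 LieBaoFast/4.51.3';

var_dump(matchesAnyPattern($patterns, $userAgent));
```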

JayBizzle commented 4 years ago

@mtshare would you like to submit a PR to add that bot?

Abhirup-99 commented 4 years ago

Is the corresponding PR merged?

JayBizzle commented 4 years ago

Is the corresponding PR merged?

No PR was ever submitted 😔

rentalhost commented 4 years ago

Sorry, I ended up waiting for the analysis of the other user agents before sending the PR. Anyway, if someone else can send the PR, I would appreciate it (I'm a bit tied up at the moment).

Abhirup-99 commented 4 years ago

If no one is assigned, can I push a PR?

Abhirup-99 commented 4 years ago

Should the PR take into consideration all the user agents listed here?

JayBizzle commented 4 years ago

Just the ones that have been ticked 👍🏻