omrilotan / isbot

🤖/👨‍🦰 Detect bots/crawlers/spiders using the user agent string
https://isbot.js.org/
The Unlicense

PhantomJS was not recognised #208

Closed by rigens 1 year ago

rigens commented 1 year ago

User Agent String

Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.0.0 Safari/538.1

Reproduce

Node version v14.21.3

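const isbot = require('isbot'); // isbot 3.x, the major version under discussion
const ua = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.0.0 Safari/538.1';
// Result differs by version: 3.6.1 prints true, 3.6.6 prints false
console.log(isbot(ua));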

Is this library really production ready?

omrilotan commented 1 year ago

Sure, we can support this substring. Just remember that PhantomJS can be configured to send any user agent string (the default string is "phantomjs"), so we can only detect clients that actually include PhantomJS in the user agent string.

omrilotan commented 1 year ago

Would you care to review it? #209

rigens commented 1 year ago

Would you care to review it? https://github.com/omrilotan/isbot/pull/209

Yes, that's exactly what we need.

It is unclear why the pattern had to change from phantom to ^phantom at all...

omrilotan commented 1 year ago

Since the default user agent string of PhantomJS is "phantomjs", it is more efficient to anchor the match to the start of the string.

This repository works with 4 different databases to stay up to date and has rules to keep efficiency optimal. This is done so we catch all the popular bot user agent strings without sacrificing speed.
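To illustrate the difference under discussion, here is a minimal sketch comparing an anchored and an unanchored pattern against the two user agent strings mentioned in this thread. The two regular expressions are assumptions written for illustration; they are not copied from the isbot pattern list.

```javascript
// Hypothetical patterns for illustration only; not taken from the isbot source.
const anchored = /^phantom/i;   // matches only when the string starts with "phantom"
const unanchored = /phantom/i;  // matches "phantom" anywhere in the string

// The user agent string reported in this issue, and the bare string from the discussion.
const reportedUa =
  'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.0.0 Safari/538.1';
const bareUa = 'phantomjs';

const results = {
  anchoredReported: anchored.test(reportedUa),     // false: the string starts with "Mozilla"
  unanchoredReported: unanchored.test(reportedUa), // true: "PhantomJS" appears mid-string
  anchoredBare: anchored.test(bareUa),             // true
  unanchoredBare: unanchored.test(bareUa),         // true
};

console.log(results);
```

The anchored form can reject a non-matching string after looking only at its first characters, which is the efficiency argument, but it misses any string where the token appears after a browser-style prefix.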

I'd be happy to know where you encountered this user agent string.

romis2012 commented 1 year ago

We have the same problem. Please publish a new release.

...it is more efficient to check it from the start

First of all, you need to ensure correctness and backward compatibility, and only then take care of performance.

omrilotan commented 1 year ago

Correctness is ensured across 5 databases, totaling 32,614 crawler user-agent strings.
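As a rough sketch of how a backward-compatibility check of this kind could look, the snippet below runs a candidate pattern against a list of user agent strings that should keep matching and reports the ones it misses. The fixture list, the `findRegressions` helper, and both patterns are hypothetical and exist only for this example; they do not reflect how the isbot test suite is actually organised.

```javascript
// Hypothetical fixture: user agent strings that are expected to be detected as bots.
const knownBotUserAgents = [
  'phantomjs',
  'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.0.0 Safari/538.1',
];

// Hypothetical helper: returns the fixture entries that the given pattern fails to match.
function findRegressions(pattern, userAgents) {
  return userAgents.filter((ua) => !pattern.test(ua));
}

// Compare an anchored and an unanchored candidate pattern (both illustrative).
const missedByAnchored = findRegressions(/^phantom/i, knownBotUserAgents);
const missedByUnanchored = findRegressions(/phantom/i, knownBotUserAgents);

console.log('anchored misses:', missedByAnchored.length);
console.log('unanchored misses:', missedByUnanchored.length);
```

A check like this only catches strings that are present in the fixture list, so a user agent string that no database contains can still slip through a pattern change.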