Closed pro2s closed 2 years ago
It also fixes an issue at this line https://github.com/JefferyHus/es6-crawler-detect/blob/451daf91effdcf13d13ba787ef40569c8dce010c/src/lib/crawler.js#L103 that occurs when the request has no user agent header, e.g. curl -A "" https://...
Can you please share your code so I can base that on some test samples?
Hi @JefferyHus, I added an additional test, and it fails with the current version of Crawler.
it('should identify the crawler from request headers with exact pattern', async () => {
  crawler = new Crawler({
    headers: { 'user-agent': 'b0t', accept: '*/*' },
  });
  assert.strictEqual(crawler.isCrawler(), true);
});
In my code we use Crawler like this:
const detector = new Crawler(request);
if (detector.isCrawler()) {
  console.log('bot');
} else {
  console.log('user');
}
And currently, when I check the server response with curl:
For curl -I -A "" http://localhost:3000
I get an error, because in this case userAgent is undefined:
TypeError: Cannot read properties of undefined (reading 'replace')
    at Crawler.isCrawler (/node_modules/es6-crawler-detect/src/lib/crawler.js:103:19)
For curl -I -A b0t http://localhost:3000
I get user in the console, even though we have a ^b0t$ rule.
For curl -I http://localhost:3000
I get bot in the console, because a curl rule is matched.
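The three curl cases above can be reproduced without the library. The following is only a minimal sketch: getUserAgent, looksLikeBot and examplePatterns are hypothetical names invented for illustration, not part of es6-crawler-detect, and treating a missing user agent as "not a bot" is an assumption, not the library's policy.

```javascript
// Hypothetical helpers for illustration; not the es6-crawler-detect API.

// Return the user-agent header as a string, or '' when it is missing,
// so that later string operations never run on undefined.
function getUserAgent(request) {
  const headers = (request && request.headers) || {};
  const value = headers['user-agent'];
  return typeof value === 'string' ? value : '';
}

// Check the user agent against a list of patterns.
function looksLikeBot(request, patterns) {
  const userAgent = getUserAgent(request).trim();
  for (const pattern of patterns) {
    if (pattern.test(userAgent)) {
      return true;
    }
  }
  return false;
}

// Assumed stand-ins for the exact-match rule and the curl rule.
const examplePatterns = [/^b0t$/, /curl/i];

// curl -I -A "" ...   : no user-agent header at all, no TypeError
console.log(looksLikeBot({ headers: { accept: '*/*' } }, examplePatterns));
// curl -I -A b0t ...  : matched by the exact ^b0t$ rule
console.log(looksLikeBot({ headers: { 'user-agent': 'b0t' } }, examplePatterns));
// curl -I ...         : default curl user agent matched by the curl rule
console.log(looksLikeBot({ headers: { 'user-agent': 'curl/8.0.1' } }, examplePatterns));
```

Whether an empty user agent should count as a bot is a policy decision; the sketch only shows that the check can complete without throwing.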
Thanks, I will check this out. From the code, I would just prefer to see a for... instead of map; one reason is so that the event loop waits for the loop to resolve before continuing to the next tick.
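Separately from the event-loop point, one practical difference between the two constructs is that a plain loop can stop at the first matching pattern, while map always visits every element. A minimal sketch, assuming the patterns are tested one at a time (samplePatterns and the counters are hypothetical and do not reflect the library's code):

```javascript
// Hypothetical pattern list for illustration only.
const samplePatterns = [/^b0t$/, /curl/i, /spider/i];
const sampleAgent = 'b0t';

// map runs the callback for every pattern, even after a match is found.
let mapCalls = 0;
const mapResults = samplePatterns.map((pattern) => {
  mapCalls += 1;
  return pattern.test(sampleAgent);
});
const mapMatched = mapResults.includes(true);

// for...of can break out as soon as one pattern matches.
let loopCalls = 0;
let loopMatched = false;
for (const pattern of samplePatterns) {
  loopCalls += 1;
  if (pattern.test(sampleAgent)) {
    loopMatched = true;
    break;
  }
}

console.log({ mapCalls, loopCalls, mapMatched, loopMatched });
```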
Ok, I will try to use for... instead of map.
Fixed map and added a test for an empty user agent header.
The user agent string gets undefined prepended and a space appended when the user agent is read from the headers while creating a crawler detector instance. For example, 'user-agent': 'b0t' becomes undefinedb0t followed by a space.
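One way such a string can arise in JavaScript is appending to a variable that was never initialised. The sketch below is a guess at the mechanism, not the library's actual code; buildUserAgentBuggy and buildUserAgentFixed are hypothetical names. It also shows why an exact rule such as ^b0t$ stops matching.

```javascript
// Hypothetical reconstruction of the symptom; not taken from es6-crawler-detect.
function buildUserAgentBuggy(headers) {
  let userAgent; // never initialised, so its value is undefined
  // undefined + 'b0t ' is coerced to the string 'undefinedb0t '
  userAgent += headers['user-agent'] + ' ';
  return userAgent;
}

// Starting from an empty string and trimming avoids both artefacts.
function buildUserAgentFixed(headers) {
  let userAgent = '';
  userAgent += headers['user-agent'] || '';
  return userAgent.trim();
}

const exactRule = /^b0t$/;
const requestHeaders = { 'user-agent': 'b0t' };

console.log(JSON.stringify(buildUserAgentBuggy(requestHeaders)));
console.log(exactRule.test(buildUserAgentBuggy(requestHeaders)));
console.log(exactRule.test(buildUserAgentFixed(requestHeaders)));
```

With the prefix and trailing space in place, the anchors ^ and $ can no longer line up with b0t, which is consistent with the exact-pattern test failing.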