chrisakroyd / robots-txt-parser

A lightweight robots.txt parser for Node.js with support for wildcards, caching and promises.
MIT License
12 stars, 8 forks

canCrawl() returns incorrect value for middle paths #15

Open ssaumya1711 opened 8 months ago

ssaumya1711 commented 8 months ago

I am testing with the following URL: https://play.google.com/store/apps/details?id=com.trainerize.evbcoaching

According to https://play.google.com/robots.txt, "/store/apps/" is allowed. The file does contain the rule `Disallow: /apps`, but that rule should only apply to paths that begin with "/apps", not to paths that contain "/apps" in the middle. Despite this, I get Crawlable: false rather than the expected true. Can you please check the issue?

Per Google's interpretation of robots.txt (https://developers.google.com/search/docs/crawling-indexing/robots/robots_txt#syntax):

[Screenshot 2024-01-14 at 11 37 32 PM]
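
To make the expected behaviour concrete, here is a minimal sketch of prefix-based rule matching, assuming rules are anchored at the start of the URL path and that `*` and `$` are the only special characters. The helpers `ruleToRegExp` and `ruleMatches` are hypothetical names used for illustration. They are not part of robots-txt-parser and do not reflect how the library is implemented.

```javascript
// Hypothetical helpers for illustration only; not the library's API.

// Convert a robots.txt rule path into a RegExp anchored at the start of the
// URL path. Assumes '*' matches any run of characters and a trailing '$'
// anchors the end of the path.
function ruleToRegExp(rulePath) {
  const anchoredAtEnd = rulePath.endsWith('$');
  const body = anchoredAtEnd ? rulePath.slice(0, -1) : rulePath;
  const pattern = body
    .split('*')
    // Escape regex metacharacters in the literal parts of the rule.
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp('^' + pattern + (anchoredAtEnd ? '$' : ''));
}

// True when the rule applies to the given URL path (path plus query string).
function ruleMatches(rulePath, urlPath) {
  return ruleToRegExp(rulePath).test(urlPath);
}

const urlPath = '/store/apps/details?id=com.trainerize.evbcoaching';

console.log(ruleMatches('/apps', urlPath));          // false: path starts with /store
console.log(ruleMatches('/apps', '/apps/details'));  // true: path starts with /apps
```

Under these assumptions, `Disallow: /apps` does not match the URL from this report, so that rule alone should not make it uncrawlable.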