ashley-evans / how-many-buzzwords

Some buzzwords are incredibly overused; a simple tool to find the biggest culprits

Update crawl service to map subdomains to same domain in results #381

Open ashley-evans opened 2 years ago

ashley-evans commented 2 years ago

Value Added

Consolidates results for different subdomains against the same overall domain. Enables crawling of links that are on different subdomains.

Description

Currently the crawl service only crawls pages that are on the exact same hostname as the provided base URL. Any links on a site that reference a different subdomain (www. etc.) are therefore not crawled.

The crawl service should be updated to crawl any page that is on the same domain as the base URL. Crawls against different subdomains should update the known URLs for the overall domain in DynamoDB.

Acceptance Criteria

AC01

AC02

AC03

ashley-evans commented 2 years ago

Can use https://www.npmjs.com/package/tldts to obtain the domain name from a URL.
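
A rough sketch of how the same-domain check might look, not the crawl service's actual code. `DomainExtractor`, `naiveGetDomain` and `filterSameDomain` are hypothetical names made up for illustration. The extractor is passed in so that `getDomain` from tldts could be supplied; the naive default only exists so the sketch runs without dependencies.

```typescript
// Hypothetical type: any function that maps a URL to its registrable domain.
// tldts' getDomain should fit here: import { getDomain } from "tldts";
type DomainExtractor = (url: string) => string | null;

// Naive stand-in (assumption, illustration only). It keeps the last two
// hostname labels, so it is wrong for multi-part suffixes such as "co.uk".
const naiveGetDomain: DomainExtractor = (url) => {
    try {
        const labels = new URL(url).hostname.split(".");
        return labels.length >= 2 ? labels.slice(-2).join(".") : null;
    } catch {
        return null; // not an absolute URL
    }
};

// Returns the links that share the base URL's domain, regardless of subdomain.
function filterSameDomain(
    baseUrl: string,
    links: string[],
    getDomain: DomainExtractor = naiveGetDomain
): string[] {
    const baseDomain = getDomain(baseUrl);
    if (baseDomain === null) {
        return [];
    }
    return links.filter((link) => getDomain(link) === baseDomain);
}
```

With this shape, `filterSameDomain("https://example.com/", links, getDomain)` would keep links on `www.example.com` and `blog.example.com` while dropping other domains. The same extracted domain could serve as the key when updating the known URLs, though how that maps onto the DynamoDB table is not covered here.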