closes #1
Problem: Implement a web crawler that can discover and read pages by branching out from a given seed URL. The crawler should not crawl the same page more than once.
Fix: Implemented Crawler, Parser, and Page classes. Crawler traverses web pages in BFS order starting from a root URL and skips any URL it has already visited. Parser reads the page and identifies
and tagged content. Page represents a web page and contains the root URL, the page data, and linked Page objects.
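For reviewers, the traversal described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the `crawl` function, its `fetch` and `extract_links` parameters, the `max_pages` limit, and the field names on `Page` are all assumptions made for the example. Fetching and link extraction are passed in as plain callables so the sketch runs without network access.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Page:
    """Hypothetical stand-in for the PR's Page class."""
    url: str
    data: str = ""
    # repr=False keeps printing readable when pages link to each other.
    links: List["Page"] = field(default_factory=list, repr=False)


def crawl(
    root_url: str,
    fetch: Callable[[str], str],
    extract_links: Callable[[str], Iterable[str]],
    max_pages: int = 100,
) -> List[str]:
    """Visit pages breadth-first from root_url, fetching each URL at most once.

    Returns the URLs in the order they were fetched.
    """
    pages: Dict[str, Page] = {root_url: Page(root_url)}  # every URL seen so far
    queue = deque([root_url])                            # FIFO queue gives BFS order
    order: List[str] = []

    while queue and len(order) < max_pages:
        url = queue.popleft()
        page = pages[url]
        page.data = fetch(url)
        order.append(url)

        for link in extract_links(page.data):
            if link not in pages:
                # First time this URL is seen: record it and schedule a visit.
                pages[link] = Page(link)
                queue.append(link)
            page.links.append(pages[link])

    return order
```

Marking a URL as seen when it is enqueued, not when it is dequeued, keeps the same URL from entering the queue twice when several pages link to it. A quick check is to call `crawl` with an in-memory dictionary of pages and a `fetch` that records each call, then confirm no URL is fetched more than once.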
Testing: N/A
Notes: N/A