1112zakaria / web-crawler

Basic web crawler

Implement basic web crawler #2

Closed 1112zakaria closed 1 year ago

1112zakaria commented 1 year ago

closes #1

Problem: Implement a web crawler that can discover and read pages by branching out from a given seed URL. The crawler should not crawl the same page more than once.

Fix: Implemented Crawler, Parser, and Page classes. Crawler traverses web pages in BFS order starting from a root URL. Parser reads the page and identifies tagged content. Page represents a web page and contains the root URL, the page data, and linked page objects.
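For reviewers, here is a minimal sketch of how the three classes could fit together. It is not the code in this PR. The field names (`url`, `data`, `links`), the `fetch` callable, the `max_pages` cap, and the choice to follow `href` attributes on `<a>` tags are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import Callable, Dict, List
from urllib.parse import urldefrag, urljoin


@dataclass(eq=False)  # identity comparison, so cyclic link graphs are safe
class Page:
    """A web page: its URL, its raw data, and the pages it links to."""
    url: str
    data: str = ""
    links: List["Page"] = field(default_factory=list)


class Parser(HTMLParser):
    """Collects absolute link targets from <a href="..."> tags (assumed tag)."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop "#fragment" suffixes.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.hrefs.append(absolute)


class Crawler:
    """Breadth-first crawler that fetches each URL at most once."""

    def __init__(self, fetch: Callable[[str], str], max_pages: int = 100) -> None:
        self.fetch = fetch          # hypothetical hook: URL -> page text
        self.max_pages = max_pages  # hypothetical safety cap

    def crawl(self, seed_url: str) -> List[Page]:
        root = Page(seed_url)
        seen: Dict[str, Page] = {seed_url: root}  # every URL ever queued
        queue = deque([root])
        visited: List[Page] = []                  # pages in BFS visit order
        while queue:
            page = queue.popleft()
            page.data = self.fetch(page.url)
            visited.append(page)
            parser = Parser(page.url)
            parser.feed(page.data)
            for href in parser.hrefs:
                child = seen.get(href)
                if child is None:
                    if len(seen) >= self.max_pages:
                        continue
                    child = Page(href)
                    seen[href] = child
                    queue.append(child)  # queued once, so crawled once
                if child not in page.links:
                    page.links.append(child)
        return visited
```

Passing `fetch` in as a parameter keeps the traversal testable without network access: a test can supply a function backed by an in-memory dictionary of pages, while real use could wrap `urllib.request.urlopen`.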

Testing: N/A

Notes: N/A