viragumathe5 / Web-crawler-for-wayback-machine

This repository contains the patch submitted to the Internet Archive for GSoC 2020.
Apache License 2.0

Web-crawler-for-wayback-machine


crawler.py

This file contains a scraper for the Pixar website, which is used here as an example site. The script parses the page's HTML and extracts only the images.
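The parse-and-extract step described above might look roughly like the sketch below. This is not the code in crawler.py; it is a minimal illustration that assumes the standard-library `html.parser` module is enough for the job. The names `ImageCollector` and `extract_image_urls`, the sample HTML, and the `example.com` URLs are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Hypothetical helper: records the src of every <img> tag it sees."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # Ignore every tag except <img>.
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the page URL.
            self.image_urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    """Return absolute image URLs found in an HTML string."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    return parser.image_urls


# Placeholder markup standing in for a downloaded page.
sample_html = (
    '<html><body><img src="/img/a.png"><p>text</p>'
    '<img src="https://cdn.example.com/b.jpg"><img alt="no source">'
    "</body></html>"
)
urls = extract_image_urls(sample_html, "https://example.com/films/")
print(urls)
```

In a real crawl the HTML string would come from an HTTP response rather than a literal, and each collected URL would then be downloaded or handed to an archiving step.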