josephleblanc / web_crawler

Crawl a given royalroad seed page for story text
Add functionality - update downloaded files #3

Open josephleblanc opened 2 years ago

josephleblanc commented 2 years ago

Now that the program can crawl multiple pages, a good next step is to make it possible to update already-downloaded files without redownloading everything in /web_novels/. This will involve:

It might be better to store a last_crawled variable in the page template than to compare against a list of crawled pages, so that an update only runs when there is a new link to follow from the last crawled page to the next one. This will definitely work for Royal Road, but I'm unsure how well it will generalize.
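
A rough sketch of the last_crawled idea, written in Python purely for illustration (the project's actual language, its page template format, and every name below other than last_crawled are assumptions, not the real code). The `next_link` callable stands in for whatever fetches a page and extracts its "next chapter" URL, returning `None` when there is none:

```python
from typing import Callable, Dict, Iterator, List, Optional


def new_pages(
    last_crawled: str,
    next_link: Callable[[str], Optional[str]],
) -> Iterator[str]:
    """Yield the URLs that come after last_crawled by following next links.

    next_link is a hypothetical helper: given a page URL, it returns the URL
    of the following page, or None if no such link exists yet.
    """
    seen = {last_crawled}  # guard against a site that links back in a loop
    url = next_link(last_crawled)
    while url is not None and url not in seen:
        seen.add(url)
        yield url
        url = next_link(url)


def update_story(
    state: Dict[str, str],
    next_link: Callable[[str], Optional[str]],
) -> List[str]:
    """Return only the pages that are new, and advance state["last_crawled"].

    state is a hypothetical stand-in for the per-story page template.
    """
    fetched = list(new_pages(state["last_crawled"], next_link))
    if fetched:
        state["last_crawled"] = fetched[-1]
    return fetched


# Usage with a fake site: chapter 1 was crawled earlier, 2 and 3 are new.
links = {"ch1": "ch2", "ch2": "ch3", "ch3": None}
state = {"last_crawled": "ch1"}
first_run = update_story(state, links.get)
second_run = update_story(state, links.get)
```

With this shape, an update costs one request when nothing new has been posted, since only the last crawled page is checked for a next link. It does rely on every page linking to its successor, which is the part that may not generalize beyond Royal Road.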