Figure out how to handle scraping of multiple pages

AmerPandzo / scrapmeister

Scrap a webpage free.


Figure out how to handle scraping of multiple pages #1

Open AmerPandzo opened 4 years ago

AmerPandzo commented 4 years ago

We need to figure out a way to handle scraping of multiple pages. The first and most important case for us is handling paginated pages.

One idea is to scrape a page for a link that contains the current page number + 1. Save that link, or the ParseRule and TargetWebsite, while scraping, and then start a scrape for that TargetWebsite.
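A minimal sketch of that idea, in Python with only the standard library. The names `LinkCollector`, `find_next_page_link`, and `crawl_paginated` are hypothetical and are not part of this project. The sketch also assumes that the start URL is page 1 and that the page number appears as a standalone number somewhere in the link. It does not model ParseRule or TargetWebsite.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_next_page_link(html, base_url, current_page):
    """Return the absolute URL of the first link containing current_page + 1, or None."""
    collector = LinkCollector()
    collector.feed(html)
    # Match the number only as a whole token, so page 2 does not match "12" or "20".
    pattern = re.compile(r"(?<!\d)%d(?!\d)" % (current_page + 1))
    for href in collector.hrefs:
        if pattern.search(href):
            return urljoin(base_url, href)
    return None


def crawl_paginated(start_url, fetch, max_pages=50):
    """Follow next-page links from start_url; fetch(url) must return HTML text."""
    visited = []
    url, page = start_url, 1  # assumption: the start URL is page 1
    while url is not None and url not in visited and len(visited) < max_pages:
        visited.append(url)
        html = fetch(url)
        url = find_next_page_link(html, url, page)
        page += 1
    return visited
```

Passing `fetch` in as a parameter keeps the link-following logic independent of how pages are downloaded, so the same loop could sit on top of an HTTP client or a browser driver. The number match is only a heuristic: a link such as `/v2/about` would also match when looking for page 2, so a real implementation would probably need a stricter, site-specific rule.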

AmerPandzo commented 3 years ago

IMPORTANT: This issue is probably too much for JSoup to handle. I will probably deprecate this project entirely and implement something similar using Python and Selenium (among others).