We need to figure out a way to handle scraping of multiple pages.
The first and most important case for us is paginated pages.
One idea is to scrape a page for a link that contains current page + 1 in its URL.
Save that link, or a ParseRule and TargetWebsite for it, while scraping, and then start a scrape for that TargetWebsite.
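A rough sketch of that idea, in Python using only the standard library. The names `find_next_page_link`, `crawl_pages` and the `fetch` callable are hypothetical and not part of this project; the sketch also assumes the page number appears as a standalone number in the link's `href`, which will not hold for every site (a digit elsewhere in the URL could match by accident).

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_next_page_link(html, base_url, current_page):
    """Return the absolute URL of the first link whose href contains
    current_page + 1 as a standalone number, or None if there is none."""
    collector = _LinkCollector()
    collector.feed(html)
    # Reject matches that are part of a longer number,
    # so looking for page 2 does not match "12" or "20".
    pattern = re.compile(r"(?<!\d)" + str(current_page + 1) + r"(?!\d)")
    for href in collector.hrefs:
        if pattern.search(href):
            return urljoin(base_url, href)
    return None


def crawl_pages(fetch, start_url, max_pages=50):
    """Follow next-page links starting at start_url.

    fetch is a hypothetical callable that takes a URL and returns HTML text.
    Returns the list of URLs visited, in order."""
    visited = []
    url, page = start_url, 1
    while url is not None and url not in visited and len(visited) < max_pages:
        visited.append(url)
        html = fetch(url)
        url = find_next_page_link(html, url, page)
        page += 1
    return visited
```

The `max_pages` cap and the `visited` check are there so that a site which links back to an earlier page cannot keep the loop running forever. In the real scraper, each discovered URL would presumably be stored as a new TargetWebsite rather than collected in a list.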
IMPORTANT
For JSoup this issue is probably too much to handle. I will probably deprecate this project entirely and implement something similar using Python and Selenium (among others).