jakopako / goskyr

A configurable command-line web scraper written in Go with auto-configuration capability
GNU General Public License v3.0

Improve speed of DynamicFetcher.Fetch for pagination #253

Open jakopako opened 7 months ago

jakopako commented 7 months ago

Currently, for dynamic websites, fetching page number N means the scraper loads the initial URL and then "clicks" the next-page button N times: page 0 takes 0 clicks, page 1 takes 1 click, page 2 takes 2 clicks, and so on. Fetching pages 0 through N one after another therefore costs N(N+1)/2 clicks in total, and since there is a short delay after each click, fetching many pages can take quite some time.

This can probably be improved if we store the intermediate state between fetches somehow. Then each page would need only a single "click" on the next-page button.
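
To make the difference concrete, here is a minimal sketch that simulates both approaches with a fake browser session and counts loads and clicks. Everything in it (`session`, `fetchStateless`, `cachedFetcher`) is a hypothetical stand-in invented for illustration; it does not reflect the actual `DynamicFetcher` code or any browser-automation API, and it assumes pages are requested in increasing order.

```go
package main

import "errors"

// session is a hypothetical stand-in for a live browser tab.
// It only tracks which page is shown and how much work was done.
type session struct {
	pos    int // index of the page currently displayed
	last   int // index of the last page that exists
	clicks int // total next-page clicks performed
	loads  int // total loads of the initial URL
}

// load simulates navigating to the initial URL (page 0).
func (s *session) load() {
	s.pos = 0
	s.loads++
}

// clickNext simulates one click on the next-page button.
func (s *session) clickNext() error {
	if s.pos >= s.last {
		return errors.New("no next page")
	}
	s.pos++
	s.clicks++
	return nil
}

// fetchStateless mirrors the behaviour described in this issue:
// reload the initial URL, then click n times to reach page n.
func fetchStateless(s *session, n int) (int, error) {
	s.load()
	for i := 0; i < n; i++ {
		if err := s.clickNext(); err != nil {
			return 0, err
		}
	}
	return s.pos, nil
}

// cachedFetcher keeps the session alive between calls (the
// "intermediate state") and only clicks the missing steps.
type cachedFetcher struct {
	s      *session
	loaded bool
}

// fetch moves the kept session forward to page n. It reloads only on
// the first call or when an earlier page than the current one is requested.
func (f *cachedFetcher) fetch(n int) (int, error) {
	if !f.loaded || n < f.s.pos {
		f.s.load()
		f.loaded = true
	}
	for f.s.pos < n {
		if err := f.s.clickNext(); err != nil {
			return 0, err
		}
	}
	return f.s.pos, nil
}
```

In this simulation, fetching pages 0 through 5 in order costs 6 loads and 15 clicks with `fetchStateless`, versus 1 load and 5 clicks with `cachedFetcher`. A real implementation would also have to decide when to discard the kept state, which this sketch leaves out.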