gobfink / Groceries

GNU General Public License v3.0

Implement spider for Lidl #16

Status: Closed (gobfink closed 4 years ago)

gobfink commented 4 years ago

Work on creating a spider to download webpages for Lidl

gobfink commented 4 years ago

Current plan:

Start by navigating to lidl/products.

Then collect all hrefs from the side menu panel, click on each one, and expand the "view more" section on each page.
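A rough sketch of the "collect all hrefs from the side menu" step, using only Python's standard-library `html.parser`. The `side-menu` class name, the `collect_menu_hrefs` helper, and the sample markup are all made up for illustration; the real Lidl page structure isn't described here and will need its own selector.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not change the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SideMenuLinkParser(HTMLParser):
    """Collect href values from <a> tags nested inside the menu container."""

    def __init__(self, menu_class):
        super().__init__()
        self.menu_class = menu_class  # hypothetical CSS class of the menu panel
        self.depth = 0                # 0 means we are outside the menu container
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth:
            self.depth += 1
            href = attrs.get("href")
            # Anchors without an href (e.g. a "view more" button) are skipped.
            if tag == "a" and href:
                self.hrefs.append(href)
        elif self.menu_class in (attrs.get("class") or "").split():
            self.depth = 1  # entered the menu container

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1


def collect_menu_hrefs(html, menu_class="side-menu"):
    """Return the hrefs found inside the element carrying `menu_class`."""
    parser = SideMenuLinkParser(menu_class)
    parser.feed(html)
    return parser.hrefs


# Made-up sample markup, not the real page.
SAMPLE = """
<div class="nav"><a href="/skip">x</a></div>
<ul class="side-menu">
  <li><a href="/products/bakery">Bakery</a></li>
  <li><a href="/products/dairy">Dairy</a></li>
  <li><a>View more</a></li>
</ul>
<a href="/outside">y</a>
"""
menu_links = collect_menu_hrefs(SAMPLE)
```

This assumes the markup is well nested. Links outside the menu container are ignored, which keeps header and footer navigation out of the crawl.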

gobfink commented 4 years ago

Got it working. The spider can now scrape groceries.

Basically, I created a Lua script that clicks the view-more button if it doesn't have an href, then scrolls to the bottom of the page. The spider then determines whether it has selected the first item in the list. If it has, it adds the URLs to a list. If it hasn't, it scrapes the items on the page. When it finishes a page, it pops the next item from the list, adds it to the processed list, and processes it.
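The queue handling described above could look roughly like this in plain Python. This is only a sketch: `fetch` is a hypothetical callable standing in for the rendered page (the Lua click-and-scroll part is not shown), the dictionary keys are invented, and I'm assuming "the list" means the side menu and that URLs are processed first-in, first-out.

```python
from collections import deque


def crawl(start_url, fetch):
    """Worklist crawl over category pages.

    `fetch(url)` is a hypothetical callable returning a dict with:
      'first_item_selected': bool, True when the first menu entry is active
      'menu_urls': list of URLs found in the side menu
      'items': list of scraped items on the page
    """
    pending = deque([start_url])  # URLs still to visit
    seen = {start_url}            # everything queued so far, to avoid repeats
    processed = []                # URLs in the order they were handled
    scraped = []

    while pending:
        url = pending.popleft()
        processed.append(url)
        page = fetch(url)
        if page["first_item_selected"]:
            # Landing view: queue the menu links instead of scraping.
            for link in page["menu_urls"]:
                if link not in seen:
                    seen.add(link)
                    pending.append(link)
        else:
            # Regular category view: scrape the items on the page.
            scraped.extend(page["items"])
    return processed, scraped


# Made-up site map used only to exercise the loop.
FAKE_SITE = {
    "/products": {"first_item_selected": True,
                  "menu_urls": ["/products", "/bakery", "/dairy"],
                  "items": []},
    "/bakery": {"first_item_selected": False,
                "menu_urls": ["/products", "/bakery", "/dairy"],
                "items": ["bread", "rolls"]},
    "/dairy": {"first_item_selected": False,
               "menu_urls": ["/products", "/bakery", "/dairy"],
               "items": ["milk"]},
}
processed, scraped = crawl("/products", FAKE_SITE.__getitem__)
```

The `seen` set keeps a URL from being queued twice when every page repeats the same side menu.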

It sets the headings based on the titles of the different options in the pane, and the subheading is based on the title of the active pane.