hhursev / recipe-scrapers

Python package for scraping recipes data
MIT License

IndexError for some recipes on thehappyfoodie.co.uk #594

Closed: JimmyStrings closed this issue 2 years ago

JimmyStrings commented 2 years ago

Recipe URLs that error:
https://thehappyfoodie.co.uk/recipes/ottolenghi-middle-eastern-mac-n-cheese-with-zaatar-pesto/
https://thehappyfoodie.co.uk/recipes/yotam-ottolenghis-chocolate-tarts-with-tahini/
https://thehappyfoodie.co.uk/recipes/oyster-mushroom-tacos-with-all-or-some-of-the-trimmings/

Attempted code:

from recipe_scrapers import scrape_me
scraper = scrape_me('https://thehappyfoodie.co.uk/recipes/ottolenghi-middle-eastern-mac-n-cheese-with-zaatar-pesto/')
print(scraper.ingredients())

Error Message: IndexError: list index out of range

Note: The problem seems to be that these pages contain multiple sub-headings in the ingredient list. Other recipes on this website have no sub-headings in the ingredient list, and the scraper works fine for them, e.g. https://thehappyfoodie.co.uk/recipes/lemon-courgette-linguine/

Python version: 3.8
Operating System: Windows 10

jayaddison commented 2 years ago

Thanks @JimmyStrings - nice finds!

Could you open one GitHub issue for each of the affected websites? Some of them look easier to handle than others, and it's useful to have details of each problem as an issue so that fixes in pull requests can be linked to them.

JimmyStrings commented 2 years ago

Hi @jayaddison - I've updated this issue so that it covers only one website. I'll open separate issues for the others.

jayaddison commented 2 years ago

Thanks again @JimmyStrings! Much appreciated :+1:

vabene1111 commented 2 years ago

Fixed by ignoring header and spacer rows in the ingredients table. A PR will follow tomorrow.
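
For readers following along, here is a minimal sketch of the idea, not the actual patch. It assumes the ingredients table has already been reduced to a list of rows, where each row is a list of cell texts. The row layout, the sample data, and the function names `parse_naive` and `parse_skipping_headers` are all hypothetical and are not taken from the recipe-scrapers code base or from the site's real markup.

```python
# Hypothetical rows from an ingredients table: each row is a list of cell texts.
# The real markup on the site may look different; this layout is an assumption.
ROWS = [
    ["For the sauce:"],       # sub-heading row: a single cell
    ["10g", "dried herbs"],
    ["1", "garlic clove"],
    [],                       # spacer row: no cells at all
    ["For the pasta:"],       # second sub-heading row
    ["300g", "dried pasta"],
]


def parse_naive(rows):
    """Assume every row has an amount cell and a name cell."""
    return [f"{row[0]} {row[1]}" for row in rows]


def parse_skipping_headers(rows):
    """Keep only rows that have both an amount cell and a name cell."""
    return [f"{row[0]} {row[1]}" for row in rows if len(row) >= 2]


if __name__ == "__main__":
    try:
        parse_naive(ROWS)
    except IndexError as exc:
        # Indexing row[1] on a one-cell heading row raises IndexError.
        print("naive parser failed:", exc)

    print(parse_skipping_headers(ROWS))
```

The naive version fails on the first sub-heading row because it indexes a second cell that does not exist, which is one plausible way to get `IndexError: list index out of range` on pages with sub-headings. Filtering on the number of cells is only one possible way to recognise headers and spacers; the real fix may identify them differently.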