Madewithlau.com recipe support

hhursev / recipe-scrapers

Python package for scraping recipes data

MIT License


Madewithlau.com recipe support #1015

Closed: kdliu86 closed this issue 5 months ago

kdliu86 commented 6 months ago

recipe-scrapers currently cannot scrape madewithlau.com. Example recipe URL:

https://www.madewithlau.com/recipes/siu-yuk-crispy-pork-belly

Returned scraper data for reference:

`{ 'author': None, 'canonical_url': 'https://www.madewithlau.com/recipes/siu-yuk-crispy-pork-belly', 'category': None, 'host': 'madewithlau.com', 'image': None, 'ingredient_groups': [{'ingredients': [], 'purpose': None}], 'ingredients': [], 'instructions': '', 'instructions_list': [], 'site_name': None }`

mlduff commented 5 months ago

It looks like madewithlau is now using a tRPC endpoint to get the recipe data after page load. An example of this is https://www.madewithlau.com/api/trpc/recipe.bySlug?batch=1&input={%220%22:{%22json%22:{%22slug%22:%22salted-fish-chicken-fried-rice%22}}}
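For illustration, here is a rough sketch of how the endpoint URL above could be built from a recipe page URL. The helper names (`slug_from_recipe_url`, `build_trpc_url`, `extract_recipe`) are hypothetical and not part of recipe-scrapers. The response envelope assumed in `extract_recipe` (a batch list with `result.data.json`) is a guess based on common tRPC setups and has not been verified against the live site.

```python
import json
from urllib.parse import quote, urlsplit

# Endpoint taken from the example URL in this comment.
TRPC_ENDPOINT = "https://www.madewithlau.com/api/trpc/recipe.bySlug"


def slug_from_recipe_url(recipe_url: str) -> str:
    """Return the last path segment of a recipe page URL (hypothetical helper)."""
    path = urlsplit(recipe_url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


def build_trpc_url(slug: str) -> str:
    """Build the batched tRPC query URL for a given recipe slug."""
    # Same payload shape as the example URL: {"0": {"json": {"slug": ...}}}
    payload = {"0": {"json": {"slug": slug}}}
    # Percent-encode the whole JSON document; the example URL leaves braces
    # and colons unencoded, but the two forms decode to the same input.
    encoded = quote(json.dumps(payload, separators=(",", ":")), safe="")
    return f"{TRPC_ENDPOINT}?batch=1&input={encoded}"


def extract_recipe(batch_response: list) -> dict:
    """Pull the recipe payload out of a batched response.

    ASSUMPTION: the response is a list with one entry per batched call,
    shaped like {"result": {"data": {"json": {...}}}}.
    """
    return batch_response[0]["result"]["data"]["json"]
```

A scraper rewrite could then fetch `build_trpc_url(slug_from_recipe_url(page_url))` and map fields from the returned JSON, rather than parsing the page HTML.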

I think this will require a rewrite of the scraper. I'm happy to do this in a similar manner to how I tackled bergamot in https://github.com/hhursev/recipe-scrapers/pull/1064

Do these findings and this approach sound reasonable, @jayaddison?

jayaddison commented 5 months ago

That seems fine to me @mlduff, yep - please note that I'm going to be a little slow responding to code reviews for a few days (there are a few existing pull requests that I have not yet reviewed).