.. image:: https://img.shields.io/github/stars/hhursev/recipe-scrapers?style=social
    :target: https://github.com/hhursev/recipe-scrapers/
    :alt: Github
.. image:: https://img.shields.io/pypi/v/recipe-scrapers.svg?
    :target: https://pypi.org/project/recipe-scrapers/
    :alt: Version
.. image:: https://img.shields.io/pypi/pyversions/recipe-scrapers
    :target: https://pypi.org/project/recipe-scrapers/
    :alt: PyPI - Python Version
.. image:: https://pepy.tech/badge/recipe-scrapers
    :target: https://pepy.tech/project/recipe-scrapers
    :alt: Downloads
.. image:: https://github.com/hhursev/recipe-scrapers/workflows/unittests/badge.svg?branch=main
    :target: https://github.com/hhursev/recipe-scrapers/actions/
    :alt: GitHub Actions Unittests
.. image:: https://coveralls.io/repos/hhursev/recipe-scraper/badge.svg?branch=main&service=github
    :target: https://coveralls.io/github/hhursev/recipe-scraper?branch=main
    :alt: Coveralls
.. image:: https://img.shields.io/github/license/hhursev/recipe-scrapers?
    :target: https://github.com/hhursev/recipe-scrapers/blob/main/LICENSE
    :alt: License
.. image:: https://app.codacy.com/project/badge/Grade/3ee8da77aaa3475a8085ca22287dea89
    :target: https://app.codacy.com/gh/hhursev/recipe-scrapers/dashboard
    :alt: Codacy Badge

A simple web scraping tool for recipe sites.

.. code:: shell

    pip install recipe-scrapers

then:

.. code:: python

    from recipe_scrapers import scrape_me

    scraper = scrape_me('https://www.allrecipes.com/recipe/158968/spinach-and-feta-turkey-burgers/')

    # Q: What if the recipe site I want to extract information from is not listed below?
    # A: You can give it a try with the wild_mode option! If there is Schema/Recipe available it will work just fine.
    scraper = scrape_me('https://www.feastingathome.com/tomato-risotto/', wild_mode=True)

    scraper.host()
    scraper.title()
    scraper.total_time()
    scraper.image()
    scraper.ingredients()
    scraper.ingredient_groups()
    scraper.instructions()
    scraper.instructions_list()
    scraper.yields()
    scraper.to_json()
    scraper.links()
    scraper.nutrients()  # not always available
    scraper.canonical_url()  # not always available
    scraper.equipment()  # not always available
    scraper.cooking_method()  # not always available
    scraper.keywords()  # not always available
    scraper.dietary_restrictions()  # not always available

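
Since several of the methods above are marked as not always available, it can help to call them defensively. The sketch below is only an illustration and not part of the library: ``collect_fields`` and ``StubScraper`` are hypothetical names, and it assumes that an unavailable field shows up as a raised exception, which may differ between scrapers and versions.

```python
from typing import Any, Dict, Iterable

# Method names taken from the usage example; the helper itself is hypothetical.
OPTIONAL_FIELDS = (
    "nutrients",
    "canonical_url",
    "equipment",
    "cooking_method",
    "keywords",
    "dietary_restrictions",
)


def collect_fields(scraper: Any, names: Iterable[str] = OPTIONAL_FIELDS) -> Dict[str, Any]:
    """Call each named method and store None when it cannot be resolved."""
    result: Dict[str, Any] = {}
    for name in names:
        try:
            result[name] = getattr(scraper, name)()
        except Exception:  # assumption: a missing field raises some exception
            result[name] = None
    return result


class StubScraper:
    """Stand-in object used only to demonstrate the helper offline."""

    def keywords(self):
        return ["burger", "turkey"]

    def nutrients(self):
        raise NotImplementedError("not provided by this site")


fields = collect_fields(StubScraper(), names=("keywords", "nutrients", "equipment"))
print(fields)
```

With a real scraper you would pass the object returned by ``scrape_me`` instead of the stub. Catching a narrower exception type is preferable once you know what your installed version raises.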

You also have an option to scrape html-like content:

.. code:: python

    import requests
    from recipe_scrapers import scrape_html

    url = "https://www.allrecipes.com/recipe/158968/spinach-and-feta-turkey-burgers/"
    html = requests.get(url).content

    scraper = scrape_html(html=html, org_url=url)

    scraper.title()
    scraper.total_time()
    # etc...


Notes:

- ``scraper.links()`` returns a list of dictionaries containing all of the ``<a>`` tag attributes. The attribute names are the dictionary keys.

Some Python HTTP clients that you can use to retrieve HTML include `requests <https://pypi.org/project/requests/>`_ and `httpx <https://pypi.org/project/httpx/>`_. Please refer to their documentation to find out what options (timeout configuration, proxy support, etc.) are available.

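
As a small illustration of the ``scraper.links()`` note above, the following sketch filters a list of attribute dictionaries down to absolute link targets. The sample data is made up to match the documented shape (one dictionary per tag, keyed by attribute name), and ``absolute_hrefs`` is a hypothetical helper, not a library function.

```python
from typing import Any, Dict, List

# Made-up data shaped like the documented return value of scraper.links().
sample_links: List[Dict[str, Any]] = [
    {"href": "https://example.com/recipes/soup", "class": ["related"]},
    {"href": "/about", "title": "About us"},
    {"name": "top"},  # an anchor without an href attribute
]


def absolute_hrefs(links: List[Dict[str, Any]]) -> List[str]:
    """Return http(s) href values, skipping relative links and bare anchors."""
    hrefs = []
    for attrs in links:
        href = attrs.get("href")
        if isinstance(href, str) and href.startswith(("http://", "https://")):
            hrefs.append(href)
    return hrefs


print(absolute_hrefs(sample_links))
```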

Scrapers available for:

- `https://101cookbooks.com/ <https://101cookbooks.com/>`_
- `https://15gram.be <https://15gram.be>`_
- `https://www.750g.com <https://www.750g.com>`_
- `https://aberlehome.com/ <https://aberlehome.com>`_
- `https://abuelascounter.com/ <https://abuelascounter.com>`_
- `https://www.acouplecooks.com <https://acouplecooks.com/>`_
- `https://addapinch.com/ <https://addapinch.com/>`_
- `http://www.afghankitchenrecipes.com/ <http://www.afghankitchenrecipes.com/>`_
- `https://aflavorjournal.com/ <https://aflavorjournal.com/>`_
- `https://ah.nl/ <https://ah.nl/>`_
- `https://akispetretzikis.com/ <https://akispetretzikis.com/>`_
- `https://aldi.com.au/ <https://aldi.com.au/>`_
- `https://alexandracooks.com/ <https://alexandracooks.com/>`_
- `https://alittlebityummy.com/ <https://alittlebityummy.com/>`_
- `https://allrecipes.com/ <https://allrecipes.com/>`_
- `https://allthehealthythings.com/ <https://allthehealthythings.com/>`_
- `https://alltommat.se/ <https://alltommat.se/>`_
- `https://altonbrown.com/ <https://altonbrown.com/>`_
- `https://amazingribs.com/ <https://amazingribs.com/>`_
- `https://ambitiouskitchen.com/ <https://ambitiouskitchen.com>`_
- `https://archanaskitchen.com/ <https://archanaskitchen.com/>`_
- `https://www.argiro.gr/ <https://www.argiro.gr/>`_
- `https://www.arla.se/ <https://www.arla.se/>`_
- `https://www.atelierdeschefs.fr/ <https://www.atelierdeschefs.fr/>`_
- `https://averiecooks.com/ <https://www.averiecooks.com/>`_
- `https://www.bakels.com.au/ <https://www.bakels.com.au/>`_
- `https://baking-sense.com/ <https://baking-sense.com/>`_
- `https://bakingmischief.com/ <https://bakingmischief.com/>`_
- `https://barefeetinthekitchen.com/ <https://barefeetinthekitchen.com/>`_
- `https://barefootcontessa.com/ <https://barefootcontessa.com>`_
- `https://bbc.com/ <https://bbc.com/food/recipes>`_

  - `.co.uk <https://bbc.co.uk/food/recipes>`__

- `https://bbcgoodfood.com/ <https://bbcgoodfood.com>`_
- `https://bestrecipes.com.au/ <https://bestrecipes.com.au>`_
- `https://bettybossi.ch/ <https://bettybossi.ch>`_
- `https://bettycrocker.com/ <https://bettycrocker.com>`_
- `https://biancazapatka.com/ <https://biancazapatka.com>`_
- `https://bigoven.com/ <https://bigoven.com>`_
- `https://blueapron.com/ <https://blueapron.com>`_
- `https://bluejeanchef.com/ <https://bluejeanchef.com/>`_
- `https://www.bodybuilding.com/ <https://www.bodybuilding.com/>`_
- `https://bonappetit.com/ <https://bonappetit.com>`_
- `https://bongeats.com/ <https://bongeats.com/>`_
- `https://bowlofdelicious.com/ <https://bowlofdelicious.com/>`_
- `https://breadtopia.com/ <https://breadtopia.com/>`_
- `https://briceletbaklava.ch/ <https://briceletbaklava.ch/>`_
- `https://budgetbytes.com/ <https://budgetbytes.com>`_
- `https://cafedelites.com/ <https://cafedelites.com/>`_
- `https://carlsbadcravings.com/ <https://carlsbadcravings.com/>`_
- `https://castironketo.net/ <https://castironketo.net/>`_
- `https://cdkitchen.com/ <https://cdkitchen.com/>`_
- `https://chefkoch.de/ <https://chefkoch.de>`_
- `https://www.chefnini.com/ <https://www.chefnini.com/>`_
- `https://chefsavvy.com/ <https://chefsavvy.com/>`_
- `https://claudia.abril.com.br/ <https://claudia.abril.com.br>`_
- `https://closetcooking.com/ <https://closetcooking.com>`_
- `https://comidinhasdochef.com/ <https://comidinhasdochef.com/>`_
- `https://cook-talk.com/ <https://cook-talk.com/>`_
- `https://cookeatshare.com/ <https://cookeatshare.com/>`_
- `https://cookieandkate.com/ <https://cookieandkate.com/>`_
- `https://cooking.nytimes.com/ <https://cooking.nytimes.com>`_
- `https://cookingcircle.com/ <https://cookingcircle.com/>`_
- `https://cookinglight.com/ <https://cookinglight.com/>`_
- `https://cookpad.com/ <https://cookpad.com/>`_
- `https://www.coop.se/ <https://www.coop.se/>`_
- `https://copykat.com/ <https://copykat.com>`_
- `https://www.costco.com/ <https://www.costco.com>`_
- `https://countryliving.com/ <https://countryliving.com>`_
- `https://creativecanning.com/ <https://creativecanning.com>`_
- `https://cucchiaio.it/ <https://cucchiaio.it>`_
- `https://cuisineaz.com/ <https://cuisineaz.com>`_
- `https://cybercook.com.br/ <https://cybercook.com.br/>`_
- `https://damndelicious.net/ <https://damndelicious.net/>`_
- `https://www.davidlebovitz.com/ <https://www.davidlebovitz.com/>`_
- `https://delish.com/ <https://delish.com>`_
- `https://dinneratthezoo.com/ <https://dinneratthezoo.com>`_
- `https://dinnerthendessert.com/ <https://dinnerthendessert.com/>`_
- `https://dish.co.nz/ <https://dish.co.nz>`_
- `https://domesticate-me.com/ <https://domesticate-me.com/>`_
- `https://downshiftology.com/ <https://downshiftology.com/>`_
- `https://www.dr.dk/ <https://www.dr.dk/>`_
- `https://www.eatingbirdfood.com/ <https://www.eatingbirdfood.com>`_
- `https://www.eatingwell.com/ <https://www.eatingwell.com>`_
- `https://www.eatliverun.com/ <https://www.eatliverun.com/>`_
- `https://eatsmarter.com/ <https://eatsmarter.com/>`_

  - `.de <https://eatsmarter.de/>`__

- `https://eattolerant.de/ <https://eattolerant.de/>`_
- `https://www.eatwell101.com <https://www.eatwell101.com>`_
- `https://eatwhattonight.com/ <https://eatwhattonight.com/>`_
- `https://elavegan.com/ <https://elavegan.com/>`_
- `https://emmikochteinfach.de/ <https://emmikochteinfach.de/>`_
- `https://en.wikibooks.org/ <https://en.wikibooks.org>`_
- `https://epicurious.com/ <https://epicurious.com>`_
- `https://www.errenskitchen.com/ <https://www.errenskitchen.com/>`_
- `https://ethanchlebowski.com/ <https://ethanchlebowski.com>`_
- `https://www.evolvingtable.com/ <https://www.evolvingtable.com/>`_
- `https://www.familyfoodonthetable.com/ <https://www.familyfoodonthetable.com/>`_
- `https://www.farmhouseonboone.com/ <https://www.farmhouseonboone.com/>`_
- `https://www.fattoincasadabenedetta.it/ <https://www.fattoincasadabenedetta.it/>`_
- `https://felix.kitchen <https://felix.kitchen>`_
- `https://fifteenspatulas.com/ <https://www.fifteenspatulas.com/>`_
- `https://finedininglovers.com/ <https://www.finedininglovers.com>`_
- `https://fitmencook.com/ <https://www.fitmencook.com>`_
- `https://fitslowcookerqueen.com <https://fitslowcookerqueen.com/>`_
- `https://food.com/ <https://www.food.com>`_
- `https://food52.com/ <https://www.food52.com>`_
- `https://foodandwine.com/ <https://www.foodandwine.com>`_
- `https://foodfidelity.com/ <https://foodfidelity.com>`_
- `https://foodnetwork.co.uk/ <https://www.foodnetwork.co.uk>`_

  - `.com <https://www.foodnetwork.com>`__

- `https://foodrepublic.com/ <https://foodrepublic.com>`_
- `https://www.forksoverknives.com/ <https://www.forksoverknives.com/>`_
- `https://forktospoon.com/ <https://forktospoon.com/>`_
- `https://franzoesischkochen.de/ <https://franzoesischkochen.de/>`_
- `https://www.gesund-aktiv.com/ <https://www.gesund-aktiv.com>`_
- `https://gimmesomeoven.com/ <https://www.gimmesomeoven.com/>`_
- `https://godt.no/ <https://godt.no/>`_
- `https://gonnawantseconds.com/ <https://gonnawantseconds.com>`_
- `https://goodfooddiscoveries.com/ <https://goodfooddiscoveries.com/>`_
- `https://goodhousekeeping.com/ <https://www.goodhousekeeping.com/>`_
- `https://gourmettraveller.com.au/ <https://gourmettraveller.com.au>`_
- `https://gousto.co.uk/ <https://gousto.co.uk>`_
- `https://www.grandfrais.com/ <https://www.grandfrais.com>`_
- `https://greatbritishchefs.com/ <https://greatbritishchefs.com>`_
- `https://grimgrains.com/ <https://grimgrains.com>`_
- `http://www.grouprecipes.com/ <http://www.grouprecipes.com/>`_
- `https://halfbakedharvest.com/ <https://www.halfbakedharvest.com/>`_
- `https://handletheheat.com/ <https://handletheheat.com/>`_
- `https://www.hassanchef.com/ <https://www.hassanchef.com/>`_
- `https://headbangerskitchen.com/ <https://www.headbangerskitchen.com/>`_
- `https://healthyeating.nhlbi.nih.gov/ <https://healthyeating.nhlbi.nih.gov>`_
- `https://heatherchristo.com/ <https://heatherchristo.com/>`_
- `https://www.heb.com/ <https://www.heb.com/recipe/landing>`_
- `https://hellofresh.com/ <https://hellofresh.com>`_

  - `.at <https://www.hellofresh.at/>`__, `.be <https://www.hellofresh.be/>`__,
    `.ca <https://www.hellofresh.ca/>`__, `.ch <https://www.hellofresh.ch/>`__,
    `.co.nz <https://www.hellofresh.co.nz/>`__, `.co.uk <https://hellofresh.co.uk>`__,
    `.com.au <https://www.hellofresh.com.au/>`__, `.de <https://www.hellofresh.de/>`__,
    `.dk <https://www.hellofresh.dk/>`__, `.es <https://www.hellofresh.es/>`__,
    `.fr <https://www.hellofresh.fr/>`__, `.ie <https://www.hellofresh.ie/>`__,
    `.it <https://www.hellofresh.it/>`__, `.lu <https://www.hellofresh.lu/>`__,
    `.nl <https://www.hellofresh.nl/>`__, `.no <https://www.hellofresh.no/>`__,
    `.se <https://www.hellofresh.se/>`__

- `https://www.hersheyland.com/ <https://www.hersheyland.com/>`_
- `https://www.homechef.com/ <https://www.homechef.com/>`_
- `https://hostthetoast.com/ <https://hostthetoast.com/>`_
- `https://www.ica.se/ <https://www.ica.se/>`_
- `https://www.im-worthy.com/ <https://www.im-worthy.com>`_
- `https://inbloombakery.com/ <https://inbloombakery.com/>`_
- `https://indianhealthyrecipes.com <https://www.indianhealthyrecipes.com>`_
- `https://www.innit.com/ <https://www.innit.com/>`_
- `https://insanelygoodrecipes.com <https://insanelygoodrecipes.com/>`_
- `https://inspiralized.com/ <https://inspiralized.com>`_
- `https://izzycooking.com/ <https://izzycooking.com/>`_
- `https://jamieoliver.com/ <https://jamieoliver.com>`_
- `https://jimcooksfoodgood.com/ <https://jimcooksfoodgood.com/>`_
- `https://www.jocooks.com/ <https://www.jocooks.com>`_
- `https://joshuaweissman.com/ <https://joshuaweissman.com/>`_
- `https://joyfoodsunshine.com/ <https://joyfoodsunshine.com>`_
- `https://joythebaker.com/ <https://joythebaker.com>`_
- `https://juliegoodwin.com.au/ <https://juliegoodwin.com.au>`_
- `https://justataste.com/ <https://justataste.com>`_
- `https://justbento.com/ <https://justbento.com>`_
- `https://www.justonecookbook.com/ <https://www.justonecookbook.com>`_
- `https://kennymcgovern.com/ <https://kennymcgovern.com>`_
- `https://keukenliefde.nl/ <https://keukenliefde.nl>`_
- `https://www.kingarthurbaking.com <https://www.kingarthurbaking.com>`_
- `https://kitchenaid.com.au/ <https://kitchenaid.com.au/blogs/kitchenthusiast/tagged/blog-category-recipes>`_
- `https://www.kitchensanctuary.com/ <https://www.kitchensanctuary.com>`_
- `https://www.kitchenstories.com/ <https://www.kitchenstories.com>`_
- `https://kochbar.de/ <https://kochbar.de>`_
- `https://kochbucher.com/ <https://kochbucher.com/>`_
- `http://koket.se/ <http://koket.se>`_
- `https://kristineskitchenblog.com/ <https://kristineskitchenblog.com>`_
- `https://kuchnia-domowa.pl/ <https://www.kuchnia-domowa.pl/>`_
- `https://kuchynalidla.sk/ <https://www.kuchynalidla.sk/>`_
- `https://www.kwestiasmaku.com/ <https://www.kwestiasmaku.com/>`_
- `https://www.latelierderoxane.com <https://www.latelierderoxane.com/blog/recettes/>`_
- `https://leanandgreenrecipes.net <https://leanandgreenrecipes.net>`_
- `https://www.lecker.de <https://www.lecker.de/rezepte>`_
- `https://lecremedelacrumb.com/ <https://lecremedelacrumb.com/>`_
- `https://lekkerensimpel.com <https://lekkerensimpel.com>`_
- `https://leukerecepten.nl/ <https://www.leukerecepten.nl>`_
- `https://lifestyleofafoodie.com <https://lifestyleofafoodie.com>`_
- `https://littlespicejar.com/ <https://littlespicejar.com>`_
- `http://livelytable.com/ <http://livelytable.com/>`_
- `https://lovingitvegan.com/ <https://lovingitvegan.com/>`_
- `https://www.maangchi.com <https://www.maangchi.com>`_
- `https://madensverden.dk/ <https://madensverden.dk/>`_
- `https://www.madewithlau.com/ <https://www.madewithlau.com/>`_
- `https://madsvin.com/ <https://madsvin.com/>`_
- `https://marleyspoon.com/ <https://marleyspoon.com/>`_

  - `.at <https://marleyspoon.at/>`__, `.be <https://marleyspoon.be/>`__,
    `.com.au <https://marleyspoon.com.au/>`__, `.de <https://marleyspoon.de/>`__,
    `.nl <https://marleyspoon.nl/>`__, `.se <https://marleyspoon.se/>`__

- `https://marmiton.org/ <https://marmiton.org/>`_
- `https://www.marthastewart.com/ <https://www.marthastewart.com/>`_
- `https://matprat.no/ <https://matprat.no/>`_
- `https://www.mccormick.com/ <https://www.mccormick.com/>`_
- `https://meljoulwan.com/ <https://meljoulwan.com/>`_
- `https://www.melskitchencafe.com/ <https://www.melskitchencafe.com/>`_
- `http://mindmegette.hu/ <http://mindmegette.hu/>`_
- `https://minimalistbaker.com/ <https://minimalistbaker.com/>`_
- `https://ministryofcurry.com/ <https://ministryofcurry.com/>`_
- `https://misya.info/ <https://misya.info>`_
- `https://www.mob.co.uk/ <https://www.mob.co.uk/>`_
- `https://mobile.kptncook.com/ <https://mobile.kptncook.com/>`_
- `https://mobkitchen.co.uk/ <https://mobkitchen.co.uk/>`_
- `https://www.modernhoney.com/ <https://www.modernhoney.com/>`_
- `https://www.momontimeout.com/ <https://www.momontimeout.com/>`_
- `https://momswithcrockpots.com/ <https://momswithcrockpots.com>`_
- `https://monsieur-cuisine.com/ <https://monsieur-cuisine.com>`_
- `http://motherthyme.com/ <http://motherthyme.com/>`_
- `https://www.moulinex.fr/ <https://www.moulinex.fr/>`_
- `https://www.mundodereceitasbimby.com.pt/ <https://www.mundodereceitasbimby.com.pt/>`_
- `https://mybakingaddiction.com/ <https://mybakingaddiction.com>`_
- `https://myjewishlearning.com/ <https://myjewishlearning.com>`_
- `https://mykitchen101.com/ <https://mykitchen101.com>`_
- `https://mykitchen101en.com/ <https://mykitchen101en.com>`_
- `https://mykoreankitchen.com/ <https://mykoreankitchen.com>`_
- `https://www.myplate.gov/ <https://www.myplate.gov/>`_
- `https://myrecipes.com/ <https://myrecipes.com>`_
- `https://www.nhs.uk/healthier-families/ <https://www.nhs.uk/healthier-families/>`_
- `https://nibbledish.com/ <https://nibbledish.com>`_
- `https://norecipes.com/ <https://norecipes.com/>`_
- `https://www.notenoughcinnamon.com/ <https://www.notenoughcinnamon.com/>`_
- `https://nourishedbynutrition.com/ <https://nourishedbynutrition.com/>`_
- `https://www.nrk.no/ <https://www.nrk.no/>`_
- `https://www.number-2-pencil.com/ <https://www.number-2-pencil.com/>`_
- `https://nutritionbynathalie.com/blog <https://nutritionbynathalie.com/blog>`_
- `https://nutritionfacts.org/ <https://nutritionfacts.org/>`_
- `https://ohsheglows.com/ <https://ohsheglows.com>`_
- `https://omnivorescookbook.com <https://omnivorescookbook.com>`_
- `https://www.onceuponachef.com <https://www.onceuponachef.com>`_
- `https://onesweetappetite.com/ <https://onesweetappetite.com>`_
- `https://owen-han.com/ <https://owen-han.com>`_
- `https://www.paleorunningmomma.com/ <https://www.paleorunningmomma.com>`_
- `https://www.panelinha.com.br/ <https://www.panelinha.com.br>`_
- `https://paninihappy.com/ <https://paninihappy.com>`_
- `https://www.persnicketyplates.com/ <https://www.persnicketyplates.com/>`_
- `https://www.pickuplimes.com/ <https://www.pickuplimes.com/>`_
- `https://pinchofyum.com/ <https://pinchofyum.com/>`_
- `https://www.pingodoce.pt/ <https://www.pingodoce.pt>`_
- `https://pinkowlkitchen.com/ <https://pinkowlkitchen.com/>`_
- `https://www.platingpixels.com/ <https://www.platingpixels.com/>`_
- `https://plowingthroughlife.com/ <https://plowingthroughlife.com/>`_
- `https://popsugar.com/ <https://popsugar.com>`_
- `https://potatorolls.com/ <https://potatorolls.com/>`_
- `https://practicalselfreliance.com/ <https://practicalselfreliance.com>`_
- `https://pressureluckcooking.com/ <https://pressureluckcooking.com/>`_
- `https://www.primaledgehealth.com/ <https://www.primaledgehealth.com/>`_
- `https://www.projectgezond.nl/ <https://www.projectgezond.nl/>`_
- `https://przepisy.pl/ <https://przepisy.pl>`_
- `https://purelypope.com/ <https://purelypope.com>`_
- `https://purplecarrot.com/ <https://purplecarrot.com>`_
- `https://rachlmansfield.com/ <https://rachlmansfield.com>`_
- `https://rainbowplantlife.com/ <https://rainbowplantlife.com/>`_
- `https://realfood.tesco.com/ <https://realfood.tesco.com>`_
- `https://realsimple.com/ <https://www.realsimple.com>`_
- `https://receitas.globo.com/ <https://www.receitas.globo.com/>`_
- `https://receitas.ig.com.br/ <https://receitas.ig.com.br>`_
- `https://www.receitasnestle.com.br <https://www.receitasnestle.com.br>`_
- `https://recept.se/ <https://recept.se/>`_
- `https://www.recipegirl.com/ <https://www.recipegirl.com/>`_
- `https://reciperunner.com/ <https://www.reciperunner.com>`_
- `https://recipes.farmhousedelivery.com/ <https://recipes.farmhousedelivery.com/>`_
- `https://recipes.timesofindia.com/ <https://recipes.timesofindia.com/>`_
- `https://recipetineats.com/ <https://www.recipetineats.com/>`_
- `https://redhousespice.com/ <https://redhousespice.com/>`_
- `https://reishunger.de/ <https://www.reishunger.de/>`_
- `https://rezeptwelt.de/ <https://rezeptwelt.de>`_
- `https://ricetta.it/ <https://ricetta.it>`_
- `https://ricette.giallozafferano.it/ <https://ricette.giallozafferano.it>`_
- `https://www.ricetteperbimby.it/ <https://www.ricetteperbimby.it/>`_
- `https://rosannapansino.com <https://rosannapansino.com>`_
- `https://rutgerbakt.nl/ <https://rutgerbakt.nl/>`_
- `https://www.saboresajinomoto.com.br/ <https://www.saboresajinomoto.com.br/>`_
- `https://sallys-blog.de <https://sallys-blog.de/>`_
- `https://sallysbakingaddiction.com <https://sallysbakingaddiction.com/>`_
- `https://saltpepperskillet.com/ <https://saltpepperskillet.com/>`_
- `https://www.saveur.com/ <https://www.saveur.com/>`_
- `https://www.savorynothings.com/ <https://www.savorynothings.com/>`_
- `https://seriouseats.com/ <https://seriouseats.com>`_
- `https://sharing.kptncook.com/ <https://sharing.kptncook.com/>`_
- `https://simple-veganista.com/ <https://simple-veganista.com/>`_
- `https://simply-cookit.com/ <https://simply-cookit.com>`_
- `https://simplyquinoa.com/ <https://simplyquinoa.com>`_
- `https://simplyrecipes.com/ <https://simplyrecipes.com>`_
- `https://simplywhisked.com/ <https://simplywhisked.com>`_
- `https://skinnytaste.com/ <https://www.skinnytaste.com>`_
- `https://smulweb.nl/ <https://smulweb.nl>`_
- `https://sobors.hu/ <https://sobors.hu>`_
- `https://www.southerncastiron.com/ <https://www.southerncastiron.com>`_
- `https://southernliving.com/ <https://southernliving.com/>`_
- `https://spendwithpennies.com/ <https://spendwithpennies.com/>`_
- `https://www.springlane.de <https://www.springlane.de>`_
- `https://www.staysnatched.com/ <https://www.staysnatched.com/>`_
- `https://steamykitchen.com/ <https://steamykitchen.com>`_
- `https://streetkitchen.hu/ <https://streetkitchen.hu>`_
- `https://www.strongrfastr.com <https://www.strongrfastr.com>`_
- `https://sunbasket.com/ <https://sunbasket.com>`_
- `https://sundpaabudget.dk/ <https://sundpaabudget.dk>`_
- `https://www.sunset.com/ <https://www.sunset.com/>`_
- `https://sweetcsdesigns.com/ <https://www.sweetcsdesigns.com/>`_
- `https://sweetpeasandsaffron.com/ <https://sweetpeasandsaffron.com/>`_
- `https://www.taste.com.au/ <https://www.taste.com.au/>`_
- `https://www.tasteatlas.com/ <https://www.tasteatlas.com/>`_
- `https://tasteofhome.com <https://tasteofhome.com>`_
- `https://tastesbetterfromscratch.com <https://tastesbetterfromscratch.com>`_
- `https://tastesoflizzyt.com <https://tastesoflizzyt.com>`_
- `https://tasty.co <https://tasty.co>`_
- `https://tastykitchen.com/ <https://tastykitchen.com>`_
- `https://theclevercarrot.com/ <https://theclevercarrot.com>`_
- `https://www.thecookierookie.com/ <https://www.thecookierookie.com/>`_
- `https://thecookingguy.com/ <https://thecookingguy.com>`_
- `https://theexpertguides.com/ <https://theexpertguides.com>`_
- `https://thehappyfoodie.co.uk/ <https://thehappyfoodie.co.uk>`_
- `https://thekitchencommunity.org/ <https://thekitchencommunity.org/>`_
- `https://www.thekitchenmagpie.com/ <https://www.thekitchenmagpie.com>`_
- `https://thekitchn.com/ <https://thekitchn.com/>`_
- `https://www.themagicalslowcooker.com/ <https://www.themagicalslowcooker.com/>`_
- `https://themodernproper.com/ <https://themodernproper.com/>`_
- `https://www.thepalatablelife.com <https://www.thepalatablelife.com/>`_
- `https://thepioneerwoman.com/ <https://thepioneerwoman.com>`_
- `https://therecipecritic.com/ <https://therecipecritic.com>`_
- `https://thesaltymarshmallow.com/ <https://thesaltymarshmallow.com/>`_
- `https://thespruceeats.com/ <https://thespruceeats.com/>`_
- `https://thevintagemixer.com/ <https://thevintagemixer.com>`_
- `https://thewoksoflife.com/ <https://thewoksoflife.com/>`_
- `https://thinlicious.com/ <https://thinlicious.com/>`_
- `https://tidymom.net <https://tidymom.net>`_
- `https://tine.no/ <https://tine.no>`_
- `https://tofoo.co.uk <https://tofoo.co.uk>`_
- `https://tudogostoso.com.br/ <https://www.tudogostoso.com.br/>`_
- `https://twopeasandtheirpod.com/ <http://twopeasandtheirpod.com>`_
- `https://uitpaulineskeuken.nl/ <https://uitpaulineskeuken.nl>`_
- `https://unsophisticook.com/ <https://unsophisticook.com/>`_
- `https://usapears.org/ <https://usapears.org>`_
- `https://www.valdemarsro.dk/ <https://www.valdemarsro.dk/>`_
- `https://vanillaandbean.com/ <https://vanillaandbean.com>`_
- `https://www.vegetarbloggen.no/ <https://www.vegetarbloggen.no/>`_
- `https://vegolosi.it/ <https://vegolosi.it>`_
- `https://vegrecipesofindia.com/ <https://www.vegrecipesofindia.com/>`_
- `https://www.waitrose.com/ <https://www.waitrose.com/>`_
- `https://watchwhatueat.com/ <https://watchwhatueat.com/>`_
- `https://wearenotmartha.com/ <https://wearenotmartha.com/>`_
- `https://www.weightwatchers.com/ <https://www.weightwatchers.com/>`_ (*)
- `https://www.wellplated.com/ <https://www.wellplated.com/>`_
- `https://whatsgabycooking.com/ <https://whatsgabycooking.com>`_
- `https://whole30.com/ <https://whole30.com/>`_
- `https://www.wholefoodsmarket.com/ <https://www.wholefoodsmarket.com/>`_

  - `.co.uk <https://www.wholefoodsmarket.co.uk/>`__

- `https://www.williams-sonoma.com/ <https://www.williams-sonoma.com/>`_
- `https://womensweeklyfood.com.au/ <https://womensweeklyfood.com.au/>`_
- `https://woolworths.com.au/shop/recipes <https://www.woolworths.com.au/shop/recipes/>`_
- `https://woop.co.nz/ <https://woop.co.nz/>`_
- `https://yemek.com/ <https://yemek.com>`_
- `https://yummly.com/ <https://yummly.com>`_ (*)
- `https://www.zaubertopf.de <https://www.zaubertopf.de>`_
- `https://zeit.de/ (wochenmarkt) <https://www.zeit.de/zeit-magazin/wochenmarkt/index>`_
- `https://zenbelly.com/ <https://zenbelly.com>`_

(*) offline saved files only


If you spot a design change (or something else) that makes the scraper unable to work for a given site, please open an issue asap.

If you are a programmer, PRs with fixes are warmly welcomed and acknowledged with a virtual beer. You can find documentation on how to develop scrapers `here <https://github.com/hhursev/recipe-scrapers/blob/main/docs/README.md>`__.

If you want a scraper for a site that is not listed above:

- Open an `Issue <https://github.com/hhursev/recipe-scraper/issues/new>`_ providing us the site name, as well as a recipe link from it.
- You are a developer and want to code the scraper on your own:

  - If `Schema is available <#faq>`_ on the site - `you can go like this. <https://github.com/hhursev/recipe-scrapers/pull/176>`_
  - Otherwise, scrape the HTML - `like this <https://github.com/hhursev/recipe-scrapers/commit/ffee963d04>`_
  - Generate a new scraper class:

    .. code:: shell

        python generate.py <ClassName> <URL>

    This also creates ``test_data`` to be used with the test class.

  You can find a more detailed guide `here <https://github.com/hhursev/recipe-scrapers/blob/main/docs/how-to-develop-scraper.md>`__.


Assuming you have ``>=python3.8`` installed, navigate to the directory where you want this project to live and run these lines:

.. code:: shell

    git clone git@github.com:hhursev/recipe-scrapers.git &&
    cd recipe-scrapers &&
    python -m venv .venv &&
    source .venv/bin/activate &&
    python -m pip install --upgrade pip &&
    pip install -r requirements-dev.txt &&
    pip install pre-commit &&
    pre-commit install &&
    python -m unittest

In case you want to run a single unittest for a newly developed scraper:

.. code:: shell

    python -m unittest -k <test_file_name>


To check whether a website has a Recipe Schema, run the following in a Python shell:

.. code:: python

    from recipe_scrapers import scrape_me

    scraper = scrape_me('<url of a recipe from the site>', wild_mode=True)
    # if no error is raised - there's schema available:
    scraper.title()
    scraper.instructions()  # etc.

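
If you already have a page saved locally, a rough offline check for a Recipe schema can be sketched with the standard library alone. This is an illustration under stated assumptions, not how the library detects schema: it only inspects JSON-LD ``<script>`` blocks, so a negative result does not rule out other markup such as microdata, and ``JsonLdCollector``, ``contains_recipe`` and ``has_json_ld_recipe`` are hypothetical names.

```python
import json
from html.parser import HTMLParser
from typing import Any, List


class JsonLdCollector(HTMLParser):
    """Collect the raw text of each <script type="application/ld+json"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks: List[str] = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._inside = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.blocks[-1] += data


def contains_recipe(node: Any) -> bool:
    """Walk decoded JSON looking for an object whose @type includes Recipe."""
    if isinstance(node, dict):
        types = node.get("@type")
        if types == "Recipe" or (isinstance(types, list) and "Recipe" in types):
            return True
        return any(contains_recipe(value) for value in node.values())
    if isinstance(node, list):
        return any(contains_recipe(item) for item in node)
    return False


def has_json_ld_recipe(html: str) -> bool:
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # ignore malformed blocks
        if contains_recipe(data):
            return True
    return False


# Made-up page content used only for the demonstration.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org",
 "@graph": [{"@type": "WebPage"}, {"@type": ["Recipe"], "name": "Soup"}]}
</script>
</head><body></body></html>
"""

print(has_json_ld_recipe(SAMPLE_HTML))
print(has_json_ld_recipe("<html><body>No structured data here</body></html>"))
```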
If you're using this library to collect large numbers of recipes from the web, please use the software responsibly and try to avoid creating high volumes of network traffic.
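
One simple way to keep traffic low is to enforce a minimum pause between consecutive requests. The sketch below is a generic pattern rather than a feature of this library: ``Throttle`` is a hypothetical helper, the interval is an arbitrary example value, and the loop only prints placeholder URLs instead of scraping them.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait()."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay


# Placeholder URLs; a real loop would call scrape_me(url) after each wait().
throttle = Throttle(min_interval=0.01)  # tiny value so the demo runs quickly
for url in ["https://example.com/recipe/1", "https://example.com/recipe/2"]:
    throttle.wait()
    print("would fetch", url)
```

In real use an interval of several seconds per site is a more considerate starting point than the demo value.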

Python's standard library provides a `robots.txt parser <https://docs.python.org/3/library/urllib.robotparser.html>`_ that may be helpful to automatically follow common instructions specified by websites for web crawlers.

Another parser option -- particularly if you find that many web requests from ``urllib.robotparser`` are blocked -- is the `robotexclusionrulesparser <https://pypi.org/project/robotexclusionrulesparser/>`_ library.

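
As a minimal sketch of the standard-library parser mentioned above, the example below parses a made-up ``robots.txt`` offline and asks whether two URLs may be fetched. The user agent name and the rules are assumptions chosen for illustration. Against a live site you would call ``set_url()`` and ``read()`` instead of ``parse()``.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content, parsed offline for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

USER_AGENT = "my-recipe-bot"  # hypothetical user agent name

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

allowed = parser.can_fetch(USER_AGENT, "https://example.com/recipe/123/")
blocked = parser.can_fetch(USER_AGENT, "https://example.com/private/drafts/")
delay = parser.crawl_delay(USER_AGENT)  # seconds requested by the site, or None

print(allowed, blocked, delay)
```

Checking ``can_fetch`` before each request and honoring ``crawl_delay`` when it is set covers the most common instructions sites give to crawlers.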

Thanks to all the `contributors <https://github.com/hhursev/recipe-scrapers/graphs/contributors>`_ who helped improve the package. You are awesome!

.. image:: https://contrib.rocks/image?repo=hhursev/recipe-scrapers
    :target: https://github.com/hhursev/recipe-scrapers/graphs/contributors


| You want to gather recipe data?
| You have an idea you want to implement?
| Check out our `"Share a project" wall <https://github.com/hhursev/recipe-scrapers/issues/9>`_ - it may save you time and spark ideas!