Closed bredowmax closed 1 year ago
I just submitted a pull request for sunbasket.
Did somebody look at dinnerly.com by any chance? I have the impression that they are loading the recipe data via JavaScript. When I look at the source code in the browser or try the
python3 generate.py Dinnerly
command, the source code doesn't contain any recipe data. Any idea how to get around that?
@webbastelbude Your suspicion is correct: Dinnerly loads its recipe data using JavaScript.
I used dryscrape to render the page and was able to get the content. The problem is that it requires Qt and QtWebKit. Apparently QtWebKit is end-of-life, which means dryscrape is not maintained. (The tutorial at the bottom of this tutorial told me about dryscrape.) The resulting HTML isn't in schema.org/Recipe format, but it could still be useful for a first draft of the driver.
I tried a few more similar libraries, but did not get results. Any Python package that renders JavaScript will most likely have some pretty heavy requirements.
thespruceeats.com already has a parser, but when its JavaScript is run, the page populates the LD+JSON schema data.
Here is the dryscrape code, in case you want to test this out with any URL that you suspect hides its data behind JavaScript. It writes the rendered page to a file named out.html:
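import dryscrape

# Any URL you suspect of loading its data through JavaScript
url = "https://www.thespruceeats.com/wonton-soup-5074586"
# Visit the page in a dryscrape session so its scripts run
sess = dryscrape.Session()
sess.visit(url)
source = sess.body()
# Save the rendered HTML for inspection
with open("out.html", "w") as fp:
    fp.write(source)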
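To check whether a rendered file such as out.html actually contains schema.org/Recipe data in LD+JSON form, something like the following rough sketch might help. It uses only the standard library. The names `JsonLdCollector` and `find_recipes` are made up for this example and are not part of this project or of dryscrape, and it assumes the recipe sits in a `<script type="application/ld+json">` block, either at the top level, in a list, or under `@graph`.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def find_recipes(html):
    """Return every JSON-LD object in `html` whose @type includes "Recipe"."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()

    recipes = []
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip blocks that are not valid JSON

        # The top level may be a single object or a list of objects.
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict):
                continue
            # Some sites nest their entities under "@graph".
            nodes = item.get("@graph", [item])
            if not isinstance(nodes, list):
                nodes = [nodes]
            for node in nodes:
                if not isinstance(node, dict):
                    continue
                types = node.get("@type", [])
                if isinstance(types, str):
                    types = [types]
                if isinstance(types, list) and "Recipe" in types:
                    recipes.append(node)
    return recipes
```

Reading out.html into a string and passing it to `find_recipes` should return an empty list when the page has no Recipe markup, which would suggest a custom driver is needed for that site.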
Homechef.com was added in #512. Marleyspoon.com was added in #534.
I submitted #535 for everyplate.com.
I submitted #578 for Chef's Plate. I didn't realize there was already an issue specifically for meal kits. Maybe it can be added to this list.
I love that you support HelloFresh! There are quite a few other meal kits that also expose their recipes - I would love if you could include them!
Example recipes from meal kits include:
https://www.purplecarrot.com/recipe/roasted-cauliflower-lentil-bowl-with-avocado-curried-balsamic-vinaigrette
https://cdn2.greenchef.com/uploaded/5f08a68ef6ec4700147ba8a1.pdf
https://marleyspoon.com/menu/52491-lemon-herb-chicken-with-garlicky-yogurt-green-beans
https://www.blueapron.com/recipes/bbq-chickpeas-farro-with-corn-cucumbers-hard-boiled-eggs-3
https://www.everyplate.com/recipes/garlic-rosemary-chicken-5efde75bfff7c66c36680eca?week=2020-W31
https://www.homechef.com/meals/steak-and-bacon-blue-cheese-butter
https://dinnerly.com/menu/50620-skillet-ravioli-lasagna-with-mozzarella-parmesan
https://sunbasket.com/protein/boneless-skinless-chicken-breast-strips
--- EDIT INTO CHECKLIST ---
[ ] everyplate.com (request now tracked in #535)