mealie-recipes / mealie

Mealie is a self-hosted recipe manager and meal planner with a REST API backend and a reactive frontend application built in Vue for a pleasant user experience for the whole family. Easily add recipes to your database by providing the URL, and Mealie will automatically import the relevant data, or add a family recipe with the UI editor.
https://docs.mealie.io
GNU Affero General Public License v3.0
7.33k stars, 732 forks

[SCRAPER] - Any way around 'Looks like we couldn't find anything' #3356

Closed g0d1k closed 7 months ago

g0d1k commented 7 months ago

First Check

Please provide 1-5 example URLs that are having errors

https://www.foodnetwork.com/recipes/warm-asparagus-salad-8377211?nl=ROTD_032224_featurecta&%24web_only=true&lid=39gai7jstowq&lvrmp=ce4929bc6bab26149478fad942af4773

Please provide your logs for the Mealie container: docker logs <container-id> > mealie.logs

Looks Like We Couldn't Find Anything

Only websites containing ld+json or microdata can be imported by Mealie. Most major recipe websites support this data structure. If your site cannot be imported but there is json data in the log, please submit a github issue with the URL and data.
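For anyone who wants to check a page by hand before filing an issue, here is a minimal sketch of the ld+json half of that requirement. It uses only the Python standard library and is not how Mealie or recipe-scrapers do it internally; the names LdJsonCollector and has_recipe_ld_json are made up for this example, and microdata is not covered.

```python
import json
from html.parser import HTMLParser


class LdJsonCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_ldjson = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_ldjson = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_ldjson:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_ldjson:
            self.blocks.append("".join(self._buffer))
            self._in_ldjson = False


def _is_recipe(node):
    # "@type" may be a single string or a list of strings.
    types = node.get("@type", [])
    if isinstance(types, str):
        types = [types]
    return "Recipe" in types


def has_recipe_ld_json(html):
    """Return True if the HTML embeds an ld+json object typed as a Recipe."""
    collector = LdJsonCollector()
    collector.feed(html)
    collector.close()
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing
        nodes = []
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict):
                nodes.append(item)
                graph = item.get("@graph", [])
                if isinstance(graph, list):
                    nodes.extend(g for g in graph if isinstance(g, dict))
        if any(_is_recipe(node) for node in nodes):
            return True
    return False
```

Save the page source from a browser and pass it to has_recipe_ld_json. If this returns False for the HTML the server actually sent (for example an error page), the importer has nothing to work with.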

I was just curious whether anyone has found a way around this error. I could always scrape recipes from Food Network, but after migrating to a new server I now get this error when using a Food Network URL. Thanks.

Deployment

Docker (Linux)

CodeMan99 commented 7 months ago

Replicated Issue

I seem to be having the same issue with foodnetwork.

Example URL: https://www.foodnetwork.com/recipes/ree-drummond/chicken-taco-salad-2247959

Logs

The Mealie container logs a 403 Forbidden response from foodnetwork.

mealie  | INFO: 24-Mar-24 12:35:46      HTTP Request: GET https://www.foodnetwork.com/recipes/ree-drummond/chicken-taco-salad-2247959 "HTTP/1.1 403 Forbidden"
mealie  | ERROR: 24-Mar-24 12:35:46     Recipe Scraper was unable to extract a recipe from https://www.foodnetwork.com/recipes/ree-drummond/chicken-taco-salad-2247959
mealie  | INFO: 24-Mar-24 12:35:46      HTTP Request: GET https://www.foodnetwork.com/recipes/ree-drummond/chicken-taco-salad-2247959 "HTTP/1.1 403 Forbidden"
mealie  | INFO:     172.19.0.4:47432 - "POST /api/recipes/create-url HTTP/1.0" 400 Bad Request
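
If the log file is long, a small filter can pull out the outgoing requests that the remote site rejected. This is only a sketch: the regular expression assumes the "HTTP Request: ..." line format shown in the excerpt above, and upstream_errors is a hypothetical helper name, not part of Mealie.

```python
import re

# Matches lines such as:
#   HTTP Request: GET https://example.com/x "HTTP/1.1 403 Forbidden"
UPSTREAM = re.compile(
    r'HTTP Request: (?P<method>[A-Z]+) (?P<url>\S+) "HTTP/[\d.]+ (?P<status>\d{3})'
)


def upstream_errors(log_text):
    """Return (status, url) pairs for outgoing requests answered with 4xx or 5xx."""
    hits = []
    for line in log_text.splitlines():
        match = UPSTREAM.search(line)
        if match and int(match.group("status")) >= 400:
            hits.append((int(match.group("status")), match.group("url")))
    return hits
```

Run against the four lines above, this would report the two 403 responses from foodnetwork.com and ignore the final line, which is the response of Mealie's own API to the browser.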

Version

CodeMan99 commented 7 months ago

A very quick search landed me on dpapathanasiou's foodnetwork recipe scraper. Hopefully that's helpful.

CodeMan99 commented 7 months ago

Actually, this may be relevant: https://github.com/hhursev/recipe-scrapers/pull/1026

Kuchenpirat commented 7 months ago

Hey, I just checked your URLs with the scraper directly:

https://www.foodnetwork.com/recipes/warm-asparagus-salad-8377211 no longer leads to a recipe, so neither Mealie nor recipe_scrapers can get any data from that page. As @CodeMan99 suggested, this is probably connected to them changing their TLD from .com to .co.uk.

https://www.foodnetwork.com/recipes/ree-drummond/chicken-taco-salad-2247959 does return data with the scraper, but the instructions are missing. I also see this replicated in Mealie, where the image is missing as well, because foodnetwork is blocking the request for the image.

The returned data for reference:

{
  'author': 'Food Network UK', 
  'canonical_url': 'https://foodnetwork.co.uk/recipes/chicken-taco-salad', 
  'category': None, 
  'host': 'foodnetwork.co.uk', 
  'image': 'https://d2vsf1hynzxim7.cloudfront.net/production/media/13137/conversions/foodnetwork-image-40d00ede-bc1d-4076-b41e-e8a29a6d6a6f-default.webp', 
  'ingredient_groups': [{'ingredients': ['For the Chicken', '2 boneless, skinless chicken breasts', '2 tablespoons taco seasoning (store-bought or your own mix)', '1/4 cup vegetable oil', '2 tablespoons butter', 'For the Dressing', '3/4 cup ranch dressing (bottled is fine)', "1/4 cup salsa (as spicy as you'd like)", '3 tablespoons finely minced fresh cilantro', 'For the Salad', '2 ears corn, shucked', '1 large head or 2 regular heads green leaf lettuce, shredded thin', '3 Roma tomatoes, diced', '1/2 cup grated pepper-jack cheese', '2 avocados, diced', '3 spring onions, sliced', '1/2 cup fresh coriander leaves', 'Tortilla chips of your choice (flavored or not), crushed slightly, for topping salad'], 'purpose': None}], 
  'ingredients': ['For the Chicken', '2 boneless, skinless chicken breasts', '2 tablespoons taco seasoning (store-bought or your own mix)', '1/4 cup vegetable oil', '2 tablespoons butter', 'For the Dressing', '3/4 cup ranch dressing (bottled is fine)', "1/4 cup salsa (as spicy as you'd like)", '3 tablespoons finely minced fresh cilantro', 'For the Salad', '2 ears corn, shucked', '1 large head or 2 regular heads green leaf lettuce, shredded thin', '3 Roma tomatoes, diced', '1/2 cup grated pepper-jack cheese', '2 avocados, diced', '3 spring onions, sliced', '1/2 cup fresh coriander leaves', 'Tortilla chips of your choice (flavored or not), crushed slightly, for topping salad'], 
  'instructions': '', 
  'instructions_list': [], 
  'language': 'en', 
  'nutrients': {}, 
  'site_name': None, 
  'title': 'Chicken Taco Salad', 
  'total_time': 26, 
  'yields': '4 servings'
}
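
To see at a glance which fields came back empty in a result like the one above, something along these lines could help. It is a sketch only: empty_fields is a made-up helper, and the dictionary below is an abbreviated copy of the returned data (the image URL and the long ingredient lists are shortened or left out).

```python
def empty_fields(data):
    """Return the sorted keys whose value is None or an empty str, list, or dict."""
    return sorted(
        key
        for key, value in data.items()
        if value is None
        or (isinstance(value, (str, list, dict)) and len(value) == 0)
    )


# Abbreviated copy of the scraper output quoted above.
scraped = {
    "author": "Food Network UK",
    "canonical_url": "https://foodnetwork.co.uk/recipes/chicken-taco-salad",
    "category": None,
    "host": "foodnetwork.co.uk",
    "ingredients": ["For the Chicken", "2 boneless, skinless chicken breasts"],
    "instructions": "",
    "instructions_list": [],
    "language": "en",
    "nutrients": {},
    "site_name": None,
    "title": "Chicken Taco Salad",
    "total_time": 26,
    "yields": "4 servings",
}

print(empty_fields(scraped))
# ['category', 'instructions', 'instructions_list', 'nutrients', 'site_name']
```

Both instructions and instructions_list show up in the list, which matches the missing instructions described above.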

I'll go ahead and close this one over here, as it seems the people over at recipe-scrapers are already aware of this. The fix will land in Mealie when they release a new version that includes the fixes for foodnetwork.co.uk.