fictive-kin / openrecipes

An open database of recipe bookmarks
http://blog.fictivekin.com/post/46860403233/the-jokes-on-us

tastykitchen data quality #214

Open famagusta opened 8 years ago

famagusta commented 8 years ago

I have noticed that the ingredients attribute was crawled incorrectly for most of the tastykitchen data: each line repeats the quantity and unit twice and omits the ingredient name.

e.g. (the first entry) {"description": "Empanadas made with homemade vegan dough, filled with melty mozzarella, fresh tomatoes and basil.", "ingredients": "1-\u00be ounce, weight 1-\u00be ounce, weight\n2 cups 2 cups\n\u00bc teaspoons \u00bc teaspoons\n\u00bd cups \u00bd cups\n10 whole 10 whole\n5 ounces, weight 5 ounces, weight\n\u00bd cups \u00bd cups\n1 whole 1 whole", "url": "http://tastykitchen.com/recipes/appetizers-and-snacks/caprese-empanadas/", "image": "http://static.tastykitchen.com/recipes/files/2013/03/Caprese-Empanadas-by-clem-on-March-15-2013-410x273.jpg", "ts": {"$date": 1365299590596}, "datePublished": "2013-03-15", "source": "tastykitchen", "recipeYield": "10", "_id": {"$oid": "5160d18696cc620d2615320c"}, "cookTime": "PT20M", "prepTime": "PT15M", "name": "Caprese Empanadas"}
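To get a rough idea of how many records are affected, something like the following might help. It is only a sketch of a heuristic, not part of the openrecipes code: it flags a record when every ingredient line is a single phrase repeated twice (as in "2 cups 2 cups"). The function names `is_duplicated_line` and `looks_broken` are made up for illustration, and the heuristic assumes the broken records all follow the pattern shown in the entry above.

```python
def is_duplicated_line(line: str) -> bool:
    """Return True if the line is one phrase repeated twice, e.g. '2 cups 2 cups'."""
    tokens = line.split()
    half, remainder = divmod(len(tokens), 2)
    # An odd token count can never be an exact repetition.
    return remainder == 0 and half > 0 and tokens[:half] == tokens[half:]


def looks_broken(record: dict) -> bool:
    """Heuristic: flag a record whose non-empty ingredient lines are all duplicated phrases."""
    lines = [ln for ln in record.get("ingredients", "").split("\n") if ln.strip()]
    return bool(lines) and all(is_duplicated_line(ln) for ln in lines)


# The ingredients string from the entry quoted above.
sample = {
    "source": "tastykitchen",
    "name": "Caprese Empanadas",
    "ingredients": (
        "1-\u00be ounce, weight 1-\u00be ounce, weight\n2 cups 2 cups\n"
        "\u00bc teaspoons \u00bc teaspoons\n\u00bd cups \u00bd cups\n"
        "10 whole 10 whole\n5 ounces, weight 5 ounces, weight\n"
        "\u00bd cups \u00bd cups\n1 whole 1 whole"
    ),
}

if __name__ == "__main__":
    print(looks_broken(sample))
```

Running `looks_broken` over each record after `json.loads` would give a count per source, which could show whether the problem is limited to tastykitchen.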

funkatron commented 8 years ago

Yes, that seems to be the case. We have had very little time to work on this, but if you have a chance to look into the crawler and identify the issue, that would help a lot.


famagusta commented 8 years ago

Sure - will take a peek under the hood. Overall the project is a great resource.

famagusta commented 8 years ago

There is already a pull request with the solution.