hhursev / recipe-scrapers

Python package for scraping recipes data
MIT License
1.7k stars, 521 forks

Thank you and a request for new site help #94

Closed: gregwa1953 closed this issue 4 years ago

gregwa1953 commented 4 years ago

First, thank you for such a wonderful product.

Secondly, following your examples, I was able to re-create a scraper for AllRecipes.com. Then I attempted to "roll my own" for TheSpruceEats.com. The exact page is https://www.thespruceeats.com/doner-kebab-recipe-4171703 .

Unfortunately, this site, and many others that I would like to create scrapers for, seem to use some strange (at least to me) encoding that embeds a "3D" in front of just about everything. For example, one would think that the snippet below would be easy to scrape for Total Time:

<li class=3D"loc total-time project-meta__total-time"> <span id=3D"meta-text_1-0" class=3D"comp meta-text"> <span class=3D"meta-textlabel">Total: <span class=3D"meta-textdata">50 mins

However, BeautifulSoup refuses to find the span class tags when the "3D" is in the way.

It seems interesting to me that when I use the Developer Tools in Google Chrome, the "3D" is nowhere to be found.

All other tags are the same way, with the exception of Title. So, I turn to the experts for a working scraper for this site, which I can use as a tutorial both to add to my knowledge and to be able to share more scrapers in the future.

Thank you, in advance, for your assistance.

Greg

gregwa1953 commented 4 years ago

Update: It seems that when Google Chrome is used to save the page, IT inserts the offending "3D" entries. If I use Firefox, it doesn't insert the "3D".
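A note on the "3D" entries: `=3D` is the quoted-printable escape for the `=` character, which is consistent with Chrome saving the page as a single-file (MHTML) archive. The sketch below is only an illustration under that assumption. It decodes a shortened version of the snippet above with Python's standard `quopri` module and then looks up a class with the standard `html.parser`. The closing tags in the sample and the helper names `ClassTextFinder` and `find_text_by_class` are made up for this example; they are not part of recipe-scrapers or BeautifulSoup.

```python
import quopri
from html.parser import HTMLParser

# Shortened sample in the style of the saved page. The closing tags are
# added here for illustration; class names are copied from the post above.
raw = (
    b'<li class=3D"loc total-time project-meta__total-time">'
    b'<span class=3D"meta-textlabel">Total: </span>'
    b'<span class=3D"meta-textdata">50 mins</span></li>'
)


class ClassTextFinder(HTMLParser):
    """Hypothetical helper: collects text inside tags carrying a given class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self._capturing = False
        self.found = []

    def handle_starttag(self, tag, attrs):
        # Capture text only while inside a tag whose class list matches.
        classes = (dict(attrs).get("class") or "").split()
        self._capturing = self.wanted_class in classes

    def handle_endtag(self, tag):
        self._capturing = False

    def handle_data(self, data):
        if self._capturing and data.strip():
            self.found.append(data.strip())


def find_text_by_class(html, wanted_class):
    """Return the stripped text of every tag whose class list has wanted_class."""
    finder = ClassTextFinder(wanted_class)
    finder.feed(html)
    return finder.found


# Undo the quoted-printable escaping, so that =3D becomes a plain "=".
decoded = quopri.decodestring(raw).decode("utf-8")

print(find_text_by_class(raw.decode("utf-8"), "meta-textdata"))  # encoded input
print(find_text_by_class(decoded, "meta-textdata"))              # decoded input
```

Once the text is decoded, the class attributes look the way Developer Tools shows them, so the same lookup that fails on the saved file should succeed. Fetching the live URL, or saving the page from a browser that writes plain HTML, avoids the decoding step entirely.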

hhursev commented 4 years ago

Hey, I just went to check what is happening and didn't see the 3D issue you were talking about (I don't use Google Chrome, so this might be it).

I went ahead and did the scraper (I believe), so you can double check whether it works here: https://github.com/hhursev/recipe-scrapers/pull/95/files . Whenever you give the PR a 👍 I'll merge it to master and bump the version 😉 .

gregwa1953 commented 4 years ago

Thank you very much! I'll check it out and let you know. This is a GREAT learning process for me.

Greg

gregwa1953 commented 4 years ago

YES! It works wonderfully! Go ahead and merge and bump version.

Thank you again, so much!

hhursev commented 4 years ago

You are welcome 😄