hhursev / recipe-scrapers

Python package for scraping recipes data
MIT License
1.75k stars 534 forks source link

www.finedininglovers.com returns empty strings #366

Closed smilerz closed 2 years ago

smilerz commented 3 years ago

Pre-filing checks

The URL of the recipe(s) that are not being scraped correctly

The version of Python you're using

3.9.1

The operating system of your environment

Windows 10

The results you expect to see

non-empty results

The results (including any Python error messages) that you are seeing scrape.schema.data returns {}

Can you write Python and would you like to help fix the scraper yourself? We'd be glad for your assistance! We can provide you with guidance and code review in return. If so, tick any of the relevant boxes below:

bfcarpio commented 3 years ago

Thank you @smilerz for notifying us. I did some quick testing and came up with some additional notes:

It seems like this website might have updated.

vabene1111 commented 2 years ago

ok it appears that the above link is an "article" not a recipe (see url schema).

articles dont have many fields recipes have, i will add some try blocks so that articles provide as much data as possible.

PR will follow tomorrow.

jayaddison commented 2 years ago

Nice, thank you for looking into this @vabene1111. Pages that have type Article seem to be a fairly frequent issue.

I have one or two suggestions for the code based on f883dcc25a1d189ee2ed1deec1ab4e8c8cc4e6fa - let me know if you'd appreciate any early feedback, or I can wait and provide commentary when the pull request is ready.

vabene1111 commented 2 years ago

hey @jayaddison, thanks for the feedback. I would love to get feedback on the code, I made many assumptions because I wasn't sure on how you like to handle things.

For example I dont really know if you prefer to have functions raise exceptions or if issues should be handled gracefully for example by returning 0 for a yield or servings. Let me know whatever needs to be changed or how you prefer things.

jayaddison commented 2 years ago

Cool - yep, we should document these things as developer guidance.

The items I'm thinking about from a first-pass are:

(also a tiny nitpick: typo here)

vabene1111 commented 2 years ago

awesome, thanks for the feedback. All of that makes perfect sense and i will change everything accordingly!

jayaddison commented 2 years ago

Hey @smilerz - @vabene1111 took a look at this, and unfortunately my response after learning more about the issue is that I'm not convinced that this category of articles on finedininglovers is practical to support. Maybe that's misguided, I'm not sure - feel free to disagree; #612 includes some discussion and the conclusion so far.

At the moment I feel like there's a bunch of work to do to simplify and streamline the scraper development process (and maintenance and review process). It's not bad at the moment but it'd be nice to support a more rapid stream of changes and contributions with higher quality and testing -- and my sense is that doing that is going to require restructuring the internals of the library a bit.

Which means I'm trying harder (but not always managing) to catch changes that might make it harder to do that restructuring.

Closing for now, but further feedback and help appreciated!