hhursev / recipe-scrapers

Python package for scraping recipes data
MIT License
1.66k stars 514 forks source link

mykitchen101 / mykitchen101en: total_time field is mandatory but is not implemented #953

Closed jayaddison closed 1 month ago

jayaddison commented 8 months ago

Pre-filing checks

The URL of the recipe(s) that are not being scraped correctly

The results you expect to see

I think that our unit test for this site should fail, because the mandatory total_time field isn't implemented for the scraper yet (the information is available on the page, although not in schema.org format).

It looks like this is true for both the EN-language and JA-language versions of the scraper.

The results (including any Python error messages) that you are seeing

Tests unexpectedly pass.

Edit: fixup: language-code / country-code mismatch

strangetom commented 8 months ago

I think I disagree that the test should fail in this case.

The way the test code works is that if the output from one of the mandatory functions is missing from the test json, it checks that an assertion is raised: https://github.com/hhursev/recipe-scrapers/blob/afc2cb067bf4627b3b98616d806b62557cc867a9/tests/__init__.py#L83-L96

In these cases, the test is testing that the scraper behaves correctly: if the function isn't implemented then an assertion is raised. (There is a separate question about whether the Exception in the above code needs to be more specific).

But I think your point still stands. There should be some sort of check to make sure that a scraper implements the mandatory functions. This feels to me like it belongs are part of the pull request review - perhaps it would be possible to include it as part of the automatic checks that run on a pull request?

jayaddison commented 8 months ago

Got it, OK. So the levels that could exist, and the levels that currently exist are:

Currently the mandatory test fields are the middle case -- it's mandatory for them to be callable.

(does that match your understanding?)

strangetom commented 8 months ago

Yep, that's right. With the added complication that the schemeorg_fill plugin will mean that the exceptions observed will actually be SchemaOrgException, rather than the NotImplementedError raised by the scraper class.

jayaddison commented 1 month ago

This bugreport was related to my misunderstanding of the way that mandatory fields are tested, and I think that the current logic is correct, so I'm going to close this wontfix/not-planned.