hhursev / recipe-scrapers

Python package for scraping recipes data
MIT License
1.6k stars 505 forks

Gousto.co.uk not parsing some ingredients correctly #1156

Open ThomasHFWright opened 6 days ago

ThomasHFWright commented 6 days ago

Pre-filing checks

The URL of the recipe(s) that are not being scraped correctly

...

For ingredients such as '1 320g British skinless chicken thighs', I'd expect this to come through as just '320g British skinless chicken thighs'. With '1 1 tbsp cornflour', I'd expect this to come back as '1 tbsp cornflour'.

A bit more complex: for '2 80g mangetout', I'd expect this to come back as '160g mangetout' if possible, i.e. the leading count multiplied into the quantity.
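To make the expected behaviour concrete, here is a minimal sketch of the normalisation described above. It is not how recipe-scrapers works internally: `normalise_ingredient` and its regular expression are hypothetical names made up for illustration. It assumes the leading number is always a pack count that multiplies the quantity after it.

```python
import re

# Hypothetical pattern (an assumption, not recipe-scrapers code):
# a leading integer count, whitespace, then a quantity with an optional
# unit glued to it ("320g", "15ml") or a bare number ("1").
_COUNT_PREFIX = re.compile(r"^(\d+)\s+(\d+(?:\.\d+)?)([a-zA-Z]*)\s+(.*)$")


def normalise_ingredient(text: str) -> str:
    """Fold a leading count into the quantity that follows it."""
    match = _COUNT_PREFIX.match(text)
    if match is None:
        # No "count quantity" prefix, e.g. "15ml rice vinegar sachet"
        # or "1 Brown rice": return the string unchanged.
        return text

    count, amount, unit, rest = match.groups()
    total = int(count) * float(amount)
    # Print whole numbers without a trailing ".0".
    total_text = str(int(total)) if total.is_integer() else str(total)
    return f"{total_text}{unit} {rest}"


print(normalise_ingredient("1 320g British skinless chicken thighs"))
print(normalise_ingredient("1 1 tbsp cornflour"))
print(normalise_ingredient("2 80g mangetout"))
```

Under the same assumption, '2 1 spring onion' would become '2 spring onion'. That case is not spelled out in this report, so treat it as a guess at the desired output.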

...

With 14.56.0, scraper.ingredients() returns: ['15ml rice vinegar sachet', '1 1 orange', '1 1 tbsp cornflour', '15g fresh root ginger', '20ml toasted sesame oil', '1 320g British skinless chicken thighs', '2 80g mangetout', '2 1 spring onion', '1 Brown rice', '2 1 garlic clove', '24ml soy sauce']

...

This is affecting recipe imports into Tandoor: https://github.com/TandoorRecipes/recipes/issues/3194