Open reaper47 opened 1 month ago
@reaper47 What help is needed on this issue? It mentions creating a new migration, but I assume that is connected to scraping websites with the Fetch from website function? Is there any documentation on how to help add new websites? I seem to remember seeing something about this, but I can't find anything now when searching GitHub and looking through the documentation.
The migration part of the issue enumerates websites recently supported that shall not be forgotten in the migration file: https://github.com/reaper47/recipya/blob/main/internal%2Fservices%2Fmigrations%2Ffor-release-v1.2.0
This file is where all newly-supported websites for v1.2.0 are dumped. The .sql extension will be added to it before release so that goose picks it up.
The guide on how to support websites is here: https://recipes.musicavis.ca/guide/docs/development/workflow/import-website/. I need to update it a bit. What is left to do for this issue is to support the unchecked items above.
@reaper47 Oh, okay, so all websites mentioned in this ticket are already implemented as a supported website from a scraping perspective?
Note that the link https://recipes.musicavis.ca/guide/docs/development/workflow/ and its child pages are broken.
Thank you for pointing out that the demo was down. It was caused by a panic in internal\services\file_apps:L337.
All unchecked websites mentioned in this issue are to be verified/supported. Often, a website already has the LD+JSON script tag with all of the recipe's information, so there's not much to do except add a unit test and add the website to the SQL. When this tag is not present, the "domain not implemented" error is returned and the elements need to be scraped from the website manually. There are plenty of examples in the scraper package.
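To make the LD+JSON check described above concrete, here is a minimal sketch in Go. It is not the code from the scraper package: the names `parseLDJSON`, `recipe`, `ldJSONRe`, and `errNotImplemented` are hypothetical, it uses a regular expression instead of an HTML parser to stay dependency-free, and it only handles a tag that holds a single object with a string `@type` (no `@graph` or array forms).

```go
package main

import (
	"encoding/json"
	"errors"
	"regexp"
)

// errNotImplemented mirrors the "domain not implemented" error mentioned
// above; the variable name is an assumption made for this sketch.
var errNotImplemented = errors.New("domain not implemented")

// ldJSONRe captures the body of a <script type="application/ld+json"> tag.
// A real scraper would more likely walk the parsed HTML tree.
var ldJSONRe = regexp.MustCompile(`(?is)<script[^>]*type=["']application/ld\+json["'][^>]*>(.*?)</script>`)

// recipe holds a small, illustrative subset of schema.org/Recipe fields.
type recipe struct {
	Type        string   `json:"@type"`
	Name        string   `json:"name"`
	Ingredients []string `json:"recipeIngredient"`
}

// parseLDJSON returns the first schema.org Recipe found in the page,
// or errNotImplemented when no usable LD+JSON tag is present.
func parseLDJSON(page string) (recipe, error) {
	for _, m := range ldJSONRe.FindAllStringSubmatch(page, -1) {
		var r recipe
		if err := json.Unmarshal([]byte(m[1]), &r); err != nil {
			continue // Not a shape this sketch understands; try the next tag.
		}
		if r.Type == "Recipe" {
			return r, nil
		}
	}
	return recipe{}, errNotImplemented
}
```

Calling `parseLDJSON` with a page's HTML returns the recipe when the tag is present. When it returns `errNotImplemented`, that is the signal that the website needs a hand-written scraper.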
@reaper47 The documentation site seems to be down altogether right now.
Thank you for letting me know. Someone fetched a website that caused a panic:
Jun 26 17:19:17 musicavis recipya[1848782]: recipya/internal/scraper/websites.go:195 +0x4105
I will try to find the anomalous website from the logs.
Edit: It's one of these. I will fix it tomorrow.
https://www.maangchi.com/recipe/dububuchim-yangnyeomjang
https://www.maangchi.com/recipe/yangnyeom-gejang
https://www.maangchi.com/recipe/kkeopjilkong-maneul-bokkeum
https://www.maangchi.com/recipe/asparagus-muchim
https://www.maangchi.com/recipe/haemul-pajeon
https://www.maangchi.com/recipe/gamja-haem-bokkeum
https://www.maangchi.com/recipe/corn-cheese
https://www.maangchi.com/recipe/rose-tteokbokki
https://www.maangchi.com/recipe/oijangajji
https://www.maangchi.com/recipe/wanja-jeon
To be done right before release. Also check TODO comments.
Create a new migration with the following newly-supported websites:
Support these websites: