relaton / relaton-iso

RelatonIso: ISO Standards metadata using the BibliographicItem model
BSD 2-Clause "Simplified" License

(URGENT) Guard against scrape failure + other flavors #167

Open ronaldtse opened 3 months ago

ronaldtse commented 3 months ago

As reported in #166, all titles are now missing from the relaton-data-iso dataset.

According to @andrew2net, this started happening on May 24, and we did not know until May 29 (today).

The page template has been changed.

The scraper must fail when data is missing, and must not commit anything to our dataset when the data is clearly broken or the template has changed.
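A minimal sketch of such a guard, assuming each scraped record is a Hash with `:docid` and `:title` keys. The names `ScrapeValidationError`, `REQUIRED_FIELDS`, `validate_record!`, and `validate_batch!` are hypothetical and not part of relaton-iso:

```ruby
# Hypothetical guard; none of these names come from relaton-iso.
class ScrapeValidationError < StandardError; end

# Assumed set of fields that must never be empty in a scraped record.
REQUIRED_FIELDS = %i[docid title].freeze

# True for nil, empty collections, and whitespace-only strings.
def blank?(value)
  value.nil? ||
    (value.respond_to?(:empty?) && value.empty?) ||
    value.to_s.strip.empty?
end

# Returns the record unchanged, or raises if a required field is blank.
def validate_record!(record)
  missing = REQUIRED_FIELDS.select { |field| blank?(record[field]) }
  return record if missing.empty?

  raise ScrapeValidationError,
        "#{record[:docid].inspect}: missing #{missing.join(', ')}"
end

# Validates the whole batch before anything is written, so a template
# change aborts the run instead of producing a broken commit.
def validate_batch!(records)
  raise ScrapeValidationError, "no records scraped" if records.empty?

  records.each { |record| validate_record!(record) }
end
```

The idea is to call `validate_batch!` before the commit step, so that an uncaught `ScrapeValidationError` stops the job with a non-zero exit status and nothing is pushed to the dataset.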

andrew2net commented 1 month ago

@ronaldtse with relaton-logger we can implement additional log channels. So I'm going to design a log channel that will create an issue in the relaton-data-iso repo if any error happens while the documents are being fetched. The issue will contain a message listing all unique errors. All subscribers will receive a notification for the new issue. Is that OK?
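A rough sketch of the idea, not the relaton-logger API: the class `IssueLogChannel`, its methods, and the `notifier` callable are all hypothetical. The callable stands in for whatever code would actually open the issue through the GitHub API, and the issue title format is invented for illustration.

```ruby
require "set"

# Hypothetical log channel; this does not use the relaton-logger interface.
class IssueLogChannel
  # repo:     target repository name (assumed "owner/name" form)
  # notifier: callable taking (repo, title, body) that would open the issue
  def initialize(repo:, notifier:)
    @repo = repo
    @notifier = notifier
    @errors = Set.new
  end

  # Records an error message; the Set drops repeated messages.
  def error(message)
    @errors << message.to_s.strip
  end

  # Opens a single issue listing every unique error collected so far.
  # Returns false without calling the notifier when no errors were logged.
  def flush
    return false if @errors.empty?

    body = @errors.map { |e| "- #{e}" }.join("\n")
    @notifier.call(@repo, "Fetch errors (#{@errors.size})", body)
    @errors.clear
    true
  end
end
```

Calling `flush` once at the end of a fetch run would produce at most one issue per run, however many times the same error was logged.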

ronaldtse commented 2 weeks ago

Of course, that's a good idea. Thanks!