Open RavanJAltaie opened 9 months ago
The error you mention seems to be a transient issue with our S3 storage (used to cache some files). If you look at previous tasks, they mention various articles that fail to download:
I did not have the courage to dig further into the history.
Looks like many articles are not returning properly from the API.
@Popolechien @kelson42 : should we spend time building a list of failing articles to exclude (like I did for pokemon_fandom_en_all), hope that some change in the scraper/server fixes things, or just accept that it will never work for this website?
> it mention various articles which fails to download
Question: you only mention one article per task - does this mean that each task failed because of a single entry and that this entry was seemingly random? If yes, then how would building a list from past errors guarantee that future tasks based on this not-yet-understood behaviour will not fail as well?
The scraper stops on the first failing article, and the order in which articles are fetched is random (two consecutive tries do not give the same order). Trying and retrying until it works would provide the full list of failing articles.
(Try, add the failing title to the ignore list, retry, to be exact.)
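For illustration, the try / ignore / retry loop could be sketched as below. This is only a sketch under assumptions: `run_scrape`, `ArticleFailed` and `collect_failing_titles` are hypothetical names, and `run_scrape` is a stand-in that simulates a run (random order, stop on first failure) rather than calling the real scraper. It also assumes a given article fails every time it is fetched.

```python
import random


class ArticleFailed(Exception):
    """Raised by the stand-in scraper when an article cannot be fetched."""

    def __init__(self, title):
        super().__init__(title)
        self.title = title


def run_scrape(titles, broken, ignore, rng):
    """Hypothetical stand-in for one scraper run (not the real scraper).

    Visits the non-ignored articles in random order and stops on the
    first failing one.
    """
    order = [t for t in titles if t not in ignore]
    rng.shuffle(order)
    for title in order:
        if title in broken:
            raise ArticleFailed(title)


def collect_failing_titles(titles, broken, rng=None):
    """Retry until a run succeeds, growing the ignore list on each failure."""
    rng = rng or random.Random()
    ignore = set()
    while True:
        try:
            run_scrape(titles, broken, ignore, rng)
        except ArticleFailed as exc:
            # Each failed run reveals exactly one new failing title.
            ignore.add(exc.title)
        else:
            # A full run succeeded, so the ignore list is complete.
            return ignore
```

Under these assumptions the loop needs one run per failing article plus one final successful run, so the cost grows with the number of broken articles.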
Ok, so we know that no matter the order, a given article would always fail. Then yeah, worth a try, but if this is entirely manual work, won't it be extremely tedious? I'm tempted to suggest we discuss this at the next team meeting, just in case I'm missing some info/context/background.
As per this issue, please check why this recipe is failing even though it has already produced a successful ZIM file in the past.
The link to the recipe on the Zimfarm: https://farm.openzim.org/recipes/all_the_tropes
The link to last working ZIM (if any): https://download.kiwix.org/zim/other/allthetropes_en_all_maxi_2020-10.zim
The error I get: Unable to connect to S3, either S3 login credentials are wrong or bucket cannot be found