kelson42 closed this issue 2 months ago
This issue in fact seems to be impacting all Stack Exchange sites. At least all new tasks seem to be failing. I'm investigating.
It looks like the issue is linked to the fact that the XML dumps are stored in UTF-16-LE, while most of the code seems to expect UTF-8 files.
@rgaudin does this ring a bell?
No; what's happening exactly? Does nothing get parsed at all?
Yup, not parsed at all. Re-encoding gets things a little bit further, but there are still many issues to fix. Obviously the SO dumper has been updated, and there are "maybe" too many magic values in sotoki ^^
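For reference, the re-encoding step mentioned above could look roughly like the sketch below. This is not sotoki code: the function names `detect_encoding` and `transcode_to_utf8` are hypothetical, and it assumes the dump either starts with a byte-order mark or can be treated as UTF-16-LE when no mark is present. It only rewrites the bytes as UTF-8; if a dump carries an XML declaration naming UTF-16, that declaration would probably need adjusting separately.

```python
import codecs


def detect_encoding(path, default="utf-8"):
    """Guess a file's encoding from its byte-order mark, if it has one."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    if head.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" strips the BOM while decoding.
        return "utf-8-sig"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The generic "utf-16" codec reads the BOM and picks the endianness.
        return "utf-16"
    # No BOM found: fall back to whatever the caller assumes.
    return default


def transcode_to_utf8(src, dst, fallback="utf-16-le", chunk_size=1 << 20):
    """Stream src into dst as UTF-8 (no BOM) without loading it all in memory.

    Assumption: files without a BOM are UTF-16-LE, as observed in this issue.
    """
    encoding = detect_encoding(src, default=fallback)
    # newline="" keeps the original line endings untouched on both sides.
    with open(src, "r", encoding=encoding, newline="") as fin, \
            open(dst, "w", encoding="utf-8", newline="") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
```

Calling something like `transcode_to_utf8("Posts.xml", "Posts.utf8.xml")` (file names are only examples) before parsing would let UTF-8-only code read the file, though as noted this alone does not fix the remaining issues.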
Unable to scrape 3dprinting https://farm.openzim.org/recipes/3dprinting.stackexchange.com_en