Closed yzqzss closed 1 year ago
Known issue: Newer versions of MediaWiki seem to have changed how Special:Filelist handles the offset parameter. The trick (offset = "29990101000000") that we use to traverse Special:Filelist in reverse no longer works.
Fetching images via index.php is only a fallback for when the API is unavailable, and we request pages with limit = 5000, so this bug should only affect wikis whose API is unavailable and that host more than 5000 media files.
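To make the scope concrete, here is a minimal sketch of the request the fallback relies on and of the condition under which the bug would matter. The helper names (build_filelist_url, may_be_affected) and the example index.php URL are hypothetical and are not taken from wikiteam3; the title, limit, and offset query parameters are assumed to be what the fallback sends to index.php.

```python
from urllib.parse import urlencode

FILELIST_LIMIT = 5000                 # page size mentioned in this report
FAR_FUTURE_OFFSET = "29990101000000"  # the offset trick described in this report


def build_filelist_url(index_url, limit=FILELIST_LIMIT, offset=FAR_FUTURE_OFFSET):
    """Build a Special:Filelist request URL (hypothetical helper, not wikiteam3 code)."""
    query = urlencode({"title": "Special:Filelist", "limit": limit, "offset": offset})
    return f"{index_url}?{query}"


def may_be_affected(api_available, media_file_count, limit=FILELIST_LIMIT):
    """Return True when the index.php fallback is in use and needs more than one page."""
    return (not api_available) and media_file_count > limit


if __name__ == "__main__":
    # Same shape of request as the reproduction steps, with limit forced to 1.
    print(build_filelist_url("https://wiki.example.org/index.php", limit=1))
    print(may_be_affected(api_available=False, media_file_count=12000))
```

With limit set to 1, every file needs its own page, so the broken offset handling shows up on any wiki with more than one file.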
How to reproduce:
Check out this PR and set the limit parameter to 1 (here: https://github.com/elsiehupp/wikiteam3/blob/eb1529a4c18ec3d71485aea3351330f6a52cdae7/wikiteam3/dumpgenerator/dump/image/image.py#L210), then run:
dumpgenerator --index <index.php URL> --images
NOTE:
There is no problem with this PR itself, and it can be merged normally.
https://asoiaf.fandom.com/ gets "We couldn't find an English wiki at this URL, but here are related wikis in other languages" and so on.
The key commit is: https://github.com/mediawiki-client-tools/mediawiki-scraper/pull/156/commits/dce334b3ee68df1d9cd5a1155814f17f4221255e