Open jrbray1 opened 4 years ago
jrbray1, 09/08/20 14:00:
It might seem easier, but there is no limit on possible namespace names, plus a dozen core ones, each potentially translated into 400 languages. A namespace name only means something after we've contacted the API or (even worse) screen-scraped the index.php output. The results would be unavoidably unpredictable, which is going to confuse people even more unless they're already well-versed in MediaWiki internals.
In other words, it's not clear to me what kind of user would be served by such a feature. We'll consider it if someone sends a patch, though!
Not sure why the variety of namespace names is a problem, as https://www.mediawiki.org/wiki/Help:Namespaces talks about canonical namespaces in English and their foreign mappings. You could just support those canonical names and allow requests for 'Template' and 'Category'. This seems more robust than the user providing 10,14 and expecting that mapping to be in place. With API parsing it would be just as easy to let a Frenchman request --namespaces 0,Utilisateur without having to burrow into the API output to check what number that was.
MediaWiki documentation is all about names, not numbers.
I was hoping to back up a wiki with only the content, Category and Template namespaces, to reduce size, and to select the namespaces by keyword, something like:
dumpgenerator.py --api=https://hornblower.fandom.com/api.php --xml --curonly --namespaces 0,Template,Category
But it expects numbers, not names. I could hack something together by parsing the results of https://hornblower.fandom.com/api.php?action=query&meta=siteinfo&siprop=namespaces&formatversion=2, but it would seem easier if dumpgenerator did this for you. Have you considered doing this?
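For what it's worth, here is a rough sketch of the kind of hack I mean. It is not dumpgenerator code: `fetch_namespaces` and `resolve_namespaces` are hypothetical helper names, and I'm assuming the siteinfo response lists each namespace with an `id`, a local `name` (or `*` in the older output format) and, for built-in namespaces, a `canonical` name. I've also added `format=json` to the query so the result can be parsed directly.

```python
import json
import urllib.parse
import urllib.request


def fetch_namespaces(api_url):
    """Return the 'namespaces' mapping from a MediaWiki siteinfo query."""
    params = urllib.parse.urlencode({
        "action": "query",
        "meta": "siteinfo",
        "siprop": "namespaces",
        "format": "json",
        "formatversion": "2",
    })
    with urllib.request.urlopen(f"{api_url}?{params}") as response:
        return json.load(response)["query"]["namespaces"]


def resolve_namespaces(spec, namespaces):
    """Turn a spec such as '0,Template,Category' into a list of namespace IDs.

    Numeric entries pass through unchanged; names are matched
    case-insensitively against each namespace's local and canonical name.
    """
    by_name = {}
    for info in namespaces.values():
        # Assumption: formatversion=2 uses "name"; the older format uses "*".
        for key in ("name", "*", "canonical"):
            label = info.get(key)
            if label:  # the main namespace has an empty name, so skip it
                by_name[label.lower()] = info["id"]

    ids = []
    for item in spec.split(","):
        item = item.strip()
        if item.lstrip("-").isdigit():
            ids.append(int(item))
        elif item.lower() in by_name:
            ids.append(by_name[item.lower()])
        else:
            raise ValueError(f"unknown namespace: {item!r}")
    return ids
```

The idea would be to call `resolve_namespaces("0,Template,Category", fetch_namespaces(api_url))` and pass the resulting numbers, joined with commas, to `--namespaces`. Unknown names raise an error rather than being silently dropped, so a typo can't quietly produce a smaller dump.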