kiwix / kiwix-tools

Command line Kiwix tools: kiwix-serve, kiwix-manage, ...
https://download.kiwix.org/release/kiwix-tools/
GNU General Public License v3.0

kiwix-serve: retrieving articles via page id #399

Closed: AiliAili closed this issue 4 years ago

AiliAili commented 4 years ago

Hi there, thank you for developing such a helpful tool. Currently, I am able to retrieve articles by page title, for example http://localhost:8888/wikipedia_en_all_maxi_2020-06/A/Elephant. I am wondering whether it is also possible to retrieve articles by page_id.

Thank you.

kelson42 commented 4 years ago

"Elephant" is the pageid here. What use case is causing you trouble?

AiliAili commented 4 years ago

Thank you for your timely reply.

I want to retrieve all articles by page id. Page titles (such as Elephant) can change over time, but according to the Wikipedia schema the page id does not. For example, a title can be "How_do_you_like_Wednesday?" at one point in time and be changed to "How_Do_You_Like_Wednesday?" a few months later.

If I only know the previous title, retrieving "How_do_you_like_Wednesday?" from the server will fail. However, both "How_do_you_like_Wednesday?" and "How_Do_You_Like_Wednesday?" share the same page id. If we could retrieve articles by page id, such failures would be avoided.

Thank you.

kelson42 commented 4 years ago

Your question is really specific to MediaWiki-based ZIM files. As far as I understand, you misunderstand how MediaWiki works. The only thing I can tell you is that MWoffliner keeps the articleid in the ZIM URL, and MediaWiki redirects become ZIM redirects. If you have a concrete example where this is not done, please open a ticket in openzim/mwoffliner.
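
For anyone trying this out, here is a minimal sketch of building the article URL used earlier in this thread (`<server>/<book>/A/<title>`). The helper names `article_url` and `fetch_article` are hypothetical and not part of kiwix-tools. Percent-encoding matters for titles like "How_do_you_like_Wednesday?", because an unescaped "?" would be read as the start of a query string. Whether a renamed title is actually served as a redirect depends on the ZIM file, so treat the redirect handling as an assumption to verify against your own server.

```python
from urllib.parse import quote
from urllib.request import urlopen


def article_url(base: str, book: str, title: str) -> str:
    """Build an article URL following the <base>/<book>/A/<title> pattern.

    Hypothetical helper: the pattern is taken from the example URL in this
    thread, not from kiwix-serve documentation.
    """
    # quote() percent-encodes characters such as "?" and leaves "/" and "_" alone.
    return f"{base.rstrip('/')}/{book}/A/{quote(title)}"


def fetch_article(url: str) -> tuple[str, bytes]:
    """Fetch a page and return (final URL, body).

    urlopen follows HTTP redirects by default, so the final URL may differ
    from the requested one. Requires a running kiwix-serve instance.
    """
    with urlopen(url) as response:
        return response.geturl(), response.read()


if __name__ == "__main__":
    url = article_url(
        "http://localhost:8888",
        "wikipedia_en_all_maxi_2020-06",
        "How_do_you_like_Wednesday?",
    )
    print(url)
```

Comparing the URL returned by `fetch_article` with the one you requested is a quick way to see whether an old title was redirected to its current one.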