wpoa / recitation-bot

MediaWiki bot to upload content to Wikimedia projects and update corresponding citations on Wikipedia.
GNU General Public License v3.0

Strive for complete paper titles at Wikisource #15

Open Daniel-Mietchen opened 10 years ago

Daniel-Mietchen commented 10 years ago

MediaWiki limits page titles to 255 bytes, according to https://www.mediawiki.org/wiki/Manual:Page_table#page_title . At the OAMI, we opted to take the first 100 characters of a paper title and then append portions of the DOI. This has worked fine so far.

For Wikisource, though, this is not the best approach, and I think we should try to fit as much of the article title as possible into the page name. Example: https://en.wikisource.org/wiki/Wikisource:WikiProject_Open_Access/Programmatic_import_from_PubMed_Central/A_cladistically_based_reinterpretation_of_the_taxonomy_of_two_Afrotropical_tenebrionid_genera_Ectateus_Koch_1956_and_Selinus_Mulsant_%26_Rey_1853_%28 .

wrought commented 10 years ago

I believe the first 255 bytes are currently kept here and the rest is truncated:

    if len(self.wikisource_title) > 255:
        self.wikisource_title = self.wikisource_title[:255]

Does that work?
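One caveat worth checking: the limit cited above is in bytes, while `len()` and slicing on a Python 3 `str` count characters, so a title with non-ASCII characters could pass the check and still exceed 255 bytes. A minimal sketch of a byte-aware truncation follows, assuming Python 3 and UTF-8 titles. The names `MAX_TITLE_BYTES` and `truncate_to_bytes` are hypothetical and not part of the bot.

```python
MAX_TITLE_BYTES = 255  # page_title limit from the MediaWiki manual linked above


def truncate_to_bytes(title: str, limit: int = MAX_TITLE_BYTES) -> str:
    """Return title cut so that its UTF-8 encoding fits in `limit` bytes."""
    encoded = title.encode("utf-8")
    if len(encoded) <= limit:
        return title
    # Slicing bytes may split a multi-byte character; errors="ignore"
    # drops the incomplete trailing sequence instead of raising.
    return encoded[:limit].decode("utf-8", errors="ignore")


if __name__ == "__main__":
    ascii_title = "a" * 300
    accented_title = "é" * 200  # each "é" is 2 bytes in UTF-8
    print(len(truncate_to_bytes(ascii_title).encode("utf-8")))
    print(len(truncate_to_bytes(accented_title).encode("utf-8")))
```

Whether the limit has to cover the whole subpage path (as in the example URL above) or only the paper title is something this sketch does not decide. The caller would pass whatever string ends up as the page title.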

Daniel-Mietchen commented 9 years ago

I put the 255 in there, and I think it works as expected. I am not sure, though, how to handle cases where the title length exceeds that limit. It is perhaps best to move such pages to a sensible title manually.
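One possible way to support the manual step is to have the bot report which titles were shortened, so those pages can be listed for a later move. The sketch below assumes Python 3 and UTF-8. `TitleResult` and `fit_title` are hypothetical names, and dropping the last word after the cut is an illustrative choice, not something the bot currently does.

```python
from typing import NamedTuple


class TitleResult(NamedTuple):
    title: str          # title that fits within the byte limit
    needs_review: bool  # True if the original title had to be shortened


def fit_title(title: str, limit: int = 255) -> TitleResult:
    """Shorten title to `limit` UTF-8 bytes and flag it if it was cut."""
    encoded = title.encode("utf-8")
    if len(encoded) <= limit:
        return TitleResult(title, False)
    # Cut on a byte boundary, discarding any split multi-byte character.
    cut = encoded[:limit].decode("utf-8", errors="ignore")
    # Drop everything after the last space so the title does not end
    # in a word fragment; leave the cut as-is if there is no space.
    head, sep, _ = cut.rpartition(" ")
    if sep:
        cut = head
    return TitleResult(cut, True)


if __name__ == "__main__":
    for candidate in ["A short title", "word " * 100]:
        result = fit_title(candidate)
        print(result.needs_review, len(result.title.encode("utf-8")))
```

Titles with `needs_review` set could then be collected on a maintenance page and moved by hand.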