AnotherTwinkle closed 3 years ago
Okay, I was being dumb and didn't realize these "extensions" are not user-side. Is there any way to extract text from a page on a site that does not have the TextExtracts extension?
Yes, you are correct that the extensions mentioned in the documentation are installed on the MediaWiki side and are not something the user can add. That leaves a few options:
1) Make sure the site is a MediaWiki site and not a generic "wiki"; in my experience it isn't common for a MediaWiki site to lack the TextExtracts extension.
2) Reach out to the owners of the site and see if they would install the extension (worth a shot!).
3) Or you can use the wiki markup language; you would have to do something like this:
from mediawiki import MediaWiki
wiki = MediaWiki(url="the-media-wiki-api-url")
page = wiki.page("My Page")
print(page.wikitext)  # raw wiki markup for the page
The wikitext contains formatting markup (similar in nature to Markdown), but it should include all of the page's text.
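If you need something closer to plain text, here is a rough sketch of stripping the most common markup from that wikitext using only the standard library. `strip_wikitext` is a hypothetical helper, not part of pymediawiki, and the regexes cover only simple cases (templates, links, headings, bold/italic, refs). Tables and other complex markup will leak through, so a dedicated wikitext parser such as mwparserfromhell is likely the better choice for anything serious.

```python
import re


def strip_wikitext(text: str) -> str:
    """Best-effort conversion of wiki markup to plain text (hypothetical helper)."""
    # Drop HTML comments.
    text = re.sub(r"<!--.*?-->", "", text, flags=re.DOTALL)
    # Drop self-closing refs first, then paired <ref>...</ref> footnotes.
    text = re.sub(r"<ref[^>]*/>", "", text)
    text = re.sub(r"<ref[^>]*>.*?</ref>", "", text, flags=re.DOTALL)
    # Remove templates {{...}}, innermost first so nested ones are handled.
    template = re.compile(r"\{\{[^{}]*\}\}")
    while template.search(text):
        text = template.sub("", text)
    # Drop simple file, image, and category links entirely.
    text = re.sub(r"\[\[(?:File|Image|Category):[^\[\]]*\]\]", "", text,
                  flags=re.IGNORECASE)
    # [[target|label]] -> label, [[target]] -> target.
    text = re.sub(r"\[\[(?:[^\[\]|]*\|)?([^\[\]|]*)\]\]", r"\1", text)
    # [http://example.org label] -> label; bare bracketed URLs are dropped.
    text = re.sub(r"\[https?://[^\s\]]+\s+([^\]]+)\]", r"\1", text)
    text = re.sub(r"\[https?://[^\s\]]+\]", "", text)
    # == Heading == -> Heading.
    text = re.sub(r"^=+[ \t]*(.*?)[ \t]*=+[ \t]*$", r"\1", text,
                  flags=re.MULTILINE)
    # Remove bold/italic quote runs ('' and ''').
    text = re.sub(r"'{2,}", "", text)
    # Remove any remaining HTML-like tags.
    text = re.sub(r"<[^>]+>", "", text)
    # Collapse runs of blank lines.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


sample = "== Intro ==\n'''Bob''' is a [[character|hero]] from [[Earth]].{{cite|x={{y}}}}<ref>src</ref>"
print(strip_wikitext(sample))
```

With the snippet from earlier in this thread, you would call it as `strip_wikitext(page.wikitext)`.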
Hope this is helpful. If you need more information I would likely need examples of the issue to be able to help more. If this resolves your issue, go ahead and close this ticket.
Thanks!
How exactly do I install the needed "extensions" for the code to work? I am trying to get the content of a fandom page and it's throwing me an error called "Unable to extract page content, The TextExtracts extension must be installed!"