sweble / sweble-wikitext

The Sweble Wikitext Components module provides a parser for MediaWiki's wikitext and an engine trying to emulate the behavior of a MediaWiki.
http://sweble.org/sites/swc-devel/develop-latest/tooling/sweble/sweble-wikitext

Example to parse <page> element of wikipedia article to plain text #33

Closed nmadhire closed 9 years ago

nmadhire commented 9 years ago

Hi,

Can you please point me to an example that shows how to parse a page element or wikitext? The basic example just uses dummy data, and I didn't find anything else in the documentation either. Any pointers to examples of using Sweble would be helpful.

Please let me know.

Thanks.

hannesd commented 9 years ago

What do you mean by "<page>" element? Are you referring to Wikipedia dumps? In that case you should have a look at the project swc-example-dumpcruncher in sweble-wikitext.

Let me know if that helped or whether you were actually looking for something else.

nmadhire commented 9 years ago

Actually, I mean the <page> tag. The Wikipedia dumps have a <page> element for each article. I didn't find anything anywhere about how to process the Wikipedia XML and get the plain text out of it. Is there any documentation I can look at that describes the classes?

Thanks.

hannesd commented 9 years ago

I'm sorry, I'm still not sure I fully understand. As I said, try the project swc-example-dumpcruncher. That's what you need to extract the markup from Wikipedia dumps.
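
To illustrate what "extracting the markup" from a dump involves, here is a minimal sketch. It is not taken from swc-example-dumpcruncher and does not use any Sweble classes. It only uses the JDK's StAX API to walk the <page> elements of a MediaWiki XML export and collect each page's <title> and <text>. The class name DumpPages and the method extractPages are hypothetical names chosen for this example. It also assumes the file is small enough to hold in memory, which is not true for full Wikipedia dumps, where each page should be handled as it is read. Turning the collected wikitext into plain text is the separate step that would be handed to the Sweble parser.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Sweble: collects raw wikitext per page.
public class DumpPages {

    /** Returns a map from page title to raw wikitext for every <page> element. */
    public static Map<String, String> extractPages(Reader xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Deliver each text node as a single chunk and ignore DTDs.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);

        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        Map<String, String> pages = new LinkedHashMap<>();
        String title = null;

        while (reader.hasNext()) {
            if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                continue;
            }
            // getLocalName() ignores the export namespace used by real dumps.
            String name = reader.getLocalName();
            if ("page".equals(name)) {
                title = null; // a new page starts; forget the previous title
            } else if ("title".equals(name)) {
                title = reader.getElementText();
            } else if ("text".equals(name) && title != null) {
                // With several revisions per page, the last one read is kept.
                pages.put(title, reader.getElementText());
            }
        }
        reader.close();
        return pages;
    }

    public static void main(String[] args) throws Exception {
        // Tiny made-up sample in the shape of a MediaWiki export.
        String sample =
              "<mediawiki><page><title>Example</title>"
            + "<revision><text>'''Example''' is a [[page]].</text></revision>"
            + "</page></mediawiki>";
        extractPages(new StringReader(sample))
            .forEach((t, w) -> System.out.println(t + ": " + w));
    }
}
```

The strings collected this way are still wikitext, so markup such as '''bold''' and [[links]] remains in them until a wikitext parser processes it.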