attardi / wikiextractor

A tool for extracting plain text from Wikipedia dumps
GNU Affero General Public License v3.0
3.69k stars 959 forks

How to get mentions/anchors with wikiextractor? #286

Open lshowway opened 2 years ago

lshowway commented 2 years ago

@attardi @gojomo @xiaoling
Thanks for your work! I want to extract the corpus in the following format: title, the title's description, all mentions/anchors, their positions, and their associated entities. Can wikiextractor produce this format? If so, how?
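
To make the target format concrete, here is a minimal sketch of the post-processing I have in mind. It assumes the extractor can be run so that links are preserved in the output as `<a href="Entity%20Title">anchor</a>` tags (I believe the `--links` option does this, but that is an assumption, not something I have confirmed). The function name `extract_mentions` and the dictionary keys are hypothetical, not part of wikiextractor.

```python
import re
from urllib.parse import unquote

# Assumed link format in the extracted text: <a href="Entity%20Title">anchor</a>
LINK_RE = re.compile(r'<a href="([^"]*)">(.*?)</a>', re.DOTALL)


def extract_mentions(text):
    """Strip link tags and return (plain_text, mentions).

    Each mention records the anchor text, the linked entity title,
    and the character offsets of the anchor in the plain text.
    """
    plain_parts = []
    mentions = []
    cursor = 0  # read position in the tagged input
    offset = 0  # length of the plain text built so far
    for match in LINK_RE.finditer(text):
        # Copy the untagged text that precedes this link.
        before = text[cursor:match.start()]
        plain_parts.append(before)
        offset += len(before)

        anchor = match.group(2)
        mentions.append({
            "anchor": anchor,
            "entity": unquote(match.group(1)),  # decode %20 and similar
            "start": offset,
            "end": offset + len(anchor),
        })
        plain_parts.append(anchor)
        offset += len(anchor)
        cursor = match.end()

    # Copy whatever follows the last link.
    plain_parts.append(text[cursor:])
    return "".join(plain_parts), mentions


if __name__ == "__main__":
    sample = ('Paris is the capital of <a href="France">France</a> '
              'and lies on the <a href="Seine%20River">Seine</a>.')
    plain, found = extract_mentions(sample)
    print(plain)
    for mention in found:
        print(mention)
```

The offsets refer to the plain text after the tags are removed, so `plain[m["start"]:m["end"]]` gives back the anchor. Nested or malformed tags are not handled here.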