The ingestion process
JSONpedia will be used as a library to extract a JSON representation of each page from a Wikipedia dump, and the result will then be fed into a Lucene index. This index will be used to compute metrics on how homogeneous sections are, with the goal of extracting data from the sections that turn out to be consistently structured.
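To make the idea of a homogeneity metric concrete, here is a minimal sketch in Python. It does not use JSONpedia or Lucene: a few hand-written dictionaries stand in for the extracted JSON, and the record layout (`sections`, `content_types`), the function name `section_homogeneity`, and the metric itself (share of occurrences that have the most common structure) are all assumptions made for illustration, not the project's actual schema or definition.

```python
from collections import Counter, defaultdict

# Hypothetical, simplified page records. The real JSONpedia output is richer
# and shaped differently; these field names are assumptions for the sketch.
pages = [
    {"title": "A", "sections": [
        {"title": "Discography", "content_types": ["list"]},
        {"title": "History", "content_types": ["paragraph"]},
    ]},
    {"title": "B", "sections": [
        {"title": "Discography", "content_types": ["list"]},
        {"title": "History", "content_types": ["paragraph", "table"]},
    ]},
    {"title": "C", "sections": [
        {"title": "Discography", "content_types": ["table"]},
    ]},
]


def section_homogeneity(pages):
    """Map each section title to the share of its occurrences
    that have the most common structure (1.0 = fully homogeneous)."""
    shapes = defaultdict(Counter)
    for page in pages:
        for section in page["sections"]:
            # The "shape" of a section is the ordered tuple of its content types.
            shape = tuple(section["content_types"])
            shapes[section["title"]][shape] += 1
    return {
        title: counts.most_common(1)[0][1] / sum(counts.values())
        for title, counts in shapes.items()
    }


scores = section_homogeneity(pages)
```

In this toy data, "Discography" appears three times and is a plain list in two of them, so it scores higher than "History", whose two occurrences differ. In the planned pipeline the same kind of aggregation would presumably run over the index rather than over in-memory dictionaries.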
The index will have the following fields: