dbpedia / jsonpedia-extractor

Fine-grained massive extraction of Wikipedia content (GSoC 2014 project)

Create a faceted lucene index #1

Closed gigaroby closed 10 years ago

gigaroby commented 10 years ago

The ingestion process

JSONpedia will be used as a library to extract JSON representations from a Wikipedia dump, and the result will then be fed into a Lucene index. This index will be used to compute metrics on how homogeneous sections are, with the goal of extracting data from them.
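As a rough illustration of the idea, here is a minimal sketch in plain Python. It does not use JSONpedia or Lucene: the page dictionaries, the `build_facet_index` and `homogeneity` helpers, and the facet names `section` and `element` are all hypothetical stand-ins, and the metric shown (the share of same-titled sections that contain a given element type) is only one possible reading of "homogeneous", not something this issue defines.

```python
from collections import defaultdict


def build_facet_index(pages):
    """Toy stand-in for a faceted index: (facet, value) -> set of section ids."""
    index = defaultdict(set)
    for page in pages:
        for pos, section in enumerate(page["sections"]):
            # A section is identified by its page title and position in the page.
            section_id = (page["title"], pos)
            index[("section", section["title"])].add(section_id)
            for kind in section["elements"]:
                index[("element", kind)].add(section_id)
    return index


def homogeneity(index, section_title, element_kind):
    """Fraction of sections titled `section_title` that contain `element_kind`."""
    sections = index.get(("section", section_title), set())
    if not sections:
        return 0.0
    matching = sections & index.get(("element", element_kind), set())
    return len(matching) / len(sections)


# Hypothetical, hand-written input; real input would come from the extractor.
pages = [
    {"title": "Rome", "sections": [{"title": "Demographics", "elements": ["table", "text"]}]},
    {"title": "Milan", "sections": [{"title": "Demographics", "elements": ["table"]}]},
    {"title": "Turin", "sections": [{"title": "Demographics", "elements": ["text"]}]},
]

index = build_facet_index(pages)
score = homogeneity(index, "Demographics", "table")
```

In this sketch a high score for a (section title, element type) pair would suggest that sections with that title are structured alike and could be good candidates for data extraction. A real faceted index would answer the same kind of count query without holding everything in memory.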

The index will have the following fields: