infolab-csail / wikipedia-mirror

Makefiles that download and set up a local Wikipedia instance.

TextExtracts extensions #9

Closed fakedrake closed 10 years ago

fakedrake commented 10 years ago

Apparently Wikipedia uses the TextExtracts extension, which returns plain-text or limited-HTML extracts of page content. Maybe a fetcher could take advantage of this.
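
A rough sketch of what such a fetcher might look like, assuming the mirror exposes the standard MediaWiki `api.php` endpoint with TextExtracts installed. The `API_URL` value and the function names are hypothetical and not part of this repository; `prop=extracts`, `exintro` and `explaintext` are the query parameters the extension adds to the API.

```python
import json
import urllib.parse
import urllib.request

# Assumption: hypothetical URL of the local mirror's MediaWiki API endpoint.
API_URL = "http://localhost/w/api.php"


def build_extract_url(title, api_url=API_URL, plain_text=False, intro_only=True):
    """Build a TextExtracts query URL for a single page title."""
    params = {
        "action": "query",
        "prop": "extracts",
        "titles": title,
        "format": "json",
    }
    if intro_only:
        params["exintro"] = 1      # only the content before the first section
    if plain_text:
        params["explaintext"] = 1  # plain text instead of limited HTML
    return api_url + "?" + urllib.parse.urlencode(params)


def parse_extracts(response):
    """Map page title -> extract from a decoded JSON API response."""
    pages = response.get("query", {}).get("pages", {})
    return {page["title"]: page.get("extract", "") for page in pages.values()}


def fetch_extract(title, **kwargs):
    """Fetch the extract for one title (performs a network request)."""
    with urllib.request.urlopen(build_extract_url(title, **kwargs)) as resp:
        return parse_extracts(json.load(resp))
```

Calling `fetch_extract("Alan Turing", plain_text=True)` would return a dict keyed by page title, provided the endpoint is reachable. URL building and response parsing are kept separate so they can be checked without a running wiki.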

fakedrake commented 10 years ago

Not that useful if you have access to the database itself.