usc-isi-i2 / dig-etl-engine

Download DIG to run on your laptop or server.
http://usc-isi-i2.github.io/dig/
MIT License
Custom data uploaded, simple extractors work, but use of more complex extractors unclear #271

Open cslovell opened 5 years ago

cslovell commented 5 years ago

Hi,

We've uploaded some custom data extracted from a set of webpages, attached below as a zipped .jl file.

[Attachment: output.zip]

We've loaded this into DIG and added a date extractor along with title and descriptions. See below.

[Image: currentkg]

We'd like to add or train custom extractors for things like cities, countries, and perhaps topics. We understand that we can do this through custom ETK modules, glossaries, and the spaCy rules editor (which was used to derive the dates). However, our attempt with a list of cities and countries failed, and we're stuck on how to develop and use ETK modules.
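
For illustration, here is a minimal sketch of the glossary-style matching described above, written in plain Python. It does not use ETK's or DIG's API; the function names `build_glossary_pattern` and `extract_terms` and the sample city list are hypothetical and only show the behaviour we would expect from a city or country glossary.

```python
import re

def build_glossary_pattern(terms):
    """Compile one case-insensitive regex from a list of glossary terms.

    Hypothetical helper, not part of ETK. Longer terms are placed first so
    that "New York City" is preferred over "New York" at the same position.
    """
    cleaned = {t.strip() for t in terms if t.strip()}
    ordered = sorted(cleaned, key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in ordered)
    # Word boundaries prevent matches inside longer words.
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)

def extract_terms(text, pattern):
    """Return (matched_text, start, end) for every glossary hit in text."""
    return [(m.group(0), m.start(), m.end()) for m in pattern.finditer(text)]

# Sample glossary; a real one would be loaded from a file of city names.
cities = ["Los Angeles", "New York", "New York City", "Paris"]
pattern = build_glossary_pattern(cities)
matches = extract_terms("Flights from new york city to Paris.", pattern)
print(matches)
```

A country glossary would work the same way with a different term list. Whether DIG's glossary extractor applies the same matching rules is an assumption on our part.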

Would it be possible to provide an example of how to add city extractors, country extractors, and perhaps custom extractors for topics? Also, how can we ingest data from Excel files and other tabular data? This would be immensely useful to us.
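
As a possible starting point for the tabular case, here is a sketch that converts CSV text into JSON Lines, the same .jl format as our upload. It uses only the Python standard library and assumes an Excel sheet has first been exported to CSV. The `doc_id` field name and the `rows_to_json_lines` helper are hypothetical; we do not know which fields DIG actually requires.

```python
import csv
import io
import json

def rows_to_json_lines(csv_text, id_prefix="row"):
    """Turn CSV text with a header row into one JSON object per line.

    Hypothetical helper, not part of DIG or ETK. Each row becomes a dict
    keyed by column name, plus an assumed "doc_id" field built from the
    row index.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for index, row in enumerate(reader):
        record = dict(row)
        record["doc_id"] = f"{id_prefix}-{index}"  # assumed field name
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)

# Small inline sample standing in for an exported spreadsheet.
sample = "title,city\nAnnual report,Paris\nField notes,Los Angeles\n"
jl_text = rows_to_json_lines(sample)
print(jl_text)
```

Writing `jl_text` to a file would give something we could upload like the existing data, but guidance on the expected schema would still be needed.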

Thanks!