knowitall / openie-demo

The main Open IE demo.
http://openie.cs.washington.edu/

Add NELL corpus #9

Closed (schmmd closed 11 years ago)

schmmd commented 12 years ago

NELL provides extraction information that overlaps what we have. Although it is relatively small compared to UW Open IE, it's of high quality. There are two primary types of information in NELL:

  1. Triple-style relations (Bowie, is an actor in, Labyrinth)
  2. Categorical relationships (Bowie (in) Actor)

The former fit easily into the demo, since they can be added directly as extractions. The latter could be added as extractions with an uninformative relation (Bowie, is, an Actor), or they could be used to enhance the Freebase cards.
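A minimal sketch of the two mappings described above, written in Python for brevity. The `Extraction` class and both helper functions are hypothetical illustrations and are not the demo's actual types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extraction:
    """Hypothetical stand-in for a demo extraction (arg1, relation, arg2)."""
    arg1: str
    rel: str
    arg2: str


def from_triple(arg1: str, rel: str, arg2: str) -> Extraction:
    # Triple-style relations map directly onto an extraction.
    return Extraction(arg1, rel, arg2)


def from_category(entity: str, category: str) -> Extraction:
    # Categorical relationships have no relation phrase of their own,
    # so a generic "is" is used, with an article chosen by first letter.
    article = "an" if category[:1].lower() in "aeiou" else "a"
    return Extraction(entity, "is", f"{article} {category}")


print(from_triple("Bowie", "is an actor in", "Labyrinth"))
print(from_category("Bowie", "Actor"))
```

The second mapping produces (Bowie, is, an Actor), which shows why the relation is uninformative: every categorical belief ends up with the same relation phrase.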

schmmd commented 12 years ago

Some questions we need to answer:

  1. How many NELL triples are there?
  2. How many NELL hierarchical relationships are there (e.g. cat is an animal)?
  3. How do the previous numbers compare to our demo? (might need to ask Rob for a figure)
  4. What linking information do we have?

These numbers are most helpful when computed from our dumps, since that is the data we have access to.
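One way the first two counts could be obtained from a dump, sketched in Python. It assumes a tab-separated file whose first three columns are entity, relation, and value, with category memberships stored under a relation named `generalizations`. Both the layout and that relation name are assumptions to check against the actual dump, and the sample rows are made up.

```python
from collections import Counter
from typing import Iterable

CATEGORY_RELATION = "generalizations"  # assumed name; verify against the dump


def count_beliefs(lines: Iterable[str]) -> Counter:
    """Count rows as either 'category' or 'triple' beliefs."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        kind = "category" if fields[1] == CATEGORY_RELATION else "triple"
        counts[kind] += 1
    return counts


# Made-up sample rows in the assumed layout.
sample = [
    "bowie\tactorstarredinmovie\tlabyrinth\n",
    "bowie\tgeneralizations\tactor\n",
    "cat\tgeneralizations\tanimal\n",
]
print(count_beliefs(sample))
```

If the dump starts with a header row, it would be counted as a triple here, so it should be skipped before calling `count_beliefs`.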

schmmd commented 12 years ago

Where are we on this?

xdavidjung commented 12 years ago

I created a parser that takes in the NELL beliefs file and outputs serialized ReVerbExtraction objects. I handed that parser off to Rob, who has things already set up to take those ReVerbExtraction objects and add them to the corpus.
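The parser itself is not shown in this thread. The following is only a rough Python sketch of the general shape of such a conversion, under stated assumptions: the input is tab-separated with entity, relation, and value in the first three columns, identifiers look like `concept:actor:david_bowie`, and JSON lines stand in for the serialized ReVerbExtraction format, which is not reproduced here. The function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def clean(name: str) -> str:
    # Assumed identifier style such as "concept:actor:david_bowie":
    # keep the last segment and turn underscores into spaces.
    return name.rsplit(":", 1)[-1].replace("_", " ")


def parse_beliefs(lines: Iterable[str]) -> Iterator[str]:
    """Yield one serialized record per well-formed belief row."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        arg1, rel, arg2 = (clean(f) for f in fields[:3])
        # JSON is a placeholder for the real serialization format.
        yield json.dumps({"arg1": arg1, "rel": rel, "arg2": arg2})


row = "concept:actor:david_bowie\tconcept:actorstarredinmovie\tconcept:movie:labyrinth\n"
for record in parse_beliefs([row]):
    print(record)
```

Yielding records one at a time keeps memory use flat on a large beliefs file, and the downstream import step can consume the output line by line.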

schmmd commented 12 years ago

Rob, I think this is in your hands now? How is this going?

schmmd commented 11 years ago

We stopped this effort and imported only the NELL types.