knowitall / openie-demo

The main Open IE demo.
http://openie.cs.washington.edu/

Answers can span multiple entities #16

Open schmmd opened 12 years ago

schmmd commented 12 years ago

In moving to the answer format, we now collapse together answers that actually apply to different entities. For example, if you search for (Clinton, ran for, *) you will get results for both Hillary Clinton and Bill Clinton. We may want to disambiguate this somehow in the UI, although I'm not sure how. This might tie in with the browser experience and providing information cards for query entities.

Another example is a query for (Kennedy, *, *). There are quite a few linked entities. With single-slot queries such as this, it seems particularly wrong to conflate entities (especially people and space stations).

John F. Kennedy --> 305
Robert F. Kennedy --> 77
Caroline Kennedy --> 23
Mr. Kennedy --> 15
John F. Kennedy assassination --> 8
Kennedy Space Center --> 6
Jacqueline Kennedy Onassis --> 4
Eunice Kennedy Shriver --> 3
Kennedy family --> 2
Adam Kennedy --> 2
Kennedy Center --> 2
John Kennedy Toole --> 2
Rory Kennedy --> 2
Randall Kennedy --> 2
Charles Kennedy --> 2
Donald Kennedy --> 2
Tyler Kennedy --> 2
Joseph Patrick Kennedy II --> 2
Leon S. Kennedy --> 1
Paul Kennedy --> 1
Jimmy Kennedy --> 1
Stetson Kennedy --> 1
Kathleen Kennedy Townsend --> 1
Arthur Kennedy (actor) --> 1
Jamie Kennedy --> 1
Patrick J. Kennedy --> 1
Rosemary Kennedy --> 1
USS John F. Kennedy (CV-67) --> 1
John Fitzgerald Kennedy National Historic Site --> 1
Anthony Kennedy --> 1
Alex Kennedy --> 1
William Kennedy Smith --> 1
Kennedy Krieger Institute --> 1
David Kennedy --> 1
Ted Kennedy (ice hockey) --> 1
John F. Kennedy International Airport --> 1
Ray Kennedy --> 1

schmmd commented 11 years ago

Here's a link to the Kennedy example: http://openie.cs.washington.edu/search/?arg1=kennedy&&&page=1

Twitter Bootstrap might provide UI suggestions: http://twitter.github.io/bootstrap/index.html

schmmd commented 11 years ago

Here is an example from Oren: http://openie.cs.washington.edu/search?arg1=technion&rel=&arg2=&corpora=

wy1024 commented 11 years ago

This is my suggestion for fixing the multiple-entities problem. We start from the main page, which doesn't have to be modified.

demo-1

When the user types "Kennedy" for Arg1, instead of going to the page with the exact match for the keyword, they are directed to an intermediate page, which lists all the different entities for the keyword.

The list is grouped by type (people, places, etc.) and sorted by popularity, so that users can find what they want more easily. For each type, only 3 entries are listed; any beyond that are hidden, but can be shown through the "more" button.

Entries with few results, say fewer than 3, can be grouped under an "others" option (with their matching types in brackets).

demo-2
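The grouping described above could be sketched roughly as follows. This is only an illustration, not the demo's actual code: the function name, the tuple layout, and the entity types assigned to each Kennedy are assumptions, while the counts come from the list earlier in this thread.

```python
from collections import defaultdict

VISIBLE_PER_TYPE = 3  # entries shown per type before the "more" button
MIN_RESULTS = 3       # entries with fewer results go under "others"


def group_for_disambiguation(entities):
    """Group (name, type, result_count) tuples for the intermediate page.

    Hypothetical helper: the real demo may represent entities differently.
    """
    groups = defaultdict(list)
    others = []
    # Most popular entities first (popularity approximated by result count).
    for name, etype, count in sorted(entities, key=lambda e: e[2], reverse=True):
        if count < MIN_RESULTS:
            others.append(f"{name} ({etype})")  # type shown in brackets
        else:
            groups[etype].append(name)

    page = {}
    for etype, names in groups.items():
        page[etype] = {
            "shown": names[:VISIBLE_PER_TYPE],  # visible immediately
            "more": names[VISIBLE_PER_TYPE:],   # revealed by the "more" button
        }
    page["others"] = others
    return page


# Counts are from the Kennedy list in this thread; the types are assumed.
kennedy = [
    ("John F. Kennedy", "person", 305),
    ("Robert F. Kennedy", "person", 77),
    ("Caroline Kennedy", "person", 23),
    ("Mr. Kennedy", "person", 15),
    ("Kennedy Space Center", "place", 6),
    ("Kennedy Center", "place", 2),
]
page = group_for_disambiguation(kennedy)
```

Here `page["person"]["shown"]` holds the three most frequent people, "Mr. Kennedy" ends up behind the "more" button, and "Kennedy Center (place)" lands in `page["others"]`.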

So when the user selects a specific Kennedy, for example JFK, the page shows an exact match for that person, so there is no ambiguity.

demo-3

If the user searches for both arg1 and arg2, the choices can be presented in a two-column page. A better alternative may be to select Arg1 first (as on page 2) and then select Arg2, so that the page is simpler and the development less complex.

demo-4

As before, after the two specific args are selected, the exact match for the two args will show up. In this example, JFK and Hillary Clinton.

demo-5

wy1024 commented 11 years ago

I have implemented a solution and it's on the CSE rv-n16.

rv-n16:8000

The following is a description of how the flow would work.

wp_001327

  1. Starting from the index page, the user submits a query.
  2. Results are generated for that query. I filter them so that each query displays at most 7 entities, and each entity has more than 5 results. (I chose these numbers because in most cases 7 answers to a single query is about enough, and entities with fewer than 5 results are usually not very relevant.)
  3. Based on the number of filtered results, there are 3 cases: 0 filtered entities, 1, and multiple.

3.1 When a single entity matches the filtered query, the user goes to the results page, with a search for the linked entity. E.g., a search for "Hillary Clinton" goes to the results page with a search for "entity: Hillary R. Clinton".

3.2 When there are multiple entities, the user goes to the disambiguation page, which asks them to select the specific entity they are looking for, or to choose a general search for all. E.g., a search for "Clinton" displays "Bill Clinton", "Hillary Clinton", etc., and a "search for all" option. Clicking on an entity goes to the linked results page; "search for all" simply searches for "Clinton".

3.3 The last case is when there are 0 entities after the filter: either there are no results for the query, or the number of results for each entity is very small (fewer than 5). In this case, the user goes directly to the results page, not with a linked-entity search but with a general search for the original query.
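The three cases above can be sketched as a small routing function. This is a minimal sketch under stated assumptions, not the code running on rv-n16: the function name, the return shape, and the input dictionary are hypothetical, and the threshold is taken as "at least 5 results" because the description says both "more than 5" and "less than 5".

```python
MAX_ENTITIES = 7  # at most 7 entities are displayed per query
MIN_RESULTS = 5   # assumption: an entity needs at least 5 results to be kept


def route(query, entity_counts):
    """Decide which page a query leads to.

    entity_counts maps a linked entity name to its number of results
    (hypothetical input shape). Returns a (page, payload) pair.
    """
    # Step 2: drop rarely seen entities and keep the most frequent ones.
    kept = [name for name, count in entity_counts.items() if count >= MIN_RESULTS]
    kept.sort(key=lambda name: entity_counts[name], reverse=True)
    kept = kept[:MAX_ENTITIES]

    if len(kept) == 1:
        # Case 3.1: a single entity, so search for the linked entity directly.
        return "results", f"entity:{kept[0]}"
    if len(kept) > 1:
        # Case 3.2: several entities, so let the user choose
        # (the page would also offer "search for all" with the original query).
        return "disambiguate", kept
    # Case 3.3: nothing survived the filter, so run the plain original query.
    return "results", query


# Counts taken from the Kennedy list earlier in this thread.
kennedy = {
    "John F. Kennedy": 305,
    "Robert F. Kennedy": 77,
    "Caroline Kennedy": 23,
    "Mr. Kennedy": 15,
    "John F. Kennedy assassination": 8,
    "Kennedy Space Center": 6,
    "Jacqueline Kennedy Onassis": 4,
}
page, choices = route("Kennedy", kennedy)
```

With these counts the query goes to the disambiguation page with six choices, since "Jacqueline Kennedy Onassis" falls below the threshold.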


There are things I still need to add to this. (For now, these searches perform the original search.)

  1. The disambiguation logic currently only works for arg1; I need to add it for arg2 as well.
  2. When both arg1 and arg2 are in the query, just perform a regular search instead of disambiguating arg1 and arg2.
  3. When the query has a relation, e.g. (Thomas Edison, invent,), the query's arg1 should be linked to (entity: thomas edison) to generate more accurate results.
  4. One problem is that many of the results may not be linked, e.g. (type:Country, is located in, entity:Africa) gives far fewer results than (type:Country, is located in, africa).
  5. Fix the bug that shows up when an empty query is submitted.
  6. Add a link back to the disambiguation page.

schmmd commented 11 years ago

We need to add a little more functionality so we don't lose existing functionality.

  1. When a query is automatically linked (e.g. (Obama, *, *) turns into (entity:Barack Obama, *, *)), we need to be able to search for the unlinked entity.
  2. When an entity is disambiguated, we need a link back to the disambiguation page.
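
One way the two links requested above might be produced is sketched below. Everything specific here is an assumption for illustration: the `/disambiguate` path and the `nolink` parameter are hypothetical, and only the `arg1` parameter appears in the demo URLs quoted in this thread.

```python
from urllib.parse import urlencode


def extra_links(original_arg1, linked_via=None):
    """Return (label, url) pairs to show on the results page.

    linked_via is "auto" when the query was linked automatically,
    "disambiguated" when the user picked an entity, or None otherwise.
    """
    links = []
    if linked_via == "auto":
        # Point 1: offer the unlinked search. The hypothetical "nolink"
        # flag stops the query from being linked again.
        params = urlencode({"arg1": original_arg1, "nolink": 1})
        links.append((f'Search for "{original_arg1}" without linking',
                      "/search?" + params))
    elif linked_via == "disambiguated":
        # Point 2: link back to the (hypothetical) disambiguation page.
        params = urlencode({"arg1": original_arg1})
        links.append(("Back to disambiguation", "/disambiguate?" + params))
    return links


links = extra_links("Obama", linked_via="auto")
```

The original, unlinked query string has to be carried along to the results page for either link to work, which is the main design point of the sketch.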