HazyResearch / mindbender

Tools for iterative knowledge base development with DeepDive

Faceted Search in MBS: Support unified ES doc type so we can have faceted search #50

Open alldefector opened 8 years ago

alldefector commented 8 years ago

Currently MBS creates a separate ES doc type (equivalent to a DB table) for each source or extraction relation, and only one doc type can be searched / rendered at a time. Since each extraction relation is typically one facet (e.g., age, name, city), the facets end up in different doc types and the user has no way to perform a typical faceted search. Facets do not even propagate across child-parent / FK links.

One possible workaround is to have a post-processing step in DD that joins the source table with all extraction tables (say using array_agg(extraction_value) grouped by doc_id), and then define all the facets on this unified table.
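
To make the shape of that unified table concrete, here is a minimal in-memory sketch in Python of the join-and-aggregate step. It is not the DD post-processing itself: the function name `unify`, the row layout, and the choice of an empty list for documents without extractions are all assumptions made for illustration.

```python
from collections import defaultdict


def unify(source_rows, extractions):
    """Fold every extraction relation into its source document.

    source_rows: list of dicts, each with a "doc_id" key (hypothetical layout).
    extractions: {relation_name: [(doc_id, value), ...]}, one entry per facet.
    Returns one dict per source document, with one list-valued field per facet,
    roughly what a LEFT JOIN plus array_agg grouped by doc_id would produce.
    """
    docs = {row["doc_id"]: dict(row) for row in source_rows}
    for facet, rows in extractions.items():
        grouped = defaultdict(list)
        for doc_id, value in rows:
            grouped[doc_id].append(value)
        for doc_id, doc in docs.items():
            # Assumption: documents with no extraction get an empty list.
            doc[facet] = grouped.get(doc_id, [])
    return list(docs.values())


source = [{"doc_id": 1, "content": "a"}, {"doc_id": 2, "content": "b"}]
extractions = {
    "age": [(1, 30)],
    "city": [(1, "Palo Alto"), (1, "Stanford"), (2, "Austin")],
}
unified = unify(source, extractions)
```

Each resulting document carries all facets as fields, so they could all be defined on a single doc type.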

Alternatively, we could add annotation support for the above use case. For example, @reference_inline would let the parent relation absorb all the navigable/searchable fields of the child relation. That would make ES mapping generation a bit less straightforward. For index creation, we could either do the join in SQL and populate ES in one pass or perform multi-pass ES updates (one source/extraction table per pass).
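
For the multi-pass option, one pass could look roughly like the Python sketch below. It only builds the action/body pairs that the Elasticsearch bulk API expects for partial updates (`update` with `doc` and `doc_as_upsert`) and sends nothing. The function name, the index and doc type names, and the row layout are hypothetical, and none of this reflects how MBS currently populates ES.

```python
def bulk_update_actions(index, facet, rows, doc_type=None):
    """Build bulk partial-update actions for one extraction relation.

    rows: iterable of (doc_id, value) pairs from a single extraction table.
    Each pass adds one list-valued facet field to the parent documents,
    so running one pass per relation fills in the unified doc type.
    """
    grouped = {}
    for doc_id, value in rows:
        grouped.setdefault(doc_id, []).append(value)

    actions = []
    for doc_id, values in grouped.items():
        meta = {"_index": index, "_id": str(doc_id)}
        if doc_type is not None:
            # Older ES versions require an explicit type in the action line.
            meta["_type"] = doc_type
        actions.append({"update": meta})
        # doc_as_upsert creates the document if the source pass has not run yet.
        actions.append({"doc": {facet: values}, "doc_as_upsert": True})
    return actions


# One pass for a hypothetical "city" extraction relation.
city_pass = bulk_update_actions(
    "mbs", "city", [(1, "Palo Alto"), (1, "Stanford"), (2, "Austin")], doc_type="doc"
)
```

The trade-off against the one-pass SQL join is that each pass touches every affected parent document again, in exchange for not having to express the full join in SQL.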

@netj @chrismre

alldefector commented 8 years ago

BTW, the post-processing approach was used by Vincent: https://github.com/infinitespace/memex-search/blob/master/app.ddlog#L202