nlpia / nlpia-bot

A virtual assistant that actually assists!
http://manning.com/books/natural-language-processing-in-action

Semantic Matching Query to Wiki #27

Open KChalk opened 4 years ago

KChalk commented 4 years ago

Queries should be answered from the documents most strongly related to them semantically.

  1. Currently:
    1. the distance between a query and each document is computed from n-grams
    2. learned embedding representations of the documents exist but are not used
  2. Next Steps:
    1. represent queries via learned embeddings (done)
    2. compare the learned embeddings of the query and each document (brute-force version done)
    3. reduce the computational cost of searching the document representations
  3. Stretch:
    1. more complex document representations
    2. return the n best docs (done) and search all of them for the best answer
    3. improve model choices and the embedding process (move away from averaging)
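
For reference, the brute-force comparison in step 2.ii can be sketched roughly as below. This is a minimal illustration, not the nlpia-bot code: the helper names (`average_embedding`, `cosine_similarity`, `top_n_docs`) and the toy two-dimensional word vectors are made up for the example, and it assumes documents and queries are embedded by averaging word vectors and compared with cosine similarity.

```python
import math


def average_embedding(tokens, word_vectors):
    """Average the vectors of the known tokens (the averaging baseline)."""
    vectors = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vectors:
        return None  # no known tokens, so no embedding
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_n_docs(query_vec, doc_vecs, n=3):
    """Brute force: score every document, then keep the n best."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]


# Hypothetical toy data, only to make the sketch runnable.
word_vectors = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
    "truck": [0.1, 0.9],
}
docs = {"pets": ["cat", "dog"], "vehicles": ["car", "truck"]}
doc_vecs = {name: average_embedding(tokens, word_vectors)
            for name, tokens in docs.items()}

query_vec = average_embedding(["dog"], word_vectors)
best = top_n_docs(query_vec, doc_vecs, n=1)
print(best[0][0])
```

Every query is scored against every document, so the cost grows linearly with the number of documents; that is the expense step 2.iii aims to reduce.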

KChalk commented 4 years ago

2.iii: agglomerative clustering of the document embeddings could enable an n-ary tree search.
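
A rough sketch of that idea, under several assumptions: the names (`build_tree`, `search_tree`) and the toy vectors are hypothetical, clusters are merged greedily by centroid cosine similarity, and the resulting tree is binary (plain agglomerative merging yields two children per node; a true n-ary tree would need the dendrogram cut into wider levels). It also assumes at least one document.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def build_tree(doc_vecs):
    """Greedy agglomerative clustering: repeatedly merge the two clusters
    whose centroids are most similar, until one root cluster remains."""
    nodes = [{"ids": [doc_id], "centroid": vec, "children": []}
             for doc_id, vec in doc_vecs.items()]
    while len(nodes) > 1:
        best_pair, best_sim = None, -2.0  # cosine is never below -1
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                sim = cosine(nodes[i]["centroid"], nodes[j]["centroid"])
                if sim > best_sim:
                    best_pair, best_sim = (i, j), sim
        i, j = best_pair
        ids = nodes[i]["ids"] + nodes[j]["ids"]
        merged = {
            "ids": ids,
            "centroid": centroid([doc_vecs[d] for d in ids]),
            "children": [nodes[i], nodes[j]],
        }
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)]
        nodes.append(merged)
    return nodes[0]


def search_tree(query_vec, node):
    """Descend from the root, following the child whose centroid is
    most similar to the query, until a single document is reached."""
    while node["children"]:
        node = max(node["children"],
                   key=lambda child: cosine(query_vec, child["centroid"]))
    return node["ids"][0]


# Hypothetical toy document embeddings.
doc_vecs = {
    "cats": [1.0, 0.1],
    "dogs": [0.9, 0.2],
    "cars": [0.1, 1.0],
    "trucks": [0.2, 0.9],
}
tree = build_tree(doc_vecs)
result = search_tree([0.8, 0.3], tree)
print(result)
```

The descent compares the query only against the centroids along one path, so on a reasonably balanced tree it touches far fewer vectors than the brute-force scan. The trade-off is that greedy descent is approximate and can miss the true nearest document, and this naive tree construction is itself expensive, so it would be done once offline.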